Re: How to run the three producers test

2015-07-14 Thread JIEFU GONG
Hmm, yeah, some error logs would be nice, like Gwen pointed out. Do any of
your brokers fall out of the ISR when sending messages? It seems like your
setup should be fine, so I'm not entirely sure.
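
One quick way to check, assuming the stock Kafka tools, a topic named test,
and ZooKeeper on its usual port (replace zk-host with your actual machine):

bin/kafka-topics.sh --describe --zookeeper zk-host:2181 --topic test
# Compare the Replicas and Isr columns for each partition; a broker id that
# appears under Replicas but not under Isr has fallen out of sync.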

On Tue, Jul 14, 2015 at 1:31 PM, Yuheng Du yuheng.du.h...@gmail.com wrote:

 Jiefu,

 I am performing these tests on a 6-node cluster in CloudLab (an
 infrastructure built for scientific research). I use 2 nodes as producers,
 2 as brokers only, and 2 as consumers. I have tested each machine
 individually and they all work well. I did not use AWS. Thank you!

 On Tue, Jul 14, 2015 at 4:20 PM, JIEFU GONG jg...@berkeley.edu wrote:

  Yuheng, are you performing these tests locally or using a service such as
  AWS? I'd try using each machine individually first, connecting to the
  ZK/Kafka servers and ensuring that each is able to log and consume
  messages independently.
 
  On Tue, Jul 14, 2015 at 1:17 PM, Gwen Shapira gshap...@cloudera.com
  wrote:
 
   Are there any errors on the broker logs?
  
   On Tue, Jul 14, 2015 at 11:57 AM, Yuheng Du yuheng.du.h...@gmail.com
   wrote:
 Jiefu,

 Thank you. I understand the three producers can run at the same time, but
 should they be started at exactly the same time? (I have three consoles,
 one for each of the three machines, and I just start the command manually
 one by one.) Or does a small variation in the starting time not matter?

 Gwen and Jiefu,

 I have started the three producers on three machines; after a while, all
 of them give a java.net.ConnectException:

 [2015-07-14 12:56:46,352] WARN Error in I/O with
 producer0-link-0/192.168.1.1 (org.apache.kafka.common.network.Selector)

 java.net.ConnectException: Connection refused.

 [2015-07-14 12:56:48,056] WARN Error in I/O with
 producer1-link-0/192.168.1.2 (org.apache.kafka.common.network.Selector)

 java.net.ConnectException: Connection refused.

 What could be the cause?

 Thank you guys!
   
   
   
   
 On Tue, Jul 14, 2015 at 2:47 PM, JIEFU GONG jg...@berkeley.edu wrote:

 Yuheng,

 Yes, if you read the blog post it specifies that he's using three separate
 machines. There's no reason the producers cannot be started at the same
 time, I believe.

 On Tue, Jul 14, 2015 at 11:42 AM, Yuheng Du yuheng.du.h...@gmail.com wrote:
   
  Hi,

  I am running the performance test for kafka:
  https://gist.github.com/jkreps/c7ddb4041ef62a900e6c

  For the Three Producers, 3x async replication scenario, the command is
  the same as for one producer:

  bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance
  test 5000 100 -1 acks=1
  bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092
  buffer.memory=67108864 batch.size=8196

  So how do I run the test for three producers? Do I just run them on three
  separate servers at the same time? Will there be errors if I do it this
  way, since the three producers can't be started at exactly the same time?

  Thanks.

  best,
  Yuheng

   
   
   
--
   
Jiefu Gong
University of California, Berkeley | Class of 2017
B.A Computer Science | College of Letters and Sciences
   
jg...@berkeley.edu elise...@berkeley.edu | (925) 400-3427
   
  
 
 
 
  --
 
  Jiefu Gong
  University of California, Berkeley | Class of 2017
  B.A Computer Science | College of Letters and Sciences
 
  jg...@berkeley.edu elise...@berkeley.edu | (925) 400-3427
 




-- 

Jiefu Gong
University of California, Berkeley | Class of 2017
B.A Computer Science | College of Letters and Sciences

jg...@berkeley.edu elise...@berkeley.edu | (925) 400-3427


Re: How to run the three producers test

2015-07-14 Thread JIEFU GONG
Someone may correct me if I'm wrong, but how much disk space do you have on
these nodes? Your exception 'No space left on device' from one of your
brokers seems to suggest that the disks are full (after all, you're writing
500 million records). If that's the case, I believe the expected behavior
for Kafka is to reject any further attempts to write data.
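
A rough way to check on each broker would be something like the following
(the path is just the default from the shipped server.properties; use
whatever your log.dirs is actually set to):

df -h /tmp/kafka-logs          # free space on the volume holding the Kafka data
du -sh /tmp/kafka-logs/*       # size of each topic-partition directory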

On Tue, Jul 14, 2015 at 2:27 PM, Yuheng Du yuheng.du.h...@gmail.com wrote:

 Also, the log in another broker (not the bootstrap) says:

 [2015-07-14 15:18:41,220] FATAL [Replica Manager on Broker 1]: Error
 writing to highwatermark file:  (kafka.server.ReplicaManager)


 [2015-07-14 15:18:40,160] ERROR Closing socket for /130.127.133.47 because
 of error (kafka.network.Processor)

 java.io.IOException: Connection reset by peer

 at sun.nio.ch.FileDispatcherImpl.read0(Native Method)

 at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)

 at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)

 at sun.nio.ch.IOUtil.read(IOUtil.java:197)

 at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)

 at kafka.utils.Utils$.read(Utils.scala:380)

 at

 kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)

 at kafka.network.Processor.read(SocketServer.scala:444)

 at kafka.network.Processor.run(SocketServer.scala:340)

 at java.lang.Thread.run(Thread.java:745)

 

 java.io.IOException: No space left on device

 at java.io.FileOutputStream.writeBytes(Native Method)

 at java.io.FileOutputStream.write(FileOutputStream.java:345)

 at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)

 at sun.nio.cs.StreamEncoder.implFlushBuffe

 (END)

 On Tue, Jul 14, 2015 at 5:24 PM, Yuheng Du yuheng.du.h...@gmail.com
 wrote:

  Hi Jiefu, Gwen,
 
  I am running the Throughput versus stored data test:
  bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance
  test 500 100 -1 acks=1 bootstrap.servers=
  esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864
 batch.size=8196
 
  After around 50,000,000 messages were sent, I got a bunch of connection
  refused error as I mentioned before. I checked the logs on the broker and
  here is what I see:
 
  [2015-07-14 15:11:23,578] WARN Partition [test,4] on broker 5: No
  checkpointed highwatermark is found for partition [test,4]
  (kafka.cluster.Partition)
 
  [2015-07-14 15:12:33,298] INFO Rolled new log segment for 'test-4' in 4
  ms. (kafka.log.Log)
 
  [2015-07-14 15:12:33,299] INFO Rolled new log segment for 'test-0' in 1
  ms. (kafka.log.Log)
 
  [2015-07-14 15:13:39,529] INFO Rolled new log segment for 'test-4' in 1
  ms. (kafka.log.Log)
 
  [2015-07-14 15:13:39,531] INFO Rolled new log segment for 'test-0' in 1
  ms. (kafka.log.Log)
 
  [2015-07-14 15:14:48,502] INFO Rolled new log segment for 'test-4' in 3
  ms. (kafka.log.Log)
 
  [2015-07-14 15:14:48,502] INFO Rolled new log segment for 'test-0' in 1
  ms. (kafka.log.Log)
 
  [2015-07-14 15:15:51,478] INFO Rolled new log segment for 'test-4' in 1
  ms. (kafka.log.Log)
 
  [2015-07-14 15:15:51,479] INFO Rolled new log segment for 'test-0' in 1
  ms. (kafka.log.Log)
 
  [2015-07-14 15:16:52,589] INFO Rolled new log segment for 'test-4' in 1
  ms. (kafka.log.Log)
 
  [2015-07-14 15:16:52,590] INFO Rolled new log segment for 'test-0' in 1
  ms. (kafka.log.Log)
 
  [2015-07-14 15:17:57,406] INFO Rolled new log segment for 'test-4' in 1
  ms. (kafka.log.Log)
 
  [2015-07-14 15:17:57,407] INFO Rolled new log segment for 'test-0' in 0
  ms. (kafka.log.Log)
 
  [2015-07-14 15:18:39,792] FATAL [KafkaApi-5] Halting due to unrecoverable
  I/O error while handling produce request:  (kafka.server.KafkaApis)
 
  kafka.common.KafkaStorageException: I/O exception in append to log
 'test-0'
 
  at kafka.log.Log.append(Log.scala:266)
 
  at
 
 kafka.cluster.Partition$$anonfun$appendMessagesToLeader$1.apply(Partition.scala:379)
 
  at
 
 kafka.cluster.Partition$$anonfun$appendMessagesToLeader$1.apply(Partition.scala:365)
 
  at kafka.utils.Utils$.inLock(Utils.scala:535)
 
  at kafka.utils.Utils$.inReadLock(Utils.scala:541)
 
  at
  kafka.cluster.Partition.appendMessagesToLeader(Partition.scala:365)
 
  at
 
 kafka.server.KafkaApis$$anonfun$appendToLocalLog$2.apply(KafkaApis.scala:291)
 
  at
 
 kafka.server.KafkaApis$$anonfun$appendToLocalLog$2.apply(KafkaApis.scala:282)
 
  at
 
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
 
  at
 
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
 
  at
 
 scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
 
  at
 
 scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
 
  at scala.coll
 
 
 
  Can you help me with this problem? Thanks.
 
  

Re: How to run the three producers test

2015-07-14 Thread Yuheng Du
I checked the logs on the brokers; it seems that the ZooKeeper or the Kafka
server process is not running on this broker... Thank you guys. I will see
if it happens again.
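
A quick sanity check on each node, assuming the default ports (9092 for the
broker, 2181 for ZooKeeper) and that broker-host/zk-host stand in for your
actual machines:

jps | grep -E 'Kafka|QuorumPeerMain'   # are the broker and ZooKeeper JVMs running?
nc -zv broker-host 9092                # is the broker port reachable?
nc -zv zk-host 2181                    # is the ZooKeeper port reachable?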

On Tue, Jul 14, 2015 at 4:53 PM, JIEFU GONG jg...@berkeley.edu wrote:

 Hmm..yeah some error logs would be nice like Gwen pointed out. Do any of
 your brokers fall out of the ISR when sending messages? It seems like your
 setup should be fine, so I'm not entirely sure.

 On Tue, Jul 14, 2015 at 1:31 PM, Yuheng Du yuheng.du.h...@gmail.com
 wrote:

  Jiefu,
 
  I am performing these tests on a 6 nodes cluster in cloudlab (a
  infrastructure built for scientific research). I use 2 nodes as
 producers,
  2 as brokers only, and 2 as consumers. I have tested for each individual
  machines and they work well. I did not use AWS. Thank you!
 
  On Tue, Jul 14, 2015 at 4:20 PM, JIEFU GONG jg...@berkeley.edu wrote:
 
   Yuheng, are you performing these tests locally or using a service such
 as
   AWS? I'd try using each separate machine individually first, connecting
  to
   the ZK/Kafka servers and ensuring that each is able to first log and
   consume messages independently.
  
   On Tue, Jul 14, 2015 at 1:17 PM, Gwen Shapira gshap...@cloudera.com
   wrote:
  
Are there any errors on the broker logs?
   
On Tue, Jul 14, 2015 at 11:57 AM, Yuheng Du 
 yuheng.du.h...@gmail.com
wrote:
 Jiefu,

 Thank you. The three producers can run at the same time. I mean
  should
they
 be started at exactly the same time? (I have three consoles from
 each
   of
 the three machines and I just start the console command manually
 one
  by
 one) Or a small variation of the starting time won't matter?

 Gwen and Jiefu,

 I have started the three producers at three machines, after a
 while,
   all
of
 them gives a java.net.ConnectException:

 [2015-07-14 12:56:46,352] WARN Error in I/O with producer0-link-0/
 192.168.1.1 (org.apache.kafka.common.network.Selector)

 java.net.ConnectException: Connection refused..

 [2015-07-14 12:56:48,056] WARN Error in I/O with producer1-link-0/
 192.168.1.2 (org.apache.kafka.common.network.Selector)

 java.net.ConnectException: Connection refused.

 What could be the cause?

 Thank you guys!




 On Tue, Jul 14, 2015 at 2:47 PM, JIEFU GONG jg...@berkeley.edu
   wrote:

 Yuheng,

 Yes, if you read the blog post it specifies that he's using three
separate
 machines. There's no reason the producers cannot be started at the
   same
 time, I believe.

 On Tue, Jul 14, 2015 at 11:42 AM, Yuheng Du 
  yuheng.du.h...@gmail.com
   
 wrote:

  Hi,
 
  I am running the performance test for kafka.
  https://gist.github.com/jkreps
  /c7ddb4041ef62a900e6c
 
  For the Three Producers, 3x async replication scenario, the
   command
is
  the same as one producer:
 
  bin/kafka-run-class.sh
org.apache.kafka.clients.tools.ProducerPerformance
  test 5000 100 -1 acks=1
  bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092
  buffer.memory=67108864 batch.size=8196
 
  So How to I run the test for three producers? Do I just run them
  on
three
  separate servers at the same time? Will there be some error in
  this
way
  since the three producers can't be started at the same time?
 
  Thanks.
 
  best,
  Yuheng
 



 --

 Jiefu Gong
 University of California, Berkeley | Class of 2017
 B.A Computer Science | College of Letters and Sciences

 jg...@berkeley.edu elise...@berkeley.edu | (925) 400-3427

   
  
  
  
   --
  
   Jiefu Gong
   University of California, Berkeley | Class of 2017
   B.A Computer Science | College of Letters and Sciences
  
   jg...@berkeley.edu elise...@berkeley.edu | (925) 400-3427
  
 



 --

 Jiefu Gong
 University of California, Berkeley | Class of 2017
 B.A Computer Science | College of Letters and Sciences

 jg...@berkeley.edu elise...@berkeley.edu | (925) 400-3427



Re: How to run the three producers test

2015-07-14 Thread Yuheng Du
Also, the log in another broker (not the bootstrap) says:

[2015-07-14 15:18:41,220] FATAL [Replica Manager on Broker 1]: Error
writing to highwatermark file:  (kafka.server.ReplicaManager)


[2015-07-14 15:18:40,160] ERROR Closing socket for /130.127.133.47 because
of error (kafka.network.Processor)

java.io.IOException: Connection reset by peer

at sun.nio.ch.FileDispatcherImpl.read0(Native Method)

at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)

at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)

at sun.nio.ch.IOUtil.read(IOUtil.java:197)

at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)

at kafka.utils.Utils$.read(Utils.scala:380)

at
kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)

at kafka.network.Processor.read(SocketServer.scala:444)

at kafka.network.Processor.run(SocketServer.scala:340)

at java.lang.Thread.run(Thread.java:745)



java.io.IOException: No space left on device

at java.io.FileOutputStream.writeBytes(Native Method)

at java.io.FileOutputStream.write(FileOutputStream.java:345)

at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)

at sun.nio.cs.StreamEncoder.implFlushBuffe

(END)

On Tue, Jul 14, 2015 at 5:24 PM, Yuheng Du yuheng.du.h...@gmail.com wrote:

 Hi Jiefu, Gwen,

 I am running the Throughput versus stored data test:
 bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance
 test 500 100 -1 acks=1 bootstrap.servers=
 esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864 batch.size=8196

 After around 50,000,000 messages were sent, I got a bunch of connection
 refused error as I mentioned before. I checked the logs on the broker and
 here is what I see:

 [2015-07-14 15:11:23,578] WARN Partition [test,4] on broker 5: No
 checkpointed highwatermark is found for partition [test,4]
 (kafka.cluster.Partition)

 [2015-07-14 15:12:33,298] INFO Rolled new log segment for 'test-4' in 4
 ms. (kafka.log.Log)

 [2015-07-14 15:12:33,299] INFO Rolled new log segment for 'test-0' in 1
 ms. (kafka.log.Log)

 [2015-07-14 15:13:39,529] INFO Rolled new log segment for 'test-4' in 1
 ms. (kafka.log.Log)

 [2015-07-14 15:13:39,531] INFO Rolled new log segment for 'test-0' in 1
 ms. (kafka.log.Log)

 [2015-07-14 15:14:48,502] INFO Rolled new log segment for 'test-4' in 3
 ms. (kafka.log.Log)

 [2015-07-14 15:14:48,502] INFO Rolled new log segment for 'test-0' in 1
 ms. (kafka.log.Log)

 [2015-07-14 15:15:51,478] INFO Rolled new log segment for 'test-4' in 1
 ms. (kafka.log.Log)

 [2015-07-14 15:15:51,479] INFO Rolled new log segment for 'test-0' in 1
 ms. (kafka.log.Log)

 [2015-07-14 15:16:52,589] INFO Rolled new log segment for 'test-4' in 1
 ms. (kafka.log.Log)

 [2015-07-14 15:16:52,590] INFO Rolled new log segment for 'test-0' in 1
 ms. (kafka.log.Log)

 [2015-07-14 15:17:57,406] INFO Rolled new log segment for 'test-4' in 1
 ms. (kafka.log.Log)

 [2015-07-14 15:17:57,407] INFO Rolled new log segment for 'test-0' in 0
 ms. (kafka.log.Log)

 [2015-07-14 15:18:39,792] FATAL [KafkaApi-5] Halting due to unrecoverable
 I/O error while handling produce request:  (kafka.server.KafkaApis)

 kafka.common.KafkaStorageException: I/O exception in append to log 'test-0'

 at kafka.log.Log.append(Log.scala:266)

 at
 kafka.cluster.Partition$$anonfun$appendMessagesToLeader$1.apply(Partition.scala:379)

 at
 kafka.cluster.Partition$$anonfun$appendMessagesToLeader$1.apply(Partition.scala:365)

 at kafka.utils.Utils$.inLock(Utils.scala:535)

 at kafka.utils.Utils$.inReadLock(Utils.scala:541)

 at
 kafka.cluster.Partition.appendMessagesToLeader(Partition.scala:365)

 at
 kafka.server.KafkaApis$$anonfun$appendToLocalLog$2.apply(KafkaApis.scala:291)

 at
 kafka.server.KafkaApis$$anonfun$appendToLocalLog$2.apply(KafkaApis.scala:282)

 at
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)

 at
 scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)

 at
 scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)

 at
 scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)

 at scala.coll



 Can you help me with this problem? Thanks.

 On Tue, Jul 14, 2015 at 5:12 PM, Yuheng Du yuheng.du.h...@gmail.com
 wrote:

 I checked the logs on the brokers, it seems that the zookeeper or the
 kafka server process is not running on this broker...Thank you guys. I will
 see if it happens again.

 On Tue, Jul 14, 2015 at 4:53 PM, JIEFU GONG jg...@berkeley.edu wrote:

 Hmm..yeah some error logs would be nice like Gwen pointed out. Do any of
 your brokers fall out of the ISR when sending messages? It seems like
 your
 setup should be fine, so I'm not entirely sure.

 On Tue, Jul 14, 2015 at 1:31 PM, 

Re: How to run the three producers test

2015-07-14 Thread Yuheng Du
Hi Jiefu, Gwen,

I am running the Throughput versus stored data test:

bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance
test 500 100 -1 acks=1
bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092
buffer.memory=67108864 batch.size=8196

After around 50,000,000 messages were sent, I got a bunch of connection
refused errors, as I mentioned before. I checked the logs on the broker and
here is what I see:

[2015-07-14 15:11:23,578] WARN Partition [test,4] on broker 5: No
checkpointed highwatermark is found for partition [test,4]
(kafka.cluster.Partition)

[2015-07-14 15:12:33,298] INFO Rolled new log segment for 'test-4' in 4 ms.
(kafka.log.Log)

[2015-07-14 15:12:33,299] INFO Rolled new log segment for 'test-0' in 1 ms.
(kafka.log.Log)

[2015-07-14 15:13:39,529] INFO Rolled new log segment for 'test-4' in 1 ms.
(kafka.log.Log)

[2015-07-14 15:13:39,531] INFO Rolled new log segment for 'test-0' in 1 ms.
(kafka.log.Log)

[2015-07-14 15:14:48,502] INFO Rolled new log segment for 'test-4' in 3 ms.
(kafka.log.Log)

[2015-07-14 15:14:48,502] INFO Rolled new log segment for 'test-0' in 1 ms.
(kafka.log.Log)

[2015-07-14 15:15:51,478] INFO Rolled new log segment for 'test-4' in 1 ms.
(kafka.log.Log)

[2015-07-14 15:15:51,479] INFO Rolled new log segment for 'test-0' in 1 ms.
(kafka.log.Log)

[2015-07-14 15:16:52,589] INFO Rolled new log segment for 'test-4' in 1 ms.
(kafka.log.Log)

[2015-07-14 15:16:52,590] INFO Rolled new log segment for 'test-0' in 1 ms.
(kafka.log.Log)

[2015-07-14 15:17:57,406] INFO Rolled new log segment for 'test-4' in 1 ms.
(kafka.log.Log)

[2015-07-14 15:17:57,407] INFO Rolled new log segment for 'test-0' in 0 ms.
(kafka.log.Log)

[2015-07-14 15:18:39,792] FATAL [KafkaApi-5] Halting due to unrecoverable
I/O error while handling produce request:  (kafka.server.KafkaApis)

kafka.common.KafkaStorageException: I/O exception in append to log 'test-0'

at kafka.log.Log.append(Log.scala:266)

at
kafka.cluster.Partition$$anonfun$appendMessagesToLeader$1.apply(Partition.scala:379)

at
kafka.cluster.Partition$$anonfun$appendMessagesToLeader$1.apply(Partition.scala:365)

at kafka.utils.Utils$.inLock(Utils.scala:535)

at kafka.utils.Utils$.inReadLock(Utils.scala:541)

at
kafka.cluster.Partition.appendMessagesToLeader(Partition.scala:365)

at
kafka.server.KafkaApis$$anonfun$appendToLocalLog$2.apply(KafkaApis.scala:291)

at
kafka.server.KafkaApis$$anonfun$appendToLocalLog$2.apply(KafkaApis.scala:282)

at
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)

at
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)

at
scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)

at
scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)

at scala.coll



Can you help me with this problem? Thanks.

On Tue, Jul 14, 2015 at 5:12 PM, Yuheng Du yuheng.du.h...@gmail.com wrote:

 I checked the logs on the brokers, it seems that the zookeeper or the
 kafka server process is not running on this broker...Thank you guys. I will
 see if it happens again.

 On Tue, Jul 14, 2015 at 4:53 PM, JIEFU GONG jg...@berkeley.edu wrote:

 Hmm..yeah some error logs would be nice like Gwen pointed out. Do any of
 your brokers fall out of the ISR when sending messages? It seems like your
 setup should be fine, so I'm not entirely sure.

 On Tue, Jul 14, 2015 at 1:31 PM, Yuheng Du yuheng.du.h...@gmail.com
 wrote:

  Jiefu,
 
  I am performing these tests on a 6 nodes cluster in cloudlab (a
  infrastructure built for scientific research). I use 2 nodes as
 producers,
  2 as brokers only, and 2 as consumers. I have tested for each individual
  machines and they work well. I did not use AWS. Thank you!
 
  On Tue, Jul 14, 2015 at 4:20 PM, JIEFU GONG jg...@berkeley.edu wrote:
 
   Yuheng, are you performing these tests locally or using a service
 such as
   AWS? I'd try using each separate machine individually first,
 connecting
  to
   the ZK/Kafka servers and ensuring that each is able to first log and
   consume messages independently.
  
   On Tue, Jul 14, 2015 at 1:17 PM, Gwen Shapira gshap...@cloudera.com
   wrote:
  
Are there any errors on the broker logs?
   
On Tue, Jul 14, 2015 at 11:57 AM, Yuheng Du 
 yuheng.du.h...@gmail.com
wrote:
 Jiefu,

 Thank you. The three producers can run at the same time. I mean
  should
they
 be started at exactly the same time? (I have three consoles from
 each
   of
 the three machines and I just start the console command manually
 one
  by
 one) Or a small variation of the starting time won't matter?

 Gwen and Jiefu,

 I have started the three producers at three machines, after a
 while,
   all
of
 them gives a java.net.ConnectException:

 [2015-07-14 12:56:46,352] WARN Error in I/O with 

Re: How to run the three producers test

2015-07-14 Thread Yuheng Du
But is there a way to let Kafka overwrite the old data if the disk gets
filled, or is it just not necessary to use a figure this large? Thanks.
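
Kafka won't overwrite data in place, but you can cap retention so old log
segments are deleted before the disk fills. A sketch of the relevant broker
settings in server.properties; the values below are only illustrative and
would need tuning for this test:

log.cleanup.policy=delete          # delete old segments rather than compact them
log.retention.hours=24             # time-based retention
log.retention.bytes=107374182400   # size-based retention, roughly 100 GB per partition
log.segment.bytes=1073741824       # 1 GB segments so retention can reclaim space sooner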

On Tue, Jul 14, 2015 at 10:14 PM, Yuheng Du yuheng.du.h...@gmail.com
wrote:

 Jiefu,

 I agree with you. I checked the hardware specs of my machines; each one of
 them has:

 RAM: 256 GB ECC memory (16x 16 GB DDR4 1600 MT/s dual-rank RDIMMs)

 Disk: two 1 TB 7.2K RPM 3G SATA HDDs

 For the throughput versus stored data test, it uses 5*10^10 messages, which
 have a total volume of 5 TB. I made the replication factor 3, which means
 the total size including replicas would be 15 TB, and that apparently
 overwhelmed the two brokers I use.

 Thanks.

 best,
 Yuheng

 On Tue, Jul 14, 2015 at 6:06 PM, JIEFU GONG jg...@berkeley.edu wrote:

 Someone may correct me if I am incorrect, but how much disk space do you
 have on these nodes? Your exception 'No space left on device' from one of
 your brokers seems to suggest that you're full (after all you're writing
 500 million records). If this is the case I believe the expected behavior
 for Kafka is to reject any more attempts to write data?

 On Tue, Jul 14, 2015 at 2:27 PM, Yuheng Du yuheng.du.h...@gmail.com
 wrote:

  Also, the log in another broker (not the bootstrap) says:
 
  [2015-07-14 15:18:41,220] FATAL [Replica Manager on Broker 1]: Error
  writing to highwatermark file:  (kafka.server.ReplicaManager)
 
 
  [2015-07-14 15:18:40,160] ERROR Closing socket for /130.127.133.47
 because
  of error (kafka.network.Process
 
  or)
 
  java.io.IOException: Connection reset by peer
 
  at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
 
  at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
 
  at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
 
  at sun.nio.ch.IOUtil.read(IOUtil.java:197)
 
  at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
 
  at kafka.utils.Utils$.read(Utils.scala:380)
 
  at
 
 
 kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
 
  at kafka.network.Processor.read(SocketServer.scala:444)
 
  at kafka.network.Processor.run(SocketServer.scala:340)
 
  at java.lang.Thread.run(Thread.java:745)
 
  
 
  java.io.IOException: No space left on device
 
  at java.io.FileOutputStream.writeBytes(Native Method)
 
  at java.io.FileOutputStream.write(FileOutputStream.java:345)
 
  at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
 
  at sun.nio.cs.StreamEncoder.implFlushBuffe
 
  (END)
 
  On Tue, Jul 14, 2015 at 5:24 PM, Yuheng Du yuheng.du.h...@gmail.com
  wrote:
 
   Hi Jiefu, Gwen,
  
   I am running the Throughput versus stored data test:
   bin/kafka-run-class.sh
 org.apache.kafka.clients.tools.ProducerPerformance
   test 500 100 -1 acks=1 bootstrap.servers=
   esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864
  batch.size=8196
  
   After around 50,000,000 messages were sent, I got a bunch of
 connection
   refused error as I mentioned before. I checked the logs on the broker
 and
   here is what I see:
  
   [2015-07-14 15:11:23,578] WARN Partition [test,4] on broker 5: No
   checkpointed highwatermark is found for partition [test,4]
   (kafka.cluster.Partition)
  
   [2015-07-14 15:12:33,298] INFO Rolled new log segment for 'test-4' in
 4
   ms. (kafka.log.Log)
  
   [2015-07-14 15:12:33,299] INFO Rolled new log segment for 'test-0' in
 1
   ms. (kafka.log.Log)
  
   [2015-07-14 15:13:39,529] INFO Rolled new log segment for 'test-4' in
 1
   ms. (kafka.log.Log)
  
   [2015-07-14 15:13:39,531] INFO Rolled new log segment for 'test-0' in
 1
   ms. (kafka.log.Log)
  
   [2015-07-14 15:14:48,502] INFO Rolled new log segment for 'test-4' in
 3
   ms. (kafka.log.Log)
  
   [2015-07-14 15:14:48,502] INFO Rolled new log segment for 'test-0' in
 1
   ms. (kafka.log.Log)
  
   [2015-07-14 15:15:51,478] INFO Rolled new log segment for 'test-4' in
 1
   ms. (kafka.log.Log)
  
   [2015-07-14 15:15:51,479] INFO Rolled new log segment for 'test-0' in
 1
   ms. (kafka.log.Log)
  
   [2015-07-14 15:16:52,589] INFO Rolled new log segment for 'test-4' in
 1
   ms. (kafka.log.Log)
  
   [2015-07-14 15:16:52,590] INFO Rolled new log segment for 'test-0' in
 1
   ms. (kafka.log.Log)
  
   [2015-07-14 15:17:57,406] INFO Rolled new log segment for 'test-4' in
 1
   ms. (kafka.log.Log)
  
   [2015-07-14 15:17:57,407] INFO Rolled new log segment for 'test-0' in
 0
   ms. (kafka.log.Log)
  
   [2015-07-14 15:18:39,792] FATAL [KafkaApi-5] Halting due to
 unrecoverable
   I/O error while handling produce request:  (kafka.server.KafkaApis)
  
   kafka.common.KafkaStorageException: I/O exception in append to log
  'test-0'
  
   at kafka.log.Log.append(Log.scala:266)
  
   at
  
 
 kafka.cluster.Partition$$anonfun$appendMessagesToLeader$1.apply(Partition.scala:379)
  
   at
  
 
 

Re: How to run the three producers test

2015-07-14 Thread Yuheng Du
Jiefu,

I agree with you. I checked the hardware specs of my machines; each one of
them has:

RAM: 256 GB ECC memory (16x 16 GB DDR4 1600 MT/s dual-rank RDIMMs)

Disk: two 1 TB 7.2K RPM 3G SATA HDDs

For the throughput versus stored data test, it uses 5*10^10 messages, which
have a total volume of 5 TB. I made the replication factor 3, which means
the total size including replicas would be 15 TB, and that apparently
overwhelmed the two brokers I use.
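
For what it's worth, the back-of-envelope numbers (assuming 100-byte records,
replication factor 3, and the two 1 TB drives per broker listed above):

echo $(( 50000000000 * 100 * 3 ))   # 15000000000000 bytes to store, roughly 15 TB
echo $(( 2 * 2 * 1000000000000 ))   # 4000000000000 bytes of raw disk across 2 brokers, roughly 4 TB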

Thanks.

best,
Yuheng

On Tue, Jul 14, 2015 at 6:06 PM, JIEFU GONG jg...@berkeley.edu wrote:

 Someone may correct me if I am incorrect, but how much disk space do you
 have on these nodes? Your exception 'No space left on device' from one of
 your brokers seems to suggest that you're full (after all you're writing
 500 million records). If this is the case I believe the expected behavior
 for Kafka is to reject any more attempts to write data?

 On Tue, Jul 14, 2015 at 2:27 PM, Yuheng Du yuheng.du.h...@gmail.com
 wrote:

  Also, the log in another broker (not the bootstrap) says:
 
  [2015-07-14 15:18:41,220] FATAL [Replica Manager on Broker 1]: Error
  writing to highwatermark file:  (kafka.server.ReplicaManager)
 
 
  [2015-07-14 15:18:40,160] ERROR Closing socket for /130.127.133.47
 because
  of error (kafka.network.Process
 
  or)
 
  java.io.IOException: Connection reset by peer
 
  at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
 
  at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
 
  at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
 
  at sun.nio.ch.IOUtil.read(IOUtil.java:197)
 
  at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
 
  at kafka.utils.Utils$.read(Utils.scala:380)
 
  at
 
 
 kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
 
  at kafka.network.Processor.read(SocketServer.scala:444)
 
  at kafka.network.Processor.run(SocketServer.scala:340)
 
  at java.lang.Thread.run(Thread.java:745)
 
  
 
  java.io.IOException: No space left on device
 
  at java.io.FileOutputStream.writeBytes(Native Method)
 
  at java.io.FileOutputStream.write(FileOutputStream.java:345)
 
  at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
 
  at sun.nio.cs.StreamEncoder.implFlushBuffe
 
  (END)
 
  On Tue, Jul 14, 2015 at 5:24 PM, Yuheng Du yuheng.du.h...@gmail.com
  wrote:
 
   Hi Jiefu, Gwen,
  
   I am running the Throughput versus stored data test:
   bin/kafka-run-class.sh
 org.apache.kafka.clients.tools.ProducerPerformance
   test 500 100 -1 acks=1 bootstrap.servers=
   esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864
  batch.size=8196
  
   After around 50,000,000 messages were sent, I got a bunch of connection
   refused error as I mentioned before. I checked the logs on the broker
 and
   here is what I see:
  
   [2015-07-14 15:11:23,578] WARN Partition [test,4] on broker 5: No
   checkpointed highwatermark is found for partition [test,4]
   (kafka.cluster.Partition)
  
   [2015-07-14 15:12:33,298] INFO Rolled new log segment for 'test-4' in 4
   ms. (kafka.log.Log)
  
   [2015-07-14 15:12:33,299] INFO Rolled new log segment for 'test-0' in 1
   ms. (kafka.log.Log)
  
   [2015-07-14 15:13:39,529] INFO Rolled new log segment for 'test-4' in 1
   ms. (kafka.log.Log)
  
   [2015-07-14 15:13:39,531] INFO Rolled new log segment for 'test-0' in 1
   ms. (kafka.log.Log)
  
   [2015-07-14 15:14:48,502] INFO Rolled new log segment for 'test-4' in 3
   ms. (kafka.log.Log)
  
   [2015-07-14 15:14:48,502] INFO Rolled new log segment for 'test-0' in 1
   ms. (kafka.log.Log)
  
   [2015-07-14 15:15:51,478] INFO Rolled new log segment for 'test-4' in 1
   ms. (kafka.log.Log)
  
   [2015-07-14 15:15:51,479] INFO Rolled new log segment for 'test-0' in 1
   ms. (kafka.log.Log)
  
   [2015-07-14 15:16:52,589] INFO Rolled new log segment for 'test-4' in 1
   ms. (kafka.log.Log)
  
   [2015-07-14 15:16:52,590] INFO Rolled new log segment for 'test-0' in 1
   ms. (kafka.log.Log)
  
   [2015-07-14 15:17:57,406] INFO Rolled new log segment for 'test-4' in 1
   ms. (kafka.log.Log)
  
   [2015-07-14 15:17:57,407] INFO Rolled new log segment for 'test-0' in 0
   ms. (kafka.log.Log)
  
   [2015-07-14 15:18:39,792] FATAL [KafkaApi-5] Halting due to
 unrecoverable
   I/O error while handling produce request:  (kafka.server.KafkaApis)
  
   kafka.common.KafkaStorageException: I/O exception in append to log
  'test-0'
  
   at kafka.log.Log.append(Log.scala:266)
  
   at
  
 
 kafka.cluster.Partition$$anonfun$appendMessagesToLeader$1.apply(Partition.scala:379)
  
   at
  
 
 kafka.cluster.Partition$$anonfun$appendMessagesToLeader$1.apply(Partition.scala:365)
  
   at kafka.utils.Utils$.inLock(Utils.scala:535)
  
   at kafka.utils.Utils$.inReadLock(Utils.scala:541)
  
   at
   

Re: How to run the three producers test

2015-07-14 Thread Yuheng Du
Jiefu,

Now even when there is enough disk space (less than 18% used), when I run
the test it still gives me an error; in the logs it says:

[2015-07-14 23:08:48,735] FATAL Fatal error during KafkaServerStartable
startup. Prepare to shutdown (kafka.server.KafkaServerStartable)

org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to
zookeeper server within timeout: 6000

at org.I0Itec.zkclient.ZkClient.connect(ZkClient.java:880)

at org.I0Itec.zkclient.ZkClient.init(ZkClient.java:98)

at org.I0Itec.zkclient.ZkClient.init(ZkClient.java:84)

at kafka.server.KafkaServer.initZk(KafkaServer.scala:157)

at kafka.server.KafkaServer.startup(KafkaServer.scala:82)

at
kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:29)

at kafka.Kafka$.main(Kafka.scala:46)

at kafka.Kafka.main(Kafka.scala)

[2015-07-14 23:08:48,737] INFO [Kafka Server 1], shutting down
(kafka.server.KafkaServer)

I have checked that ZooKeeper is running fine. Can anyone help me figure out
why I get this error? Thanks.
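
A few things worth checking, assuming ZooKeeper on the default port 2181 and
the broker config at config/server.properties (zk-host below is a placeholder
for your actual ZooKeeper machine):

grep zookeeper config/server.properties   # is zookeeper.connect pointing at the right host:port?
echo ruok | nc zk-host 2181               # a healthy ZooKeeper answers "imok"
# If the connection is just slow, zookeeper.connection.timeout.ms (6000 in the
# shipped config, matching the timeout in the log above) can be raised.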

On Tue, Jul 14, 2015 at 10:24 PM, Yuheng Du yuheng.du.h...@gmail.com
wrote:

 But is there a way to let kafka override the old data if the disk is
 filled? Or is it not necessary to use this figure? Thanks.

 On Tue, Jul 14, 2015 at 10:14 PM, Yuheng Du yuheng.du.h...@gmail.com
 wrote:

 Jiefu,

 I agree with you. I checked the hardware specs of my machines, each one
 of them has:

 RAM



 256GB ECC Memory (16x 16 GB DDR4 1600MT/s dual rank RDIMMs

 Disk



 Two 1 TB 7.2K RPM 3G SATA HDDs

 For the throughput versus stored data test, it uses 5*10^10 messages,
 which has the total volume of 5TB, I made the replication factor to be 3,
 which means the total size including replicas would be 15TB, which
 apparently overwhelmed the two brokers I use.

 Thanks.

 best,
 Yuheng

 On Tue, Jul 14, 2015 at 6:06 PM, JIEFU GONG jg...@berkeley.edu wrote:

 Someone may correct me if I am incorrect, but how much disk space do you
 have on these nodes? Your exception 'No space left on device' from one of
 your brokers seems to suggest that you're full (after all you're writing
 500 million records). If this is the case I believe the expected behavior
 for Kafka is to reject any more attempts to write data?

 On Tue, Jul 14, 2015 at 2:27 PM, Yuheng Du yuheng.du.h...@gmail.com
 wrote:

  Also, the log in another broker (not the bootstrap) says:
 
  [2015-07-14 15:18:41,220] FATAL [Replica Manager on Broker 1]: Error
  writing to highwatermark file:  (kafka.server.ReplicaManager)
 
 
  [2015-07-14 15:18:40,160] ERROR Closing socket for /130.127.133.47
 because
  of error (kafka.network.Process
 
  or)
 
  java.io.IOException: Connection reset by peer
 
  at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
 
  at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
 
  at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
 
  at sun.nio.ch.IOUtil.read(IOUtil.java:197)
 
  at
 sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
 
  at kafka.utils.Utils$.read(Utils.scala:380)
 
  at
 
 
 kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
 
  at kafka.network.Processor.read(SocketServer.scala:444)
 
  at kafka.network.Processor.run(SocketServer.scala:340)
 
  at java.lang.Thread.run(Thread.java:745)
 
  
 
  java.io.IOException: No space left on device
 
  at java.io.FileOutputStream.writeBytes(Native Method)
 
  at java.io.FileOutputStream.write(FileOutputStream.java:345)
 
  at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
 
  at sun.nio.cs.StreamEncoder.implFlushBuffe
 
  (END)
 
  On Tue, Jul 14, 2015 at 5:24 PM, Yuheng Du yuheng.du.h...@gmail.com
  wrote:
 
   Hi Jiefu, Gwen,
  
   I am running the Throughput versus stored data test:
   bin/kafka-run-class.sh
 org.apache.kafka.clients.tools.ProducerPerformance
   test 500 100 -1 acks=1 bootstrap.servers=
   esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864
  batch.size=8196
  
   After around 50,000,000 messages were sent, I got a bunch of
 connection
   refused error as I mentioned before. I checked the logs on the
 broker and
   here is what I see:
  
   [2015-07-14 15:11:23,578] WARN Partition [test,4] on broker 5: No
   checkpointed highwatermark is found for partition [test,4]
   (kafka.cluster.Partition)
  
   [2015-07-14 15:12:33,298] INFO Rolled new log segment for 'test-4'
 in 4
   ms. (kafka.log.Log)
  
   [2015-07-14 15:12:33,299] INFO Rolled new log segment for 'test-0'
 in 1
   ms. (kafka.log.Log)
  
   [2015-07-14 15:13:39,529] INFO Rolled new log segment for 'test-4'
 in 1
   ms. (kafka.log.Log)
  
   [2015-07-14 15:13:39,531] INFO Rolled new log segment for 'test-0'
 in 1
   ms. (kafka.log.Log)
  
   [2015-07-14 15:14:48,502] INFO Rolled new log segment for 'test-4'
 in 3
   ms. (kafka.log.Log)
  
   [2015-07-14 

Re: How to run the three producers test

2015-07-14 Thread JIEFU GONG
Yuheng,

Yes, if you read the blog post it specifies that he's using three separate
machines. There's no reason the producers cannot be started at the same
time, I believe.

On Tue, Jul 14, 2015 at 11:42 AM, Yuheng Du yuheng.du.h...@gmail.com
wrote:

 Hi,

 I am running the performance test for kafka:
 https://gist.github.com/jkreps/c7ddb4041ef62a900e6c

 For the Three Producers, 3x async replication scenario, the command is
 the same as for one producer:

 bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance
 test 5000 100 -1 acks=1
 bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092
 buffer.memory=67108864 batch.size=8196

 So how do I run the test for three producers? Do I just run them on three
 separate servers at the same time? Will there be errors if I do it this
 way, since the three producers can't be started at exactly the same time?

 Thanks.

 best,
 Yuheng




-- 

Jiefu Gong
University of California, Berkeley | Class of 2017
B.A Computer Science | College of Letters and Sciences

jg...@berkeley.edu elise...@berkeley.edu | (925) 400-3427


Re: How to run the three producers test

2015-07-14 Thread Gwen Shapira
You need to run 3 of those at the same time. We don't expect any
errors, but if you run into anything, let us know and we'll try to
help.
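
For example, one way to kick all three off at roughly the same time from a
single control node (the producer host names and the Kafka install path are
placeholders; the command itself is the one quoted earlier in the thread):

for h in producer0 producer1 producer2; do
  ssh "$h" 'cd /path/to/kafka && bin/kafka-run-class.sh \
      org.apache.kafka.clients.tools.ProducerPerformance \
      test 5000 100 -1 acks=1 \
      bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092 \
      buffer.memory=67108864 batch.size=8196' &
done
wait   # returns once all three producers have exited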

Gwen

On Tue, Jul 14, 2015 at 11:42 AM, Yuheng Du yuheng.du.h...@gmail.com wrote:
 Hi,

 I am running the performance test for kafka:
 https://gist.github.com/jkreps/c7ddb4041ef62a900e6c

 For the Three Producers, 3x async replication scenario, the command is
 the same as for one producer:

 bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance
 test 5000 100 -1 acks=1
 bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092
 buffer.memory=67108864 batch.size=8196

 So how do I run the test for three producers? Do I just run them on three
 separate servers at the same time? Will there be errors if I do it this
 way, since the three producers can't be started at exactly the same time?

 Thanks.

 best,
 Yuheng


Re: How to run the three producers test

2015-07-14 Thread Yuheng Du
Jiefu,

Thank you. I understand the three producers can run at the same time, but
should they be started at exactly the same time? (I have three consoles, one
for each of the three machines, and I just start the command manually one by
one.) Or does a small variation in the starting time not matter?

Gwen and Jiefu,

I have started the three producers on three machines; after a while, all of
them give a java.net.ConnectException:

[2015-07-14 12:56:46,352] WARN Error in I/O with producer0-link-0/192.168.1.1
(org.apache.kafka.common.network.Selector)

java.net.ConnectException: Connection refused.

[2015-07-14 12:56:48,056] WARN Error in I/O with producer1-link-0/192.168.1.2
(org.apache.kafka.common.network.Selector)

java.net.ConnectException: Connection refused.

What could be the cause?

Thank you guys!




On Tue, Jul 14, 2015 at 2:47 PM, JIEFU GONG jg...@berkeley.edu wrote:

 Yuheng,

 Yes, if you read the blog post it specifies that he's using three separate
 machines. There's no reason the producers cannot be started at the same
 time, I believe.

 On Tue, Jul 14, 2015 at 11:42 AM, Yuheng Du yuheng.du.h...@gmail.com
 wrote:

  Hi,
 
  I am running the performance test for kafka:
  https://gist.github.com/jkreps/c7ddb4041ef62a900e6c

  For the Three Producers, 3x async replication scenario, the command is
  the same as for one producer:

  bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance
  test 5000 100 -1 acks=1
  bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092
  buffer.memory=67108864 batch.size=8196

  So how do I run the test for three producers? Do I just run them on three
  separate servers at the same time? Will there be errors if I do it this
  way, since the three producers can't be started at exactly the same time?
 
  Thanks.
 
  best,
  Yuheng
 



 --

 Jiefu Gong
 University of California, Berkeley | Class of 2017
 B.A Computer Science | College of Letters and Sciences

 jg...@berkeley.edu elise...@berkeley.edu | (925) 400-3427