Hey guys,

We're trying to deploy Kafka 0.7 on EC2. According to a thread [1], the poster there was getting 20,000 messages/sec on both EBS and local disk, at a message size of 1000. Our messages are 2K-6K each, at a rate of 5,000 messages/sec and growing, so we ran some tests to see how Kafka handles this. My setup is one m1.large server running ZooKeeper and the Kafka server, and another m1.large server running the perf tests.
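For reference, here is a back-of-envelope sketch of the throughput that load implies. It assumes "2K-6K" means 2,000-6,000 bytes per message and counts 1 MB as 10^6 bytes; the function name is just for illustration.

```python
# Rough ingest requirement implied by the message sizes and rate quoted above.
# Assumption: "2K-6K" is read as 2,000-6,000 bytes; 1 MB = 10**6 bytes.
MESSAGES_PER_SEC = 5_000


def required_mb_per_sec(message_bytes: int, rate: int = MESSAGES_PER_SEC) -> float:
    """Sustained throughput in MB/s needed for messages of the given size."""
    return message_bytes * rate / 1e6


low = required_mb_per_sec(2_000)
high = required_mb_per_sec(6_000)
print(f"need roughly {low:.0f}-{high:.0f} MB/s sustained")
```

So at today's rate the broker has to sustain somewhere between 10 and 30 MB/s, depending on where in that size range the messages fall.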

For the producer test, I ran:

    bin/kafka-producer-perf-test.sh --async --batch-size 200 --brokerinfo zk.connect=[REDACTED] --compression-codec 0 --message-size 3000 --messages 5000000 --topic elben-perf-test-2 --vary-message-size

And the results: https://gist.github.com/dc5e9cce497807d578d9

There are some weird results, like these two consecutive lines:

    INFO thread 8: 495000 messages sent 14124.2938 nMsg/sec 19.9273 MBs/sec
    INFO thread 8: 500000 messages sent 21459.2275 nMsg/sec 30.8321 MBs/sec

Any ideas what's happening here? Are the perf tests miscalculating the running average? Either way, I think the correct conclusion is that it produced 7496644565 bytes in 369 seconds, or roughly 20 MB/s.

Running the producer with --compression-codec 1 (gzip), I get:

    bin/producer-perf-test.sh --async --batch-size 200 --brokerinfo zk.connect=kafka1.i.massrel.com --compression-codec 2 --message-size 3000 --messages 1000000 --topic elben-perf-test-3 --vary-message-size

    INFO Total Num Messages: 1000000 bytes: 1500536347 in 126.447 secs (kafka.tools.ProducerPerformance$)
    INFO Messages/sec: 7908.4518 (kafka.tools.ProducerPerformance$)
    INFO MB/sec: 11.3172 (kafka.tools.ProducerPerformance$)

For the consumer test, I ran:

    bin/kafka-consumer-perf-test.sh --props config/consumer.properties --topic elben-perf-test-2 --threads 10

With these results: https://gist.github.com/654093bd70571d21fb34

Again, there are weird things: why are the other threads consuming 0 MB/s while only thread 7 is doing 6.9 MB/s?

Is anyone else getting similar results? We need to consume at least 10 MB/s, so I suppose it would be best to use partitions and multiple consumers if we're seeing only 6 MB/s on a dumb consumer with 10 threads. Any suggestions or ideas?

I've had lots of fun with Kafka and hope to be able to use it!

Elben

[1] http://mail-archives.apache.org/mod_mbox/incubator-kafka-users/201202.mbox/%3CCADWPM3jzgMZmc57HYb55PX=geat6d6wzbvowvrmem4dw3tt...@mail.gmail.com%3E
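
P.S. A quick sanity check on the producer totals quoted above, as a sketch rather than anything authoritative. It only redoes the division on the reported byte counts and durations; the helper name and dictionary keys are made up for illustration. The one assumption worth flagging is the unit: the tool's "MB/sec" figure seems to line up with bytes divided by 2^20 rather than 10^6, judging from the compressed run.

```python
# Recompute throughput from the totals printed by the producer perf runs.
# The inputs are the numbers quoted in this post; nothing here calls Kafka.
MIB = 1024 * 1024


def throughput(total_bytes: int, seconds: float, messages: int) -> dict:
    """Derive rates from a run's total bytes, wall-clock time, and message count."""
    bytes_per_sec = total_bytes / seconds
    return {
        "msgs_per_sec": messages / seconds,
        "mb_per_sec": bytes_per_sec / 1e6,    # decimal megabytes
        "mib_per_sec": bytes_per_sec / MIB,   # binary megabytes (2**20 bytes)
        "avg_msg_bytes": total_bytes / messages,
    }


uncompressed = throughput(7_496_644_565, 369, 5_000_000)
compressed = throughput(1_500_536_347, 126.447, 1_000_000)

for name, run in (("uncompressed", uncompressed), ("compressed", compressed)):
    print(name, {key: round(value, 2) for key, value in run.items()})
```

For the compressed run this reproduces the reported 7908.45 messages/sec and, in 2^20-byte units, the reported 11.32 MB/sec. For the uncompressed run it gives about 20.3 MB/s in decimal units, or about 19.4 in the tool's apparent unit. In both runs the average message comes out near 1,500 bytes.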