I implemented (nearly) the same basic set of tests in the system test
framework we started at Confluent and that is going to move into Kafka --
see the WIP patch for KIP-25 here: https://github.com/apache/kafka/pull/70
In particular, that test is implemented in benchmark_test.py:
https://github.com/apache/kafka/pull/70/files#diff-ca984778cf9943407645eb6784f19dc8

Hopefully once that's merged people can reuse that benchmark (and add to
it!) so they can easily run the same benchmarks across different hardware.
Here are some results from an older version of that test on m3.2xlarge
instances on EC2 using local ephemeral storage (I think... it's been a while
since I ran these numbers and I didn't document the methodology that carefully):

INFO:_.KafkaBenchmark:=================
INFO:_.KafkaBenchmark:BENCHMARK RESULTS
INFO:_.KafkaBenchmark:=================
INFO:_.KafkaBenchmark:Single producer, no replication: 684097.470208 rec/sec (65.240000 MB/s)
INFO:_.KafkaBenchmark:Single producer, async 3x replication: 667494.359673 rec/sec (63.660000 MB/s)
INFO:_.KafkaBenchmark:Single producer, sync 3x replication: 116485.764275 rec/sec (11.110000 MB/s)
INFO:_.KafkaBenchmark:Three producers, async 3x replication: 1696519.022182 rec/sec (161.790000 MB/s)
INFO:_.KafkaBenchmark:Message size:
INFO:_.KafkaBenchmark: 10: 1637825.195625 rec/sec (15.620000 MB/s)
INFO:_.KafkaBenchmark: 100: 605504.877911 rec/sec (57.750000 MB/s)
INFO:_.KafkaBenchmark: 1000: 90351.817570 rec/sec (86.170000 MB/s)
INFO:_.KafkaBenchmark: 10000: 8306.180862 rec/sec (79.210000 MB/s)
INFO:_.KafkaBenchmark: 100000: 978.403499 rec/sec (93.310000 MB/s)
INFO:_.KafkaBenchmark:Throughput over long run, data > memory:
INFO:_.KafkaBenchmark: Time block 0: 684725.151324 rec/sec (65.300000 MB/s)
INFO:_.KafkaBenchmark:Single consumer: 701031.140000 rec/sec (56.830500 MB/s)
INFO:_.KafkaBenchmark:Three consumers: 3304011.014900 rec/sec (267.830800 MB/s)
INFO:_.KafkaBenchmark:Producer + consumer:
INFO:_.KafkaBenchmark: Producer: 624984.375391 rec/sec (59.600000 MB/s)
INFO:_.KafkaBenchmark: Consumer: 624984.375391 rec/sec (59.600000 MB/s)
INFO:_.KafkaBenchmark:End-to-end latency: median 2.000000 ms, 99% 4.000000 ms, 99.9% 19.000000 ms
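
As a sanity check when reading output like this: the MB/s figures on the producer lines match rec/sec multiplied by the record size and divided by 1024*1024. The 100-byte default record size below is inferred from that arithmetic, not stated in the output, and the function name is made up for illustration. The consumer lines do not fit the same 100-byte assumption, so this sketch only covers the producer numbers.

```python
# Sketch only: reproduces the MB/s column of the producer lines above.
# Assumption (inferred, not stated in the log): 1 MB = 1024 * 1024 bytes,
# and tests that do not vary the message size use 100-byte records.
BYTES_PER_MB = 1024 * 1024


def records_to_mb_per_sec(records_per_sec, record_size_bytes=100):
    """Convert a record rate into MB/s for a fixed record size."""
    return records_per_sec * record_size_bytes / BYTES_PER_MB


if __name__ == "__main__":
    # Single producer, no replication (assumed 100-byte records).
    print("%.2f MB/s" % records_to_mb_per_sec(684097.470208))
    # "Message size" rows: the size in bytes is the label on each row.
    for size, rate in [(10, 1637825.195625), (1000, 90351.817570)]:
        print("%6d bytes: %.2f MB/s" % (size, records_to_mb_per_sec(rate, size)))
```

The same conversion makes it easy to compare runs that report only one of the two units.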

Don't trust these numbers for anything; they were a quick one-off test. I'm
just pasting the output so you get some idea of what the results might look
like. Once we merge the KIP-25 patch, Confluent will be running the tests
regularly and results will be available publicly so we'll be able to keep
better tabs on performance, albeit for only a specific class of hardware.

For the batch.size question -- I'm not sure the results in the blog post
actually used different settings; it could be accidental divergence between
the script and the blog post. The post specifically notes that tuning the
batch size in the synchronous case might help, but that Jay didn't do that.
If you're trying to benchmark the *optimal* throughput, tuning the batch
size would make sense. Since synchronous replication will have higher
latency and there's a limit to how many requests can be in flight at once,
you'll want a larger batch size to compensate for the additional latency.
However, in practice the increase you see may be negligible. Somebody who
has spent more time fiddling with tweaking producer performance may have
more insight.
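
To make the in-flight argument concrete, here is a rough back-of-the-envelope sketch. It assumes a single connection, exactly one full batch per request, and a fixed round-trip time per request; real producers can pack several batches into one request, so treat the result as a crude ceiling rather than a prediction. The function names and the latency values are made up for illustration; 16384 bytes and 5 are what I believe the producer defaults are for batch.size and max.in.flight.requests.per.connection.

```python
# Rough model, not a measurement: with at most `max_in_flight` requests
# outstanding and each request taking `round_trip_sec` to be acknowledged,
# at most max_in_flight * batch_size_bytes can be sent per round trip.
# Assumptions: one connection, one full batch per request, constant latency.


def max_throughput_bytes_per_sec(batch_size_bytes, max_in_flight, round_trip_sec):
    """Upper bound on bytes/sec for one connection under the model above."""
    return max_in_flight * batch_size_bytes / round_trip_sec


def batch_size_for_target(target_bytes_per_sec, max_in_flight, round_trip_sec):
    """Batch size needed to reach a target rate under the same model."""
    return target_bytes_per_sec * round_trip_sec / max_in_flight


if __name__ == "__main__":
    batch = 16384   # believed default batch.size, in bytes
    in_flight = 5   # believed default max.in.flight.requests.per.connection
    # Hypothetical latencies: 2 ms (async-like) vs 10 ms (sync-like).
    for rtt in (0.002, 0.010):
        bound = max_throughput_bytes_per_sec(batch, in_flight, rtt)
        print("rtt=%.3fs -> at most %.1f MB/s" % (rtt, bound / (1024 * 1024)))
    # Batch size needed to keep the 2 ms ceiling when latency rises to 10 ms.
    target = max_throughput_bytes_per_sec(batch, in_flight, 0.002)
    print("batch size needed at 10 ms: %d bytes"
          % batch_size_for_target(target, in_flight, 0.010))
```

Under this model, a 5x increase in latency needs a 5x larger batch to keep the same ceiling, which is the compensation described above. Whether you see that in practice depends on what else is limiting throughput.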

-Ewen

On Mon, Jul 13, 2015 at 10:08 AM, JIEFU GONG <jg...@berkeley.edu> wrote:

> Hi all,
>
> I was wondering if any of you guys have done benchmarks on Kafka
> performance before, and if they or their details (# nodes in cluster, #
> records / size(s) of messages, etc.) could be shared.
>
> For comparison purposes, I am trying to benchmark Kafka against some
> similar services such as Kinesis or Scribe. Additionally, I was wondering
> if anyone could shed some insight on Jay Kreps' benchmarks that he has
> openly published here:
>
> https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines
>
> Specifically, I am unsure of why between his tests of 3x synchronous
> replication and 3x async replication he changed the batch.size, as well as
> why he is seemingly publishing to incorrect topics:
>
> Configs:
> https://gist.github.com/jkreps/c7ddb4041ef62a900e6c
>
> Any help is greatly appreciated!
>
>
>
> --
>
> Jiefu Gong
> University of California, Berkeley | Class of 2017
> B.A Computer Science | College of Letters and Sciences
>
> jg...@berkeley.edu <elise...@berkeley.edu> | (925) 400-3427
>



-- 
Thanks,
Ewen
