Good to hear that you got up to 500MB/s. At that point I suspect you reach a 
sort of steady state where the cache is continuously flushing to the SSDs, so 
you are effectively bottlenecked by the SSD. I believe this is as expected (the 
bottleneck resource will dominate the end-to-end throughput even if you have a 
memory buffer to temporarily hold produced records).
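
To make the steady-state argument concrete, here is a rough back-of-the-envelope
sketch in Python. The numbers are illustrative assumptions based on this thread
(4 producers offering about 200MB/s each, an SSD that writes about 500MB/s, and
a hypothetical 200GB of RAM used as write buffer); the function names are made
up for this example. It is an idealized model: a real OS page cache starts
writeback and throttles writers well before memory is full.

```python
def sustained_throughput(offered_mb_s, drain_mb_s):
    """Long-run throughput is capped by the slower of the two stages."""
    return min(offered_mb_s, drain_mb_s)


def seconds_until_buffer_full(buffer_mb, offered_mb_s, drain_mb_s):
    """Time a memory buffer can absorb the excess over the drain rate.

    Returns infinity when the drain keeps up with the offered load.
    """
    excess = offered_mb_s - drain_mb_s
    if excess <= 0:
        return float("inf")
    return buffer_mb / excess


# Assumed figures, not measurements:
offered = 4 * 200        # MB/s offered by 4 producers
ssd = 500                # MB/s the SSD can sustain
buffer_mb = 200 * 1000   # 200GB of RAM acting as a write buffer

print("sustained MB/s:", sustained_throughput(offered, ssd))
print("seconds of burst absorbed:",
      round(seconds_until_buffer_full(buffer_mb, offered, ssd)))
```

Under these assumptions the buffer only delays the point at which the SSD rate
takes over; it does not raise the sustained rate.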

If you want to be completely in-memory, that's OK for experimental purposes 
(e.g., using a RAM-based file system), but you'll then have to start worrying 
about failures if you are in production. However, if you are interested in just 
exploring perf for now, using an in-memory file system will be fine. You could 
even try having 3-way replication, all in memory (with 3 brokers). Again, a note 
of caution: a correlated power failure can lose all your replicated in-memory 
data, so in production even if you replicate in memory you'd still want to 
write somewhere persistent. Unless your memory is battery-backed, but that's a 
longer story.

Cheers
Eno

> On 11 Oct 2016, at 03:49, Christopher Stelly <cdste...@gmail.com> wrote:
> 
> With that link I came across the producer-perf-test tool, quite useful as
> it gets rid of the Go (Sarama) variable. Since it can quickly tweak
> settings, it's extremely useful.
> 
> As you suggested Eno, I attempted to copy the LinkedIn settings. With 100
> byte records, I get up to about 600,000 records/second. Still not quite
> what they were able to do with cheap hardware.
> 
> As far as throughput goes, I tend to max out at around 200MB/s on a single
> producer (record size 10000, no acks, linger 100, batch size 1000000 and
> with compression), along with a generous KAFKA_HEAP_OPTS env setting of 50G.
> 
> If I do the above settings with 4 producers, things start to slow down. It
> seems to add up to around 500MB/s, which is about what the SSD can write
> at.
> 
> Could this number be improved if I let the memory take care of this instead
> of flushing to disk? I understand that Kafka likes to flush often, but even
> relaxing broker's flush settings I can't seem to make an impact on this
> 500MB/s number (w/ 4 producers). After all, we have a lot of RAM to play
> with. The hackish solution would be to make a tmpfs mount and store
> kafka-logs there, but that seems like the wrong approach. Any thoughts? Do
> you think flushing is my hold up at this point?
> 
> Again, thanks!
> 
> On Mon, Oct 10, 2016 at 12:45 PM, Christopher Stelly <cdste...@gmail.com>
> wrote:
> 
>> Sure, good ideas. I'll try multiple producers, localhost and LAN, to see
>> if there is any difference.
>> 
>> Yep, Gwen, the Sarama client. Anything to worry about there outside of
>> setting the producer configs (which would you set?) and number of buffered
>> channels? (currently, buffered channels up to 10k).
>> 
>> Thanks!
>> 
>> On Mon, Oct 10, 2016 at 12:04 PM, Gwen Shapira <g...@confluent.io> wrote:
>> 
>>> Out of curiosity - what is "Golang's Kafka interface"? Are you
>>> referring to Sarama client?
>>> 
>>> On Sun, Oct 9, 2016 at 9:28 AM, Christopher Stelly <cdste...@gmail.com>
>>> wrote:
>>>> Hello,
>>>> 
>>>> The last thread available regarding 10GbE is about 2 years old, with no
>>>> obvious recommendations on tuning.
>>>> 
>>>> Is there a more complex tuning guide than the example production config
>>>> available on Kafka's main site? Anything other than the list of possible
>>>> configs?
>>>> 
>>>> I currently have access to a rather substantial academic cluster to test
>>>> on, including multiple machines with the following hardware:
>>>> 
>>>> 10GbE NICs
>>>> 250GB RAM each
>>>> SSDs on each
>>>> (also, optional access to single NVMe)
>>>> 
>>>> Using Golang's Kafka interface, I can only seem to get about 80MB/s on the
>>>> producer pushing to logs on the localhost, using no replication and reading
>>>> from/logging to SSD. If it helps, I can post my configs. I've tried
>>>> fiddling with a bunch of broker configs as well as producer configs,
>>>> raising the memory limits, max message size, io&network threads etc.
>>>> 
>>>> Since the last post from 2014 indicates that there is no public
>>>> benchmarking for 10GbE, I'd be happy to run benchmarks / publish results on
>>>> this hardware if we can get it tuned up properly.
>>>> 
>>>> What kind of broker/producer/consumer settings would you recommend?
>>>> 
>>>> Thanks!
>>>> - chris
>>> 
>>> 
>>> 
>>> --
>>> Gwen Shapira
>>> Product Manager | Confluent
>>> 650.450.2760 | @gwenshap
>>> Follow us: Twitter | blog
>>> 
>> 
>> 
