I suspect this is on Linux, right?

The way Linux works is that it buffers new writes in memory up to a certain
percentage of RAM; past that threshold it decides it has too much dirty data
and gives high priority to flushing it out. The good news is that the
resulting writes are very linear, well laid out, and high-throughput. The
problem is that it leads to a bit of see-saw behavior.
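
One way to see this in action is to watch the kernel's dirty-page counters
while the benchmark runs; a minimal sketch (the 1-second interval is just an
illustrative choice):

    # Watch how much dirty data has accumulated (Dirty) and how much
    # is actively being flushed to disk (Writeback), once per second:
    watch -n 1 "grep -E '^(Dirty|Writeback):' /proc/meminfo"

You should see Dirty climb steadily and then drop sharply each time
writeback kicks in, matching the see-saw in throughput.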

Now, the drop in performance isn't wrong per se. When your disk is writing
data out it is doing work, and read throughput will naturally be higher when
you are only reading than when you are reading and writing simultaneously.
So you can't expect the no-writing performance while you are also writing
(unless you add I/O capacity).

But still, these big see-saws in performance are not ideal. You'd rather
have steadier performance all the time than have Linux bounce back and forth
between writing nothing and frantically writing full bore. Fortunately Linux
provides a set of pagecache tuning parameters (the vm.dirty_* sysctls) that
let you control this a bit.

I think these docs cover some of the parameters:
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Performance_Tuning_Guide/s-memory-tunables.html
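
As a rough sketch, lowering the background writeback threshold and the hard
dirty limit makes the kernel start flushing earlier and in smaller bursts,
trading some peak write throughput for steadier behavior. The values below
are purely illustrative, not recommendations; they need tuning for your
hardware and workload:

    # Start background writeback once dirty pages reach 5% of RAM
    # (the default is often 10%), and throttle writers at 10%
    # (the default is often 20%). Illustrative values only:
    sysctl -w vm.dirty_background_ratio=5
    sysctl -w vm.dirty_ratio=10

To make the settings survive a reboot, put the same keys in /etc/sysctl.conf.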

-Jay

On Thu, Jul 20, 2017 at 10:24 AM, Ovidiu-Cristian MARCU <
ovidiu-cristian.ma...@inria.fr> wrote:

> Hi guys,
>
> I’m relatively new to Kafka’s world. I have an issue I describe below;
> maybe you can help me understand this behaviour.
>
> I’m running a benchmark using the following setup: one producer sends data
> to a topic and concurrently one consumer pulls and writes it to another
> topic.
> Measuring the consumer throughput, I observe values around 500K records/s
> only until the system’s cache gets filled - from that moment the consumer
> throughput drops to ~200K records/s (2.5 times lower).
> Looking at disk usage, I observe disk read I/O that corresponds to the
> moment the consumer throughput drops.
> After some time, I kill the producer, and immediately I observe the
> consumer throughput go back up to the initial values of ~500K records/s.
>
> What can I do to avoid this throughput drop?
>
> Attached is an image showing disk I/O and CPU usage. I have about 128GB of
> RAM on that server, which gets filled at time ~2300.
>
> Thanks,
> Ovidiu
>
>
