Hi Meg,
Thanks for the response.
The info about how consumer lag may impact mapped memory is extremely
interesting. I had read something similar about lag control elsewhere, but
never considered that it could be a problem here.
I control both producers and consumers for all of the topics, so I will
watch whether the problem occurs more often when some consumers are
lagging.
As @Avi suggested, it seems extremely likely that Kafka's use of mmap is the
cause of the safepoint delays.
Part of my problem is trying to prove it. I cannot see anything in the
stack traces of blocked threads (but sampling makes it unlikely to catch).
We are (probably unwisely) using ext4 + RAID, which might amplify any mmap
problems.
We are running with 64G of memory, with 4G allocated to the Kafka heap.
It looks like most of the memory has been consumed as cache (buff/cache),
presumably for Kafka's log files.
              total        used        free      shared  buff/cache   available
Mem:       65849884     5526480      395740        8644    59927664    59640260
Swap:       1999868           0     1999868
%Cpu(s):  6.8 us,  0.4 sy,  0.0 ni, 91.6 id,  1.1 wa,  0.0 hi,  0.1 si,  0.0 st
KiB Mem : 65849884 total,   351272 free,  5523400 used, 59975212 buff/cache
KiB Swap:  1999868 total,  1999868 free,        0 used. 59642964 avail Mem
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
12805 kafka     20   0 61.843g 4.523g  59444 S 312.5  7.2   3704:57 java
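
To put a number on "most of the memory", here is a minimal Python sketch that works out the used and buff/cache shares from the `free` figures above. The name `memory_shares` is only illustrative, and the parsing assumes the column order shown in that output (total, used, free, shared, buff/cache, available).

```python
# The `free` output quoted above, in KiB.
FREE_OUTPUT = """\
              total        used        free      shared  buff/cache   available
Mem:       65849884     5526480      395740        8644    59927664    59640260
Swap:       1999868           0     1999868
"""


def memory_shares(free_output):
    """Return 'used' and 'buff/cache' as fractions of total RAM.

    Assumes the column order of the output above:
    total, used, free, shared, buff/cache, available.
    """
    for line in free_output.splitlines():
        if line.startswith("Mem:"):
            fields = line.split()
            total = int(fields[1])
            used = int(fields[2])
            buff_cache = int(fields[5])
            return {"used": used / total, "buff_cache": buff_cache / total}
    raise ValueError("no 'Mem:' line found in free output")


share = memory_shares(FREE_OUTPUT)
print(f"used: {share['used']:.1%}, buff/cache: {share['buff_cache']:.1%}")
```

With these figures, buff/cache comes to about 91% of RAM and "used" to about 8%. That fits the 4.5G resident size that `top` reports for the java process, but on its own it does not show that mmap is behind the safepoint delays.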
Thanks,
Ross
--
You received this message because you are subscribed to the Google Groups
"mechanical-sympathy" group.