Hi Meg,

Thanks for the response.

The info about how consumer lag may impact mapped memory is extremely 
interesting.  I had read something similar about lag control elsewhere, but 
never considered that it could be a problem here.
I control both producers and consumers for all of the topics, so I will 
watch whether the problem occurs more often when some consumers are 
lagging.

As @Avi suggested, it seems extremely likely that Kafka's use of mmap is the 
cause of the safepoint delays.

Part of my problem is trying to prove it. I cannot see anything in the 
stack traces of blocked threads (but sampling makes it unlikely to catch).
We are (probably unwisely) using ext4 + RAID, which might amplify any mmap 
problems.

We are running with 64G of memory, with 4G allocated to the Kafka heap.
It looks like most of the rest of the memory has gone to the page cache, 
presumably for Kafka's log files.

              total        used        free      shared  buff/cache   available
Mem:       65849884     5526480      395740        8644    59927664    59640260
Swap:       1999868           0     1999868

%Cpu(s):  6.8 us,  0.4 sy,  0.0 ni, 91.6 id,  1.1 wa,  0.0 hi,  0.1 si,  0.0 st
KiB Mem : 65849884 total,   351272 free,  5523400 used, 59975212 buff/cache
KiB Swap:  1999868 total,  1999868 free,        0 used. 59642964 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
12805 kafka     20   0 61.843g 4.523g  59444 S 312.5  7.2   3704:57 java
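
As a rough sanity check on the cache observation, here is a minimal sketch 
of the arithmetic, using the figures copied from the free output above. The 
variable names and the share() helper are only mine for illustration; nothing 
here comes from Kafka or the JVM.

```python
# Figures copied by hand from the `free` output above, in KiB.
total_kib = 65849884
used_kib = 5526480
buff_cache_kib = 59927664


def share(part_kib, whole_kib):
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100.0 * part_kib / whole_kib, 1)


# Fraction of physical memory held as buffers/page cache versus "used".
cache_pct = share(buff_cache_kib, total_kib)
used_pct = share(used_kib, total_kib)

print(f"buff/cache: {cache_pct}% of RAM, used: {used_pct}% of RAM")
```

The point is only that buff/cache dwarfs the "used" figure, which is in the 
same range as the 4.523g RES that top reports for the Kafka process.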


Thanks,
Ross
