Todd Lipcon created IMPALA-7252:
-----------------------------------

             Summary: Backport rate limiting of fadvise calls into toolchain glog
                 Key: IMPALA-7252
                 URL: https://issues.apache.org/jira/browse/IMPALA-7252
             Project: IMPALA
          Issue Type: Improvement
          Components: Backend
    Affects Versions: Impala 3.0
            Reporter: Todd Lipcon


Currently, glog's default behavior is to call fadvise(FADV_DONTNEED) on the log 
file after each entry that is written. In many versions of the Linux kernel, 
each invocation of this call causes work to be scheduled on all other CPUs, 
causing up to one context switch per CPU for every log line. We saw this cause 
an extremely long GC pause in catalogd when the native side of the catalog was 
logging many messages about publishing metadata updates while the Java side was 
running a GC. The GC spent almost all of its time in the kernel, because the 
high context switch rate caused many TLB flushes and misses. Instead of pausing 
the JVM for a couple of seconds, the GC took several minutes.

This was identified and fixed upstream in glog here: 
https://github.com/google/glog/commit/dacd29679633c9b845708e7015bd2c79367a6ea2

We should backport this fix into the version in the toolchain.
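
To illustrate the general idea of rate limiting these calls, here is a minimal 
sketch. It is not the upstream glog patch: the names FadviseRateLimiter, 
DropRange and AdviseDrop, and the 1 MiB / 2 MiB thresholds used in the example 
below, are assumptions made for illustration. The idea is to track how much of 
the file has already been advised and to issue a new advise call only once a 
large enough range has accumulated, instead of once per log line.

```cpp
#include <cassert>
#include <cstdint>
#if defined(__linux__)
#include <fcntl.h>
#include <sys/types.h>
#endif

// Hypothetical helper types, not part of glog.
// A byte range of the log file whose cached pages can be dropped.
struct DropRange {
  std::uint64_t offset;
  std::uint64_t length;
};

// Decides when a log writer should advise the kernel to drop cached pages.
class FadviseRateLimiter {
 public:
  // keep_tail_bytes: most recent bytes that are never dropped, so a reader
  //                  tailing the log still hits the page cache.
  // min_drop_bytes:  smallest range worth the cost of one advise call.
  FadviseRateLimiter(std::uint64_t keep_tail_bytes,
                     std::uint64_t min_drop_bytes)
      : keep_tail_(keep_tail_bytes),
        min_drop_(min_drop_bytes),
        file_length_(0),
        dropped_(0) {
    assert(min_drop_bytes > 0);
  }

  // Records that `bytes` were appended to the file. Returns true and fills
  // `out` only when at least min_drop_bytes of not-yet-advised data lies
  // before the protected tail; otherwise returns false and makes no call.
  bool OnWrite(std::uint64_t bytes, DropRange* out) {
    file_length_ += bytes;
    if (file_length_ <= keep_tail_) return false;
    const std::uint64_t droppable_end = file_length_ - keep_tail_;
    const std::uint64_t pending = droppable_end - dropped_;
    if (pending < min_drop_) return false;
    out->offset = dropped_;
    out->length = pending;
    dropped_ = droppable_end;
    return true;
  }

 private:
  std::uint64_t keep_tail_;
  std::uint64_t min_drop_;
  std::uint64_t file_length_;  // total bytes written so far
  std::uint64_t dropped_;      // bytes already covered by earlier advise calls
};

#if defined(__linux__)
// Issues the actual advise call for one range. Returns 0 on success or an
// error number, as posix_fadvise does.
inline int AdviseDrop(int fd, const DropRange& r) {
  return posix_fadvise(fd, static_cast<off_t>(r.offset),
                       static_cast<off_t>(r.length), POSIX_FADV_DONTNEED);
}
#endif
```

With a 1 MiB protected tail and a 2 MiB minimum range, a writer calling 
OnWrite() after every log line would issue one advise call per 2 MiB of log 
output rather than one per line.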


