[ https://issues.apache.org/jira/browse/IMPALA-7252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Todd Lipcon reassigned IMPALA-7252:
-----------------------------------
Assignee: Todd Lipcon
> Backport rate limiting of fadvise calls into toolchain glog
> -----------------------------------------------------------
>
> Key: IMPALA-7252
> URL: https://issues.apache.org/jira/browse/IMPALA-7252
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend
> Affects Versions: Impala 3.0
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
> Priority: Major
>
> Currently, glog's default behavior is to call fadvise(FADV_DONTNEED) on the
> log file after each entry that is written. In many versions of the Linux
> kernel, each invocation of this call causes work to be scheduled on all other
> CPUs, causing up to one context switch per CPU for every log line. We saw
> this cause an extremely long GC pause in the catalogd in the case where the
> native side of the catalog was logging a lot of messages about publishing
> metadata updates at the same time that the Java side was running a GC. The GC
> spent almost all of its time in the kernel because the high context-switch
> rate caused many TLB flushes and misses; instead of pausing the JVM for a
> couple of seconds, it paused it for several minutes.
> This was identified and fixed upstream in glog here:
> https://github.com/google/glog/commit/dacd29679633c9b845708e7015bd2c79367a6ea2
> We should backport this fix into the version in the toolchain.
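
For illustration only, the general idea of rate limiting the fadvise calls described above can be sketched as follows. This is not the upstream glog patch: the class name FadviseRateLimiter, its methods, and the byte threshold are hypothetical, and the sketch assumes a Linux/POSIX system where posix_fadvise() is available. It issues the call once per batch of newly written bytes rather than once per log entry.

```cpp
#include <cassert>
#include <cstdint>
#include <fcntl.h>      // posix_fadvise, POSIX_FADV_DONTNEED
#include <sys/types.h>  // off_t

// Hypothetical helper (not glog's actual code): batches page-cache drops so
// that posix_fadvise() is called once per `min_drop_bytes` of new log data
// instead of after every log entry.
class FadviseRateLimiter {
 public:
  explicit FadviseRateLimiter(uint64_t min_drop_bytes)
      : min_drop_bytes_(min_drop_bytes), dropped_offset_(0) {
    assert(min_drop_bytes > 0);
  }

  // Call after each log write with the current file length.
  // Returns true if a fadvise call was issued for this write.
  bool OnWrite(int fd, uint64_t file_length) {
    // If the file shrank (e.g. it was rotated), start over from offset 0.
    if (file_length < dropped_offset_) dropped_offset_ = 0;

    const uint64_t pending = file_length - dropped_offset_;
    if (pending < min_drop_bytes_) return false;  // not enough new data yet

    // Advise the kernel that the already-written range is no longer needed.
    // The result is ignored: this is only a hint, and failure is harmless.
    (void)posix_fadvise(fd, static_cast<off_t>(dropped_offset_),
                        static_cast<off_t>(pending), POSIX_FADV_DONTNEED);
    dropped_offset_ = file_length;
    return true;
  }

  uint64_t dropped_offset() const { return dropped_offset_; }

 private:
  const uint64_t min_drop_bytes_;  // threshold between fadvise calls
  uint64_t dropped_offset_;        // bytes already advised as DONTNEED
};
```

With a threshold of a few megabytes, a burst of short log lines would trigger a handful of fadvise calls rather than one per line, which is the effect the backport is meant to achieve.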