[
https://issues.apache.org/jira/browse/CASSANDRA-12699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ariel Weisberg updated CASSANDRA-12699:
---------------------------------------
Attachment: (was: cassandraMemoryLog.sh)
> Excessive use of "hidden" Linux page table memory
> -------------------------------------------------
>
> Key: CASSANDRA-12699
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12699
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Environment: Cassandra 2.2.7 on Red Hat 6.7, with Java 1.8.0_73.
> Probably others.
> Reporter: Heiko Sommer
> Attachments: PageTableMemoryExample.png, cassandra-env.sh,
> cassandra.yaml, cassandraMemoryLog.sh
>
>
> The Cassandra JVM process uses many gigabytes of page table memory during
> certain activities, which can trigger the Linux oom-killer, accompanied by
> "java.lang.OutOfMemoryError: null" entries in the logs.
> Page table memory is not reported by Linux tools such as "top" or "ps" and
> may therefore also be responsible for other spurious Cassandra issues
> involving "memory eating" or crashes, e.g. CASSANDRA-8723.
> The problem occurs mainly (or perhaps only) during large compactions and
> anticompactions.
> Eventually all memory is released, so there is no real leak. Still, I
> suspect that the memory mappings that fill the page table could be released
> much sooner, keeping the page table size at a small fraction of the total
> Cassandra process memory.
> How to reproduce: Record the memory use on a Cassandra node, including page
> table memory, for example using the attached script cassandraMemoryLog.sh.
> Even when there is no crash, the ramping up and sudden release of page table
> memory is visible.
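As a rough sketch of how such a recording could be done (this is not the
attached cassandraMemoryLog.sh, whose contents are not shown here), the
following Python reads the VmPTE line and a few related fields from
/proc/<pid>/status. The function names and the choice of fields are
assumptions made for illustration only.

```python
import re
from pathlib import Path

# Fields to extract from /proc/<pid>/status; VmPTE is the page table size.
# The selection of fields is an illustrative assumption.
FIELDS = ("VmSize", "VmRSS", "VmPTE", "VmSwap")


def parse_status(text):
    """Return {field: value_in_kB} for the FIELDS present in status text."""
    values = {}
    for line in text.splitlines():
        match = re.match(r"(\w+):\s+(\d+)\s+kB", line)
        if match and match.group(1) in FIELDS:
            values[match.group(1)] = int(match.group(2))
    return values


def read_status(pid):
    """Read and parse /proc/<pid>/status for the given process (Linux only)."""
    return parse_status(Path(f"/proc/{pid}/status").read_text())
```

Calling read_status() for the Cassandra PID at a fixed interval (for example
once per minute with time.sleep) and appending the result to a log file
should be enough to see the ramp-up and sudden release of VmPTE.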
> A stacked area plot for the memory on one of our crashed nodes is attached
> (PageTableMemoryExample.png). The page table memory used by Cassandra is
> shown in red ("VmPTE").
> (In the plot we also see that the sum of the measured memory portions
> sometimes exceeds the total memory. This is probably an artifact of how RSS
> memory is measured, perhaps including some buffers/cache memory that also
> counts toward available memory. It does not invalidate the finding that page
> table memory is growing to enormous sizes.)
> Shortly before the crash, /proc/$PID/status reported:
> VmPeak: 6989760944 kB
> VmSize: 5742400572 kB
> VmLck: 4735036 kB
> VmHWM: 8589972 kB
> VmRSS: 7022036 kB
> VmData: 10019732 kB
> VmStk: 92 kB
> VmExe: 4 kB
> VmLib: 17584 kB
> VmPTE: 3965856 kB
> VmSwap: 0 kB
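To put the VmPTE figure in perspective, here is a back-of-the-envelope
calculation in Python. It assumes x86-64 with 4 KiB base pages and 8-byte
page table entries, and treats VmPTE as last-level page table pages only;
none of this is stated in the report, so the result is only an estimate. It
is also an upper bound on touched memory, since page table pages are
allocated whole.

```python
PAGE_SIZE = 4096  # bytes per page (assumption: 4 KiB base pages)
PTE_SIZE = 8      # bytes per page table entry (assumption: x86-64)


def covered_kb(vm_pte_kb):
    """Address space in kB that vm_pte_kb of last-level page tables can map."""
    entries = vm_pte_kb * 1024 // PTE_SIZE
    return entries * PAGE_SIZE // 1024


def pte_share_of_rss(vm_pte_kb, vm_rss_kb):
    """Page table size expressed as a fraction of resident memory."""
    return vm_pte_kb / vm_rss_kb


# Values taken from the /proc/$PID/status output quoted above.
VM_PTE_KB = 3965856
VM_RSS_KB = 7022036

mapped_tib = covered_kb(VM_PTE_KB) / 1024**3
share = pte_share_of_rss(VM_PTE_KB, VM_RSS_KB)
```

Under these assumptions, the reported VmPTE corresponds to roughly 1.9 TiB of
mapped address space, and the page tables alone are more than half the size
of VmRSS.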
> The files cassandra.yaml and cassandra-env.sh used on the node where the data
> was taken are attached.
> Please let me know if I should provide any other data or descriptions to help
> with this ticket.
> Known workarounds: Use more RAM, or limit the amount of Java heap memory. In
> the above crash, MAX_HEAP_SIZE was not set, so the default heap size for
> 12 GB of RAM was used (-Xms2976M, -Xmx2976M).
> We have not yet tested whether variations of heap vs. offheap config choices
> make a difference.
> Perhaps there are other workarounds using -XX:+UseLargePages or related
> Linux settings to reduce the size of the process page table?
> I believe that we see these crashes more often than other projects because
> our test system has little RAM but a lot of data (~3 TB compressed per
> node), and its CPUs are slow, so that anti-/compactions overlap a lot.
> Ideally Cassandra (native) code should be changed to release memory in
> smaller chunks, so that page table size cannot cause an otherwise stable
> system to crash.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)