[
https://issues.apache.org/jira/browse/CASSANDRA-6191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jonathan Ellis updated CASSANDRA-6191:
--------------------------------------
Priority: Minor (was: Major)
Issue Type: Improvement (was: Bug)
Summary: Add a warning for small sstable size setting in LCS (was:
Memory exhaustion with large number of compressed SSTables)
> Add a warning for small sstable size setting in LCS
> ---------------------------------------------------
>
> Key: CASSANDRA-6191
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6191
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Environment: OS: Debian 7.1
> Java: Oracle 1.7.0_25
> Cassandra: 1.2.10
> Memory: 24GB
> Heap: 8GB
> Reporter: Tomas Salfischberger
> Assignee: Jonathan Ellis
> Priority: Minor
> Fix For: 1.2.12, 2.0.2
>
> Attachments: 6191.txt
>
>
> Not sure "bug" is the right description, because I can't say for sure that
> the large number of SSTables is the cause of the memory issues. I'll share my
> research so far:
> Under high read load with a very large number of compressed SSTables (caused
> by the initial default 5 MB sstable_size in LCS) memory seems to be
> exhausted, with no room left for GC to recover. The JVM attempts GC but
> reclaims very little. The node first hits the "emergency valves": flushing
> all memtables, then reducing caches. Finally it logs 0.99+ heap usage and
> either hangs with GC failures or crashes with an OutOfMemoryError.
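> As a rough illustration of the scale involved (the per-node data volume in
> this sketch is my assumption, not a measured figure):
> {code:java}
> // Back-of-envelope only: assumes ~200 GB of data on the node.
> public class SSTableCountEstimate
> {
>     public static void main(String[] args)
>     {
>         long dataPerNode = 200L * 1024 * 1024 * 1024; // assumed 200 GB
>         long sstableSize = 5L * 1024 * 1024;          // old LCS default: 5 MB
>         System.out.println(dataPerNode / sstableSize + " sstables");
>         // ~40960 sstables -- the same order of magnitude as the reader
>         // count observed in the heap dump below.
>     }
> }
> {code}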
> I've taken a heap dump and started analysis to find out what's wrong. The
> memory seems to be used by the byte[] arrays backing the HeapByteBuffer in
> the "compressed" field of
> org.apache.cassandra.io.compress.CompressedRandomAccessReader. The byte[]
> arrays are generally 65536 bytes in size, matching the block size of the
> compression. Looking further in the heap dump I can see that these readers
> are part of the pool in
> org.apache.cassandra.io.util.CompressedPoolingSegmentedFile, which is linked
> to the "dfile" field of org.apache.cassandra.io.sstable.SSTableReader. The
> dump file lists 45248 instances of CompressedRandomAccessReader.
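> Multiplying that out gives a rough lower bound on the retained heap (rough
> because pools can hold more than one reader per file and some buffers were
> enlarged):
> {code:java}
> // Rough lower bound on heap retained by pooled decompression buffers.
> public class PooledBufferFootprint
> {
>     public static void main(String[] args)
>     {
>         long readers = 45_248;     // CompressedRandomAccessReader instances
>         long bufferBytes = 65_536; // compression block size
>         double gib = readers * bufferBytes / (1024.0 * 1024 * 1024);
>         System.out.printf("%.2f GiB%n", gib);
>         // ~2.76 GiB of byte[] alone on an 8 GB heap, before counting the
>         // enlarged buffers, bloom filters, index summaries, etc.
>     }
> }
> {code}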
> Is this working as intended? Is there a leak somewhere? Or should there be
> an alternative strategy and/or a warning for cases where a node is trying
> to read far too many SSTables?
> EDIT:
> Searching through the code I found that PoolingSegmentedFile keeps a pool
> of RandomAccessReader instances for re-use, while
> CompressedRandomAccessReader allocates a ByteBuffer in its constructor and
> (to make things worse) enlarges it when reading a large chunk. This
> (sometimes enlarged) ByteBuffer is then kept alive because it is part of
> the CompressedRandomAccessReader, which is in turn kept alive as part of
> the pool in the PoolingSegmentedFile.
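> To make the retention chain concrete, here is a minimal sketch of the
> pattern (simplified, hypothetical names, not the actual Cassandra classes):
> {code:java}
> import java.nio.ByteBuffer;
> import java.util.Queue;
> import java.util.concurrent.ConcurrentLinkedQueue;
>
> // Stand-in for CompressedRandomAccessReader.
> class CompressedReader
> {
>     // Allocated up-front in the constructor, one per reader.
>     private ByteBuffer compressed = ByteBuffer.allocate(65536);
>
>     void read(int chunkLength)
>     {
>         // "To make things worse": grown on demand, never shrunk.
>         if (chunkLength > compressed.capacity())
>             compressed = ByteBuffer.allocate(chunkLength);
>         // ... decompress the chunk into the buffer ...
>     }
> }
>
> // Stand-in for PoolingSegmentedFile.
> class PooledSegmentedFile
> {
>     // Recycled readers -- and their buffers -- stay reachable for as
>     // long as the owning SSTableReader ("dfile") does.
>     private final Queue<CompressedReader> pool = new ConcurrentLinkedQueue<>();
>
>     CompressedReader getReader()
>     {
>         CompressedReader r = pool.poll();
>         return r != null ? r : new CompressedReader();
>     }
>
>     void recycle(CompressedReader r)
>     {
>         pool.add(r); // the (possibly enlarged) buffer is retained here
>     }
> }
> {code}
> With one such pool per SSTable, tens of thousands of SSTables mean tens of
> thousands of resident buffers even when no reads are in flight.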
--
This message was sent by Atlassian JIRA
(v6.1#6144)