[jira] [Commented] (CASSANDRA-15006) Possible java.nio.DirectByteBuffer leak

JIRA Thu, 14 Feb 2019 07:13:43 -0800


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-15006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16768386#comment-16768386
 ]


Jonas Borgström commented on CASSANDRA-15006:
---------------------------------------------

OK, I just tested truncating the largest table. This lowered the load on each 
node from about 6GiB to 0.5GiB.

I've attached a new screenshot showing that this dramatically reduced the size 
of both the direct and mapped java.nio allocations.

It looks like long-lived SSTables can "accumulate" more and more 
DirectByteBuffer allocations over time (in addition to their chunk cache 
usage). These "accumulating" allocations are not freed until the corresponding 
SSTable file is unloaded (table truncation, compaction, etc.).

Am I missing something?
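
For anyone who wants to watch the same counters without a Grafana setup: the "direct" and "mapped" figures are exposed by the JVM's standard buffer pool MXBeans. Below is a minimal sketch of reading them in-process. The class name BufferPoolReport and the 1 MiB test allocation are made up for illustration and are not part of Cassandra.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Cassandra: reads the JVM buffer pool MXBeans.
class BufferPoolReport {

    /** Returns the bytes currently used by each buffer pool, keyed by pool name. */
    static Map<String, Long> snapshot() {
        Map<String, Long> used = new LinkedHashMap<>();
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            used.put(pool.getName(), pool.getMemoryUsed());
        }
        return used;
    }

    public static void main(String[] args) {
        // Illustrative allocation so the "direct" pool is non-empty.
        ByteBuffer held = ByteBuffer.allocateDirect(1 << 20);

        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            System.out.printf("%s: count=%d used=%d bytes capacity=%d bytes%n",
                    pool.getName(), pool.getCount(),
                    pool.getMemoryUsed(), pool.getTotalCapacity());
        }

        // Touch the buffer after reporting so it stays reachable until here.
        System.out.println("held " + held.capacity() + " bytes");
    }
}
```

Sampling snapshot() at regular intervals and comparing the "direct" value before and after a truncation or compaction should show whether the allocations are released together with the SSTable.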

 

 

> Possible java.nio.DirectByteBuffer leak
> ---------------------------------------
>
>                 Key: CASSANDRA-15006
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15006
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: cassandra: 3.11.3
> jre: openjdk version "1.8.0_181"
> heap size: 2GB
> memory limit: 3GB (cgroup)
> I started one of the nodes with "-Djdk.nio.maxCachedBufferSize=262144" but 
> that did not seem to make any difference.
>            Reporter: Jonas Borgström
>            Priority: Major
>         Attachments: CASSANDRA-15006-reference-chains.png, 
> Screenshot_2019-02-04 Grafana - Cassandra.png, Screenshot_2019-02-14 Grafana 
> - Cassandra(1).png, Screenshot_2019-02-14 Grafana - Cassandra.png
>
>
> While testing a 3-node 3.11.3 cluster I noticed that the nodes were suddenly 
> killed by the Linux OOM killer after running without issues for 4-5 weeks.
> After enabling more metrics and leaving the nodes running for 12 days, the 
> "java.nio:type=BufferPool,name=direct" MBean shows close to linear growth 
> (approx. 15MiB/24h, see attached screenshot). Is this expected to keep growing 
> linearly after 12 days with a constant load?
>  
> In my setup the growth/leak is about 15MiB/day, so I guess in most setups it 
> would take quite a few days until it becomes noticeable. I see the same kind 
> of slow growth in other production clusters, although their graph data is 
> noisier.
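
To put the reported growth rate in perspective, here is a rough back-of-the-envelope sketch. It assumes the growth stays linear at 15 MiB/day and that "2GB" and "3GB" mean 2048 and 3072 MiB. The class and method names are hypothetical. The 1024 MiB headroom is only an upper bound, since metaspace, thread stacks and other off-heap usage share the same cgroup limit.

```java
// Hypothetical estimate based on the numbers in the issue description.
class DirectGrowthEstimate {

    /** Linear projection: total growth in MiB after the given number of days. */
    static long projectedGrowthMiB(long mibPerDay, long days) {
        return mibPerDay * days;
    }

    /** Whole days until the given headroom (MiB) is consumed at a constant rate. */
    static long daysToFill(long headroomMiB, long mibPerDay) {
        return headroomMiB / mibPerDay;
    }

    public static void main(String[] args) {
        long ratePerDay = 15;             // MiB/day, from the issue description
        long headroomMiB = 3072 - 2048;   // cgroup limit minus heap size (assumed GiB)

        System.out.println(projectedGrowthMiB(ratePerDay, 28)); // prints 420
        System.out.println(projectedGrowthMiB(ratePerDay, 35)); // prints 525
        System.out.println(daysToFill(headroomMiB, ratePerDay)); // prints 68
    }
}
```

Under these assumptions, 4-5 weeks of growth accounts for roughly 420-525 MiB, which is a large share of the at most 1024 MiB left outside the heap and is in line with the OOM kills showing up after that time.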



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

