[ 
https://issues.apache.org/jira/browse/CASSANDRA-15006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16768374#comment-16768374
 ] 

Benedict edited comment on CASSANDRA-15006 at 2/14/19 3:09 PM:
---------------------------------------------------------------

Hi [~jborgstrom],

{{DirectByteBufferR}} is simply the read-only variant of a direct byte buffer; 
it may or may not be memory-mapped.  Unfortunately, given the conflation of 
terms around {{BufferPool}}, it is hard to understand what your graphs mean.  
Could you define them explicitly for me?  What does each graph title map to 
directly, and how is it produced?
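
For illustration only, here is a minimal sketch of how a {{DirectByteBufferR}} can come into existence with plain JDK calls. The class and method names ({{ReadOnlyDirectBufferDemo}}, {{readOnlyView}}) are hypothetical and not taken from Cassandra. The read-only buffer is a view over the same native memory as the original, so seeing many such instances in a heap dump does not by itself imply additional off-heap allocation.

```java
import java.nio.ByteBuffer;

// Hypothetical demo class; not part of Cassandra.
final class ReadOnlyDirectBufferDemo {

    // Returns a read-only view that shares the source buffer's memory.
    static ByteBuffer readOnlyView(ByteBuffer source) {
        return source.asReadOnlyBuffer();
    }

    public static void main(String[] args) {
        ByteBuffer direct = ByteBuffer.allocateDirect(16);
        ByteBuffer readOnly = readOnlyView(direct);

        // Print the concrete classes the JDK chose for each buffer.
        System.out.println(direct.getClass().getName());
        System.out.println(readOnly.getClass().getName());

        // Writes through the original are visible through the view.
        direct.put(0, (byte) 42);
        System.out.println(readOnly.get(0));
        System.out.println(readOnly.isDirect() + " " + readOnly.isReadOnly());
    }
}
```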

I would not bother truncating any table.  Ideally, we would get a heap dump 
posted somewhere privately for us to download and analyse.

We also really need to understand the memory environment of the node; you 
indicate you have limited the process to 3GiB by cgroups, but we can see much 
more than this committed to the process.

Could you please also post a full log file from the node, so we can at least 
see what configuration settings it is starting with?  It may be that this is 
all completely acceptable.  There is still insufficient information to say for 
sure that there is a leak, rather than simply incremental growth within the 
defined configuration bounds.
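
As a rough sketch of what an unambiguous definition of the graphs could look like, the snippet below reads the JDK's own buffer pool statistics (the source of the "java.nio:type=BufferPool,name=direct" MBean mentioned in the report) from inside a JVM. The class name {{BufferPoolReport}} and the output format are assumptions for illustration. It inspects the JVM it runs in, not a remote Cassandra node, and it says nothing about Cassandra's internal {{BufferPool}} class.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper; reports the JDK's NIO buffer pools for the local JVM.
final class BufferPoolReport {

    // Formats one pool's statistics with explicit field names.
    static String describe(BufferPoolMXBean pool) {
        return String.format("%s: count=%d memoryUsed=%d totalCapacity=%d",
                pool.getName(),
                pool.getCount(),
                pool.getMemoryUsed(),
                pool.getTotalCapacity());
    }

    public static void main(String[] args) {
        List<BufferPoolMXBean> pools =
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class);
        for (BufferPoolMXBean pool : pools) {
            System.out.println(describe(pool));
        }
        // The configured heap ceiling, for comparison against off-heap usage.
        System.out.println("maxHeapBytes=" + Runtime.getRuntime().maxMemory());
    }
}
```

Labelling each graph with the exact attribute it plots (for example {{MemoryUsed}} versus {{TotalCapacity}} of the "direct" pool) would remove the ambiguity.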



> Possible java.nio.DirectByteBuffer leak
> ---------------------------------------
>
>                 Key: CASSANDRA-15006
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15006
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: cassandra: 3.11.3
> jre: openjdk version "1.8.0_181"
> heap size: 2GB
> memory limit: 3GB (cgroup)
> I started one of the nodes with "-Djdk.nio.maxCachedBufferSize=262144" but 
> that did not seem to make any difference.
>            Reporter: Jonas Borgström
>            Priority: Major
>         Attachments: CASSANDRA-15006-reference-chains.png, 
> Screenshot_2019-02-04 Grafana - Cassandra.png, Screenshot_2019-02-14 Grafana 
> - Cassandra(1).png, Screenshot_2019-02-14 Grafana - Cassandra.png
>
>
> While testing a 3 node 3.11.3 cluster I noticed that the nodes were suddenly 
> killed by the Linux OOM killer after running without issues for 4-5 weeks.
> After enabling more metrics and leaving the nodes running for 12 days, it 
> looks like the "java.nio:type=BufferPool,name=direct" MBean shows very 
> linear growth (approx 15MiB/24h, see attached screenshot). Is this expected 
> to keep growing linearly after 12 days with a constant load?
>  
> In my setup the growth/leak is about 15MiB/day so I guess in most setups it 
> would take quite a few days until it becomes noticeable. I'm able to see the 
> same type of slow growth in other production clusters even though the graph 
> data is more noisy.


