[ 
https://issues.apache.org/jira/browse/CASSANDRA-13133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15829391#comment-15829391
 ] 

Stefania commented on CASSANDRA-13133:
--------------------------------------

CASSANDRA-11594 fixed a leak of directory file descriptors that occurred when 
calculating the snapshot size. TL;DR: whenever the snapshot size was queried 
while a transaction (e.g. an sstable flush or a compaction) was in progress, 
a file descriptor was leaked. I'm 99% sure this is the same problem. 

[~nrushton], the fix is in 3.10, which is due for release soon. If you want to 
apply the patch manually, you can find it 
[here|https://github.com/stef1927/cassandra/commit/7b940acc00fe1a907a40bd6b7dca7f2ea80fdddc].
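
To check whether a node is still leaking before or after applying the patch, the lsof pipeline from the report can be wrapped in a small helper. This is only a sketch: the function name `count_dir_fds` is hypothetical, and it assumes the default lsof column layout (TYPE in the fifth field, NAME in the last field) and paths without spaces.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `lsof -p <pid>` output on stdin and prints
# "<directory> <open descriptor count>" for every directory whose path
# contains the given substring.
# Assumption: default lsof columns, i.e. field 5 is TYPE and the last
# field is NAME; paths containing spaces are not handled.
count_dir_fds() {
    local pattern="$1"   # non-empty substring of the path, e.g. a table name
    awk -v pat="$pattern" '
        # Keep only descriptors opened on directories that match the pattern.
        $5 == "DIR" && index($NF, pat) > 0 { counts[$NF]++ }
        END { for (path in counts) print path, counts[path] }
    '
}

# Usage against a live node (commented out so the sketch has no side effects):
# sudo lsof -p "$(pgrep -f CassandraDaemon)" | count_dir_fds "events_by"
```

If the count for a table directory keeps rising after each query of the SnapshotsSize metric, the node is most likely still affected.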

> Unclosed file descriptors when querying SnapshotsSize metric
> ------------------------------------------------------------
>
>                 Key: CASSANDRA-13133
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13133
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: CentOS 7
>            Reporter: Nicholas Rushton
>              Labels: jmx, lhf, metrics, newbie
>             Fix For: 3.9
>
>
> We started to notice many open file descriptors (100k+) per node, growing at 
> a rate of about 30 per minute in our cluster. After turning off our JMX 
> exporting server (https://github.com/prometheus/jmx_exporter), which gets 
> queried every 30 seconds, the number of file descriptors remained static. 
> Digging a bit further, I ran a JMX dump tool over all the Cassandra metrics 
> and tracked the number of file descriptors after each query, narrowing it 
> down to a single metric that causes the number of file descriptors to increase:
> org.apache.cassandra.metrics:keyspace=tpsv1,name=SnapshotsSize,scope=events_by_engagement_id,type=Table
> Running a query a few times against this metric shows the file descriptors 
> increasing after each query:
> {code}
> for _ in {0..3}
> do
>    java -jar jmx-dump-0.4.2-standalone.jar --port 7199 --dump org.apache.cassandra.metrics:keyspace=tpsv1,name=SnapshotsSize,scope=events_by_engagement_id,type=Table > /dev/null;
>    sudo lsof -p `pgrep -f CassandraDaemon` | fgrep "DIR" | awk '{a[$(NF)]+=1}END{for(k in a){print k, a[k]}}' | grep "events_by"
> done
> > /data/cassandra/data/tpsv1/events_by_engagement_id-01d8f450a54911e6917ec93f8a91ec71 33176
> > /data/cassandra/data/tpsv1/events_by_engagement_id-01d8f450a54911e6917ec93f8a91ec71 33177
> > /data/cassandra/data/tpsv1/events_by_engagement_id-01d8f450a54911e6917ec93f8a91ec71 33178
> > /data/cassandra/data/tpsv1/events_by_engagement_id-01d8f450a54911e6917ec93f8a91ec71 33179
> {code}
> Note that each leaked file descriptor is open on a directory, not on a 
> regular file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
