[
https://issues.apache.org/jira/browse/CASSANDRA-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827322#comment-13827322
]
graham sanderson commented on CASSANDRA-6275:
---------------------------------------------
Also note the stack trace below, which was the same for all leaked files we
saw. Someone can perhaps use it to help figure out what this actually affects
(i.e. some of the iterator's RARs may have been owned by someone else, in
which case this is fine).
{code}
ERROR [Finalizer] 2013-11-20 03:43:42,129 RandomAccessReader.java (line 399) LEAK finalizer had to clean up
java.lang.Exception: RAR for /data/5/cassandra/OpsCenter/rollups60/OpsCenter-rollups60-jb-6882-Data.db allocated
    at org.apache.cassandra.io.util.RandomAccessReader.<init>(RandomAccessReader.java:66)
    at org.apache.cassandra.io.compress.CompressedRandomAccessReader.<init>(CompressedRandomAccessReader.java:76)
    at org.apache.cassandra.io.compress.CompressedRandomAccessReader.open(CompressedRandomAccessReader.java:43)
    at org.apache.cassandra.io.util.CompressedPoolingSegmentedFile.createReader(CompressedPoolingSegmentedFile.java:48)
    at org.apache.cassandra.io.util.PoolingSegmentedFile.getSegment(PoolingSegmentedFile.java:39)
    at org.apache.cassandra.io.sstable.SSTableReader.getFileDataInput(SSTableReader.java:1182)
    at org.apache.cassandra.db.columniterator.IndexedSliceReader.setToRowStart(IndexedSliceReader.java:108)
    at org.apache.cassandra.db.columniterator.IndexedSliceReader.<init>(IndexedSliceReader.java:84)
    at org.apache.cassandra.db.columniterator.SSTableSliceIterator.createReader(SSTableSliceIterator.java:65)
    at org.apache.cassandra.db.columniterator.SSTableSliceIterator.<init>(SSTableSliceIterator.java:42)
    at org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:167)
    at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:62)
    at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:273)
    at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:53)
    at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1467)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1286)
    at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:332)
    at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:65)
    at org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:47)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
{code}
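To see which data files the finalizer had to clean up, and how often, the LEAK entries can be tallied per file. The following is only a rough sketch: the function name leaked_files is made up for illustration, the log location in the usage comment is an assumption, and it assumes each "RAR for <path> allocated" message sits on a single line in the real log (unlike the wrapped copy in this email).

```shell
# Count "RAR for <path> allocated" messages per data file.
# Reads log text on stdin and prints "<count> <path>", most frequent first.
# Assumption: the path and the word "allocated" are on the same log line.
leaked_files() {
    sed -n 's/.*RAR for \(.*\) allocated.*/\1/p' |
        awk '{ count[$0]++ } END { for (p in count) print count[p], p }' |
        sort -rn
}

# Example (the log path is an assumption; adjust it to your install):
# leaked_files < /var/log/cassandra/system.log
```

A file that shows up many times here is a reasonable first place to look when matching leaks against read paths such as the one in the trace above.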
> 2.0.x leaks file handles
> ------------------------
>
> Key: CASSANDRA-6275
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6275
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Environment: java version "1.7.0_25"
> Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
> Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)
> Linux cassandra-test1 2.6.32-279.el6.x86_64 #1 SMP Thu Jun 21 15:00:18 EDT
> 2012 x86_64 x86_64 x86_64 GNU/Linux
> Reporter: Mikhail Mazursky
> Assignee: Michael Shuler
> Attachments: c_file-descriptors_strace.tbz, cassandra_jstack.txt,
> leak.log, position_hints.tgz, slog.gz
>
>
> Looks like C* is leaking file descriptors when doing lots of CAS operations.
> {noformat}
> $ sudo cat /proc/15455/limits
> Limit                     Soft Limit           Hard Limit           Units
> Max cpu time              unlimited            unlimited            seconds
> Max file size             unlimited            unlimited            bytes
> Max data size             unlimited            unlimited            bytes
> Max stack size            10485760             unlimited            bytes
> Max core file size        0                    0                    bytes
> Max resident set          unlimited            unlimited            bytes
> Max processes             1024                 unlimited            processes
> Max open files            4096                 4096                 files
> Max locked memory         unlimited            unlimited            bytes
> Max address space         unlimited            unlimited            bytes
> Max file locks            unlimited            unlimited            locks
> Max pending signals       14633                14633                signals
> Max msgqueue size         819200               819200               bytes
> Max nice priority         0                    0
> Max realtime priority     0                    0
> Max realtime timeout      unlimited            unlimited            us
> {noformat}
> So the problem does not appear to be the limits themselves.
> Before load test:
> {noformat}
> cassandra-test0 ~]$ lsof -n | grep java | wc -l
> 166
> cassandra-test1 ~]$ lsof -n | grep java | wc -l
> 164
> cassandra-test2 ~]$ lsof -n | grep java | wc -l
> 180
> {noformat}
> After load test:
> {noformat}
> cassandra-test0 ~]$ lsof -n | grep java | wc -l
> 967
> cassandra-test1 ~]$ lsof -n | grep java | wc -l
> 1766
> cassandra-test2 ~]$ lsof -n | grep java | wc -l
> 2578
> {noformat}
> Most opened files have names like:
> {noformat}
> java 16890 cassandra 1636r REG 202,17 88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
> java 16890 cassandra 1637r REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
> java 16890 cassandra 1638r REG 202,17 88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
> java 16890 cassandra 1639r REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
> java 16890 cassandra 1640r REG 202,17 88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
> java 16890 cassandra 1641r REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
> java 16890 cassandra 1642r REG 202,17 88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
> java 16890 cassandra 1643r REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
> java 16890 cassandra 1644r REG 202,17 88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
> java 16890 cassandra 1645r REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
> java 16890 cassandra 1646r REG 202,17 88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
> java 16890 cassandra 1647r REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
> java 16890 cassandra 1648r REG 202,17 88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
> java 16890 cassandra 1649r REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
> java 16890 cassandra 1650r REG 202,17 88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
> java 16890 cassandra 1651r REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
> java 16890 cassandra 1652r REG 202,17 88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
> java 16890 cassandra 1653r REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
> java 16890 cassandra 1654r REG 202,17 88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
> java 16890 cassandra 1655r REG 202,17 161158485 655420 /var/lib/cassandra/data/system/paxos/system-paxos-jb-255-Data.db
> java 16890 cassandra 1656r REG 202,17 88724987 655520 /var/lib/cassandra/data/system/paxos/system-paxos-jb-644-Data.db
> {noformat}
> Also, when that happens it's not always possible to shut down the server
> process via SIGTERM; SIGKILL has to be used instead.
> P.S. See the mailing list thread for more context:
> https://www.mail-archive.com/[email protected]/msg33035.html
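The lsof listing in the description shows the same two paxos data files held open over and over. A per-path tally makes that kind of repetition easier to spot than a raw line count. This is a minimal sketch, not something taken from the ticket: the function name fds_by_path is hypothetical, and it assumes the default lsof column layout (FD as the fourth field, NAME as the last) and paths without spaces.

```shell
# Summarise lsof output (on stdin) as "<count> <path>", highest count first.
# Only rows whose FD column starts with a digit are counted, which skips the
# header row and non-descriptor entries such as cwd, txt and mem.
# Assumptions: default lsof columns (FD is field 4, NAME is last), no spaces
# in file names.
fds_by_path() {
    awk '$4 ~ /^[0-9]/ { count[$NF]++ } END { for (p in count) print count[p], p }' |
        sort -rn
}

# Example, using the PID that appears in the listing above:
# lsof -n -p 16890 | fds_by_path | head
```

Note that this counts only numbered descriptors, so its totals will generally be lower than those from `lsof -n | grep java | wc -l`, which also counts memory-mapped files and similar entries.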
--
This message was sent by Atlassian JIRA
(v6.1#6144)