[
https://issues.apache.org/jira/browse/CASSANDRA-8964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14362536#comment-14362536
]
Anthony Fisk commented on CASSANDRA-8964:
-----------------------------------------
Thanks. We'll definitely be increasing the number of file handles.
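For reference, a quick sketch of checking the current limit and raising it persistently on CentOS 6. The 100000 value, the limits.d file name, and the pgrep pattern are common practice / assumptions, not something from this ticket:

```shell
# Check the soft and hard open-file limits for the current user/shell.
ulimit -Sn
ulimit -Hn

# Count descriptors actually held by a running Cassandra process
# (the pgrep pattern is an assumption about how the daemon is named):
#   lsof -p "$(pgrep -f CassandraDaemon)" | wc -l

# Raise the limit persistently by adding a line like the following to
# /etc/security/limits.d/cassandra.conf (user "cassandra" assumed):
#   cassandra - nofile 100000
```

The new limit only applies after the Cassandra user logs in again and the process is restarted.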
The only other exceptions I see in the logs all take the following form:
{code}
ERROR [IndexSummaryManager:1] 2015-03-11 04:43:54,201 CassandraDaemon.java:153 - Exception in thread Thread[IndexSummaryManager:1,1,main]
java.lang.AssertionError: null
	at org.apache.cassandra.io.util.Memory.size(Memory.java:307) ~[apache-cassandra-2.1.2.jar:2.1.2]
	at org.apache.cassandra.io.sstable.IndexSummary.getOffHeapSize(IndexSummary.java:192) ~[apache-cassandra-2.1.2.jar:2.1.2]
	at org.apache.cassandra.io.sstable.SSTableReader.getIndexSummaryOffHeapSize(SSTableReader.java:1070) ~[apache-cassandra-2.1.2.jar:2.1.2]
	at org.apache.cassandra.io.sstable.IndexSummaryManager.redistributeSummaries(IndexSummaryManager.java:292) ~[apache-cassandra-2.1.2.jar:2.1.2]
	at org.apache.cassandra.io.sstable.IndexSummaryManager.redistributeSummaries(IndexSummaryManager.java:238) ~[apache-cassandra-2.1.2.jar:2.1.2]
	at org.apache.cassandra.io.sstable.IndexSummaryManager$1.runMayThrow(IndexSummaryManager.java:139) ~[apache-cassandra-2.1.2.jar:2.1.2]
	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[apache-cassandra-2.1.2.jar:2.1.2]
	at org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:77) ~[apache-cassandra-2.1.2.jar:2.1.2]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_51]
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) [na:1.7.0_51]
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) [na:1.7.0_51]
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [na:1.7.0_51]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_51]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_51]
	at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51]
{code}
They occurred at the following times:
* 2015-03-10 08:41:34,647
* 2015-03-11 04:43:54,201
* 2015-03-11 23:38:03,741
* 2015-03-12 02:38:05,106
* 2015-03-12 03:38:05,620
* 2015-03-12 04:38:05,643
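For anyone collecting the same occurrence times, a small sketch of pulling the timestamps out of system.log. The embedded sample log is purely illustrative; point LOG at the real file (e.g. /var/log/cassandra/system.log) instead:

```shell
# Sketch: extract the timestamps of the IndexSummaryManager
# AssertionError occurrences from a Cassandra system.log.
LOG=$(mktemp)
cat > "$LOG" <<'EOF'
ERROR [IndexSummaryManager:1] 2015-03-11 04:43:54,201 CassandraDaemon.java:153 - Exception in thread Thread[IndexSummaryManager:1,1,main]
java.lang.AssertionError: null
INFO  [CompactionExecutor:4] 2015-03-11 05:00:00,000 unrelated line
ERROR [IndexSummaryManager:1] 2015-03-11 23:38:03,741 CassandraDaemon.java:153 - Exception in thread Thread[IndexSummaryManager:1,1,main]
java.lang.AssertionError: null
EOF

# Keep the log line immediately preceding each AssertionError and strip
# it down to its "YYYY-MM-DD HH:MM:SS,mmm" timestamp.
TIMES=$(grep -B1 'java.lang.AssertionError' "$LOG" \
  | grep -o '[0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\} [0-9:,]\{12\}')
echo "$TIMES"
rm -f "$LOG"
```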
> SSTable count rises during compactions and max open files exceeded
> ------------------------------------------------------------------
>
> Key: CASSANDRA-8964
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8964
> Project: Cassandra
> Issue Type: Bug
> Environment: Apache Cassandra 2.1.2
> Centos 6
> AWS EC2
> i2.2xlarge
> Reporter: Anthony Fisk
> Priority: Critical
> Attachments: lsof_with_tmp.txt, lsof_without_tmp.txt,
> nodetool_cfstats.zip
>
>
> LCS compaction was not able to keep up with the prolonged insert load on one
> of our tables, called log, resulting in 2,185 SSTables for that table and
> 1,779 pending compactions altogether during a test we were running.
> We stopped our load, unthrottled compaction throughput, increased the
> concurrent compactors from 2 to 8, and let it compact the SSTables.
> All was going well until the SSTable count for our log table got down to
> around 97, then began rising again until it reached 758 SSTables 1.5 hours
> later... (we've been recording the cfstats output every half hour,
> [attached|^nodetool_cfstats.zip])
> Eventually we exceeded the number of open files:
> {code}
> ERROR [MemtableFlushWriter:286] 2015-03-12 13:44:36,748 CassandraDaemon.java:153 - Exception in thread Thread[MemtableFlushWriter:286,5,main]
> java.lang.RuntimeException: java.io.FileNotFoundException: /mnt/cassandra/data/system/compactions_in_progress-55080ab05d9c388690a4acb25fe1f77b/system-compactions_in_progress-tmp-ka-6618-Index.db (Too many open files)
> 	at org.apache.cassandra.io.util.SequentialWriter.<init>(SequentialWriter.java:75) ~[apache-cassandra-2.1.2.jar:2.1.2]
> 	at org.apache.cassandra.io.util.SequentialWriter.open(SequentialWriter.java:104) ~[apache-cassandra-2.1.2.jar:2.1.2]
> 	at org.apache.cassandra.io.util.SequentialWriter.open(SequentialWriter.java:99) ~[apache-cassandra-2.1.2.jar:2.1.2]
> 	at org.apache.cassandra.io.sstable.SSTableWriter$IndexWriter.<init>(SSTableWriter.java:552) ~[apache-cassandra-2.1.2.jar:2.1.2]
> 	at org.apache.cassandra.io.sstable.SSTableWriter.<init>(SSTableWriter.java:134) ~[apache-cassandra-2.1.2.jar:2.1.2]
> 	at org.apache.cassandra.db.Memtable$FlushRunnable.createFlushWriter(Memtable.java:390) ~[apache-cassandra-2.1.2.jar:2.1.2]
> 	at org.apache.cassandra.db.Memtable$FlushRunnable.writeSortedContents(Memtable.java:329) ~[apache-cassandra-2.1.2.jar:2.1.2]
> 	at org.apache.cassandra.db.Memtable$FlushRunnable.runWith(Memtable.java:313) ~[apache-cassandra-2.1.2.jar:2.1.2]
> 	at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48) ~[apache-cassandra-2.1.2.jar:2.1.2]
> 	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[apache-cassandra-2.1.2.jar:2.1.2]
> 	at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297) ~[guava-16.0.jar:na]
> 	at org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1037) ~[apache-cassandra-2.1.2.jar:2.1.2]
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_51]
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) ~[na:1.7.0_51]
> 	at java.lang.Thread.run(Thread.java:744) ~[na:1.7.0_51]
> Caused by: java.io.FileNotFoundException: /mnt/cassandra/data/system/compactions_in_progress-55080ab05d9c388690a4acb25fe1f77b/system-compactions_in_progress-tmp-ka-6618-Index.db (Too many open files)
> 	at java.io.RandomAccessFile.open(Native Method) ~[na:1.7.0_51]
> 	at java.io.RandomAccessFile.<init>(RandomAccessFile.java:241) ~[na:1.7.0_51]
> 	at org.apache.cassandra.io.util.SequentialWriter.<init>(SequentialWriter.java:71) ~[apache-cassandra-2.1.2.jar:2.1.2]
> 	... 14 common frames omitted
> ERROR [MemtableFlushWriter:286] 2015-03-12 13:44:36,750 JVMStabilityInspector.java:94 - JVM state determined to be unstable. Exiting forcefully due to:
> java.io.FileNotFoundException: /mnt/cassandra/data/system/compactions_in_progress-55080ab05d9c388690a4acb25fe1f77b/system-compactions_in_progress-tmp-ka-6618-Index.db (Too many open files)
> 	at java.io.RandomAccessFile.open(Native Method) ~[na:1.7.0_51]
> 	at java.io.RandomAccessFile.<init>(RandomAccessFile.java:241) ~[na:1.7.0_51]
> 	at org.apache.cassandra.io.util.SequentialWriter.<init>(SequentialWriter.java:71) ~[apache-cassandra-2.1.2.jar:2.1.2]
> 	at org.apache.cassandra.io.util.SequentialWriter.open(SequentialWriter.java:104) ~[apache-cassandra-2.1.2.jar:2.1.2]
> 	at org.apache.cassandra.io.util.SequentialWriter.open(SequentialWriter.java:99) ~[apache-cassandra-2.1.2.jar:2.1.2]
> 	at org.apache.cassandra.io.sstable.SSTableWriter$IndexWriter.<init>(SSTableWriter.java:552) ~[apache-cassandra-2.1.2.jar:2.1.2]
> 	at org.apache.cassandra.io.sstable.SSTableWriter.<init>(SSTableWriter.java:134) ~[apache-cassandra-2.1.2.jar:2.1.2]
> 	at org.apache.cassandra.db.Memtable$FlushRunnable.createFlushWriter(Memtable.java:390) ~[apache-cassandra-2.1.2.jar:2.1.2]
> 	at org.apache.cassandra.db.Memtable$FlushRunnable.writeSortedContents(Memtable.java:329) ~[apache-cassandra-2.1.2.jar:2.1.2]
> 	at org.apache.cassandra.db.Memtable$FlushRunnable.runWith(Memtable.java:313) ~[apache-cassandra-2.1.2.jar:2.1.2]
> 	at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48) ~[apache-cassandra-2.1.2.jar:2.1.2]
> 	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[apache-cassandra-2.1.2.jar:2.1.2]
> 	at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297) ~[guava-16.0.jar:na]
> 	at org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1037) ~[apache-cassandra-2.1.2.jar:2.1.2]
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_51]
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) ~[na:1.7.0_51]
> 	at java.lang.Thread.run(Thread.java:744) ~[na:1.7.0_51]
> {code}
> I restarted Cassandra and the number of SSTables dropped to 81 before
> increasing again... Looking at the lsof output, the file handles for the log
> table were also continually increasing: roughly half [were for "tmp"
> files|^lsof_with_tmp.txt] and the other half [were for the directory of the
> table|^lsof_without_tmp.txt]. Also of note, only one compactor seemed to be
> running on the log table, rather than the 8 we had seen earlier after
> changing the configuration and restarting.
> Cassandra eventually crashed again with the same error when it reached 6,442
> open files (seen in lsof). Without Cassandra running, our system has 1,126
> open files.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)