[
https://issues.apache.org/jira/browse/CASSANDRA-8964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Anthony Fisk updated CASSANDRA-8964:
------------------------------------
Description:
LCS compaction was not able to keep up with the prolonged insert load on one of
our tables, called log, resulting in 2,185 SSTables for that table and 1,779
pending compactions altogether during a test we were running.
We stopped our load, unthrottled compaction throughput, increased the
concurrent compactors from 2 to 8, and let it compact the SSTables.
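For reference, the tuning steps above can be sketched as follows; this is a hedged sketch that assumes {{nodetool}} is on the PATH, and in 2.1 {{concurrent_compactors}} is a cassandra.yaml setting that only takes effect on restart:

```shell
# Hedged sketch of the tuning steps above (assumes 'nodetool' is on PATH).
# Wrapped in a function so sourcing this file has no side effects.
unthrottle_compaction() {
  # 0 disables compaction throughput throttling entirely
  nodetool setcompactionthroughput 0
}

# concurrent_compactors is set in cassandra.yaml in 2.1 and requires a
# restart to take effect:
#   concurrent_compactors: 8
```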
All was going well until the SSTable count for our log table got down to around
97; it then began rising again until it had reached 758 SSTables 1.5 hours
later... (we've been recording the cfstats output every half hour,
[attached|^nodetool_cfstats.zip])
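The half-hourly recording mentioned above amounts to a small loop like the one below (a sketch; {{ks}} is a placeholder keyspace name, and {{nodetool}} is assumed to be on the PATH):

```shell
# Capture cfstats for the log table every half hour, with a timestamp.
# 'ks' is a placeholder keyspace name; substitute the real one. Defined as
# a function so sourcing this file does not start the loop.
capture_cfstats() {
  while true; do
    date >> cfstats_history.txt
    nodetool cfstats ks.log >> cfstats_history.txt
    sleep 1800   # 30 minutes
  done
}
```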
Eventually we exceeded the number of open files:
{code}
ERROR [MemtableFlushWriter:286] 2015-03-12 13:44:36,748
CassandraDaemon.java:153 - Exception in thread
Thread[MemtableFlushWriter:286,5,main]
java.lang.RuntimeException: java.io.FileNotFoundException:
/mnt/cassandra/data/system/compactions_in_progress-55080ab05d9c388690a4acb25fe1f77b/system-compactions_in_progress-tmp-ka-6618-Index.db
(Too many open files)
at
org.apache.cassandra.io.util.SequentialWriter.<init>(SequentialWriter.java:75)
~[apache-cassandra-2.1.2.jar:2.1.2]
at
org.apache.cassandra.io.util.SequentialWriter.open(SequentialWriter.java:104)
~[apache-cassandra-2.1.2.jar:2.1.2]
at
org.apache.cassandra.io.util.SequentialWriter.open(SequentialWriter.java:99)
~[apache-cassandra-2.1.2.jar:2.1.2]
at
org.apache.cassandra.io.sstable.SSTableWriter$IndexWriter.<init>(SSTableWriter.java:552)
~[apache-cassandra-2.1.2.jar:2.1.2]
at
org.apache.cassandra.io.sstable.SSTableWriter.<init>(SSTableWriter.java:134)
~[apache-cassandra-2.1.2.jar:2.1.2]
at
org.apache.cassandra.db.Memtable$FlushRunnable.createFlushWriter(Memtable.java:390)
~[apache-cassandra-2.1.2.jar:2.1.2]
at
org.apache.cassandra.db.Memtable$FlushRunnable.writeSortedContents(Memtable.java:329)
~[apache-cassandra-2.1.2.jar:2.1.2]
at
org.apache.cassandra.db.Memtable$FlushRunnable.runWith(Memtable.java:313)
~[apache-cassandra-2.1.2.jar:2.1.2]
at
org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
~[apache-cassandra-2.1.2.jar:2.1.2]
at
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
~[apache-cassandra-2.1.2.jar:2.1.2]
at
com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
~[guava-16.0.jar:na]
at
org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1037)
~[apache-cassandra-2.1.2.jar:2.1.2]
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
~[na:1.7.0_51]
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
~[na:1.7.0_51]
at java.lang.Thread.run(Thread.java:744) ~[na:1.7.0_51]
Caused by: java.io.FileNotFoundException:
/mnt/cassandra/data/system/compactions_in_progress-55080ab05d9c388690a4acb25fe1f77b/system-compactions_in_progress-tmp-ka-6618-Index.db
(Too many open files)
at java.io.RandomAccessFile.open(Native Method) ~[na:1.7.0_51]
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:241)
~[na:1.7.0_51]
at
org.apache.cassandra.io.util.SequentialWriter.<init>(SequentialWriter.java:71)
~[apache-cassandra-2.1.2.jar:2.1.2]
... 14 common frames omitted
ERROR [MemtableFlushWriter:286] 2015-03-12 13:44:36,750
JVMStabilityInspector.java:94 - JVM state determined to be unstable. Exiting
forcefully due to:
java.io.FileNotFoundException:
/mnt/cassandra/data/system/compactions_in_progress-55080ab05d9c388690a4acb25fe1f77b/system-compactions_in_progress-tmp-ka-6618-Index.db
(Too many open files)
at java.io.RandomAccessFile.open(Native Method) ~[na:1.7.0_51]
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:241)
~[na:1.7.0_51]
at
org.apache.cassandra.io.util.SequentialWriter.<init>(SequentialWriter.java:71)
~[apache-cassandra-2.1.2.jar:2.1.2]
at
org.apache.cassandra.io.util.SequentialWriter.open(SequentialWriter.java:104)
~[apache-cassandra-2.1.2.jar:2.1.2]
at
org.apache.cassandra.io.util.SequentialWriter.open(SequentialWriter.java:99)
~[apache-cassandra-2.1.2.jar:2.1.2]
at
org.apache.cassandra.io.sstable.SSTableWriter$IndexWriter.<init>(SSTableWriter.java:552)
~[apache-cassandra-2.1.2.jar:2.1.2]
at
org.apache.cassandra.io.sstable.SSTableWriter.<init>(SSTableWriter.java:134)
~[apache-cassandra-2.1.2.jar:2.1.2]
at
org.apache.cassandra.db.Memtable$FlushRunnable.createFlushWriter(Memtable.java:390)
~[apache-cassandra-2.1.2.jar:2.1.2]
at
org.apache.cassandra.db.Memtable$FlushRunnable.writeSortedContents(Memtable.java:329)
~[apache-cassandra-2.1.2.jar:2.1.2]
at
org.apache.cassandra.db.Memtable$FlushRunnable.runWith(Memtable.java:313)
~[apache-cassandra-2.1.2.jar:2.1.2]
at
org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
~[apache-cassandra-2.1.2.jar:2.1.2]
at
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
~[apache-cassandra-2.1.2.jar:2.1.2]
at
com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
~[guava-16.0.jar:na]
at
org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1037)
~[apache-cassandra-2.1.2.jar:2.1.2]
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
~[na:1.7.0_51]
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
~[na:1.7.0_51]
at java.lang.Thread.run(Thread.java:744) ~[na:1.7.0_51]
{code}
I restarted Cassandra and the SSTable count dropped to 81 before increasing
again... Looking at lsof output, the file handles for the log table also
appeared to be continually increasing: roughly half of them [were for "tmp"
files|^lsof_with_tmp.txt] and the other half [were for the directory of the
table|^lsof_without_tmp.txt]. Also of note, only one compactor seemed to be
running on the log table, rather than the 8 we had seen after changing the
configuration and restarting.
Cassandra eventually crashed again with the same error when it reached 6,442
open files (as seen in lsof). Without Cassandra running, our system has 1,126
open files.
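One way to watch the descriptor growth described above is to count the entries under /proc/<pid>/fd on Linux (a sketch; the pgrep pattern mentioned in the comment is an assumption about how the daemon shows up in the process list):

```shell
# Count open file descriptors for a given PID via /proc (Linux only).
count_fds() {
  ls "/proc/$1/fd" | wc -l
}

# Per-process open-file limit for the current shell:
ulimit -n

# Demonstrated here on the current shell's PID; for Cassandra one would
# substitute something like $(pgrep -f CassandraDaemon) -- an assumed
# pattern, adjust to how the daemon appears in ps output.
count_fds $$
```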
> SSTable count rises during compactions and max open files exceeded
> ------------------------------------------------------------------
>
> Key: CASSANDRA-8964
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8964
> Project: Cassandra
> Issue Type: Bug
> Environment: Apache Cassandra 2.1.2
> Centos 6
> AWS EC2
> i2.2xlarge
> Reporter: Anthony Fisk
> Priority: Critical
> Attachments: lsof_with_tmp.txt, lsof_without_tmp.txt,
> nodetool_cfstats.zip
>
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)