Ah, that explains part of the problem. The whole situation still doesn't make a lot of sense to me, unless the answer is that the default sstable size with leveled compaction is just no good for large datasets. I restarted cassandra a few hours ago and it had to open about 32k files at start-up, which took about 15 minutes. That just can't be good...
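For anyone following along, here's a quick way to keep an eye on the open-file situation, in plain shell. The pgrep pattern is just a guess at how the daemon shows up in your process list, so adjust as needed:

    # count the files the cassandra JVM currently holds open
    lsof -p "$(pgrep -f CassandraDaemon)" | wc -l

    # per-process open-file limit in effect for this shell/user
    ulimit -n

    # to raise it permanently, add something like this to
    # /etc/security/limits.conf (assuming the daemon runs as the
    # "cassandra" user), then restart the daemon:
    #   cassandra  -  nofile  100000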
I also noticed that when using compression, the sstable size you specify is the uncompressed size, so the actual files on disk end up smaller. I've now upped the sstable size to 100MB, which should come out to about 40MB files in my case (the cassandra-cli statement I used is sketched at the bottom of this mail, below the quoted thread). Is there a way I can "compact" some of the existing sstables that are small? For example, I have a level-4 sstable that is only 56KB, and many more that are rather small. Does nodetool compact do anything under leveled compaction?

On 1/18/2012 2:39 AM, Janne Jalkanen wrote:
>
> 1.0.6 has a file leak problem, fixed in 1.0.7. Perhaps this is the reason?
>
> https://issues.apache.org/jira/browse/CASSANDRA-3616
>
> /Janne
>
> On Jan 18, 2012, at 03:52 , dir dir wrote:
>
>> Very interesting... Why do you have so many files open? What kind of
>> system are you building that ends up with this many open files? Would
>> you tell us? Thanks...
>>
>> On Sat, Jan 14, 2012 at 2:01 AM, Thorsten von Eicken
>> <t...@rightscale.com> wrote:
>>
>>     I'm running a single-node cassandra 1.0.6 server which hit a wall
>>     yesterday:
>>
>>     ERROR [CompactionExecutor:2918] 2012-01-12 20:37:06,327
>>     AbstractCassandraDaemon.java (line 133) Fatal exception in thread
>>     Thread[CompactionExecutor:2918,1,main] java.io.IOError:
>>     java.io.FileNotFoundException:
>>     /mnt/ebs/data/rslog_production/req_word_idx-hc-453661-Data.db
>>     (Too many open files in system)
>>
>>     After that it stopped working and just sat there with this error
>>     (understandable). I did an lsof and saw that it had 98567 open
>>     files, yikes! An ls in the data directory shows 234011 files. After
>>     restarting, it spent about 5 hours compacting, then quieted down,
>>     with about 173k files left in the data directory. I'm using leveled
>>     compaction (with compression). I looked into the json manifests of
>>     the two large CFs: gen 0 is empty and most sstables are at gen 3
>>     and 4. I have a total of about 150GB of data (compressed), and
>>     almost all the sstables are around 3MB in size. Aren't they
>>     supposed to get 10x bigger at higher gens?
>>
>>     This situation can't be healthy, can it? Suggestions?
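For the archives, the sstable size change mentioned above was done with something along these lines via cassandra-cli (syntax from memory, so double-check against your version's docs; rslog_production/req_word_idx is just one of my keyspaces/CFs):

    $ cassandra-cli -h localhost <<'EOF'
    use rslog_production;
    update column family req_word_idx
      with compaction_strategy = 'LeveledCompactionStrategy'
      and compaction_strategy_options = {sstable_size_in_mb: 100};
    EOF

As far as I can tell this only affects sstables written from now on; the existing small ones stay small until they happen to get compacted again, hence my question about nodetool compact.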