I was digging into the LCS code lately and found the following comment (note the last paragraph, "that would be ideal, but we can't"):
" // The problem is that L0 has a much higher score (almost 250) than L1 (11), so what we'll // do is compact a batch of MAX_COMPACTING_L0 sstables with all 117 L1 sstables, and put the // result (say, 120 sstables) in L1. Then we'll compact the next batch of MAX_COMPACTING_L0, // and so forth. So we spend most of our i/o rewriting the L1 data with each batch. // // If we could just do *all* L0 a single time with L1, that would be ideal. But we can't // -- see the javadoc for MAX_COMPACTING_L0." And then when I read the MAX_COMPACTING_L0 javadoc referenced above: " /** * limit the number of L0 sstables we do at once, because compaction bloom filter creation * uses a pessimistic estimate of how many keys overlap (none), so we risk wasting memory * or even OOMing when compacting highly overlapping sstables */" I'm starting to wonder if this is still a concern post C* 2.1 given that we've implemented CASSANDRA-5906. Here is an excerpt from Jonathan's blog post ( http://www.datastax.com/dev/blog/improving-compaction-in-cassandra-with-cardinality-estimation) on what motivated 5906 to be implemented: "Because bloom filters are not re-sizeable, we need to pre-allocate them at the start of the compaction, but at the start of the compaction, we don’t know how much the sstables being compacted overlap. Since bloom filter performance deteriorates dramatically when over-filled, we allocate our bloom filters to be large enough even if the sstables do not overlap at all. Which means that if they do overlap (which they should if compaction is doing a good job picking candidates), then we waste space — up to 100% per sstable compacted." Since we have 5906 to address this very issue for a few years, does it make sense now to revisit MAX_COMPACTING_L0 choice (hard coded to 32) since the "bloom filter wasting memory" concern is no longer there? I would imagine this could have the potential of improving backlogged LCS behavior when we have thousands of L0 SSTables. Thanks. -Wei