Done https://issues.apache.org/jira/browse/CASSANDRA-604
On Fri, Dec 4, 2009 at 4:01 PM, Jonathan Ellis <[email protected]> wrote:
> Please do.
>
> On Fri, Dec 4, 2009 at 5:53 PM, Ramzi Rabah <[email protected]> wrote:
>> Thanks Jonathan.
>> Should I open a bug for this?
>>
>> Ray
>>
>> On Fri, Dec 4, 2009 at 3:47 PM, Jonathan Ellis <[email protected]> wrote:
>>> On Fri, Dec 4, 2009 at 5:32 PM, Ramzi Rabah <[email protected]> wrote:
>>>> Starting with fresh directories with no data and trying to do simple
>>>> inserts, I could not reproduce it *sigh*. Nothing is simple :(, so I
>>>> decided to dig deeper into the code.
>>>>
>>>> I was looking at the code for compaction, and this is a very noob
>>>> concern, so please bear with me if I'm way off; this code is all new
>>>> to me. When we are doing compactions during the normal course of
>>>> Cassandra, we call:
>>>>
>>>> for (List<SSTableReader> sstables :
>>>>      getCompactionBuckets(ssTables_, 50L * 1024L * 1024L))
>>>> {
>>>>     if (sstables.size() < minThreshold)
>>>>     {
>>>>         continue;
>>>>     }
>>>>     otherwise do compactions...
>>>>
>>>> where getCompactionBuckets puts in buckets very small files, or files
>>>> that are 0.5-1.5 of each other's sizes. It will only compact those if
>>>> they are >= minimum threshold, which is 4 by default.
>>>
>>> Exactly right.
>>>
>>>> So far so good. Now how about this scenario: I have an old entry that
>>>> I inserted a long time ago and that was compacted into a 75MB file.
>>>> There are fewer than 4 75MB files. I do many deletes, and I end up with
>>>> 4 extra sstable files filled with tombstones, each about 300 MB large.
>>>> These 4 files are compacted together, and in the compaction code, if
>>>> the tombstone is there we don't copy it over to the new file. Now
>>>> since we did not compact the 75MB files, but we compacted the
>>>> tombstone files, doesn't that leave us with the tombstone gone, but
>>>> the data still intact in the 75MB file?
>>>
>>> Also right. Glad you had a look! :)
>>>
>>> One relatively easy fix would be to only GC the tombstones if there
>>> are no SSTables left for that CF older than the ones being compacted.
>>> (So, a "major" compaction, which compacts all SSTables and is what
>>> nodeprobe invokes, would always GC eligible tombstones.)
>>>
>>> -Jonathan
>>>
>>
>
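
For anyone following the ticket, here is a rough sketch of the scenario and the proposed rule from the quoted thread. It is not the actual Cassandra code: the class TombstoneGcSketch, the Table type, and the methods buckets and canGcTombstones are hypothetical names made up for illustration, the bucketing is a simplified reading of what getCompactionBuckets is described as doing, and "older than the ones being compacted" is interpreted conservatively as "older than the newest SSTable in the compaction set".

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; none of these names exist in Cassandra.
class TombstoneGcSketch {
    /** Hypothetical stand-in for an SSTable: only size and age matter here. */
    static final class Table {
        final String name;
        final long sizeBytes;
        final long createdAt; // smaller value = older table

        Table(String name, long sizeBytes, long createdAt) {
            this.name = name;
            this.sizeBytes = sizeBytes;
            this.createdAt = createdAt;
        }
    }

    /**
     * Simplified bucketing: a table joins a bucket if its size is within
     * 0.5x-1.5x of the bucket's average size, or if both the table and the
     * bucket average are below minSize. Otherwise it starts a new bucket.
     */
    static List<List<Table>> buckets(List<Table> tables, long minSize) {
        List<List<Table>> result = new ArrayList<>();
        List<Double> averages = new ArrayList<>();
        for (Table t : tables) {
            boolean placed = false;
            for (int i = 0; i < result.size() && !placed; i++) {
                double avg = averages.get(i);
                boolean similar = t.sizeBytes > avg * 0.5 && t.sizeBytes < avg * 1.5;
                boolean bothSmall = t.sizeBytes < minSize && avg < minSize;
                if (similar || bothSmall) {
                    List<Table> bucket = result.get(i);
                    // Update the running average before adding the new member.
                    averages.set(i, (avg * bucket.size() + t.sizeBytes) / (bucket.size() + 1));
                    bucket.add(t);
                    placed = true;
                }
            }
            if (!placed) {
                List<Table> bucket = new ArrayList<>();
                bucket.add(t);
                result.add(bucket);
                averages.add((double) t.sizeBytes);
            }
        }
        return result;
    }

    /**
     * Proposed rule (as interpreted here): tombstones may be dropped only if
     * no table outside the compaction set is older than the newest table
     * inside it. A major compaction includes every table, so it always passes.
     */
    static boolean canGcTombstones(List<Table> all, List<Table> compacting) {
        long newest = Long.MIN_VALUE;
        for (Table t : compacting) {
            newest = Math.max(newest, t.createdAt);
        }
        for (Table t : all) {
            if (!compacting.contains(t) && t.createdAt < newest) {
                return false; // an older table may still hold the deleted data
            }
        }
        return true;
    }
}
```

With one old 75MB table and four newer 300MB tombstone tables, the four large tables land in their own bucket and reach the threshold of 4, but canGcTombstones returns false for that bucket because the 75MB table is older and is not part of the compaction. Passing every table, as a major compaction would, returns true.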
