You are right about the behavior of Cassandra compaction.
It checks whether the key exists in other SSTable files that are not
part of the compaction set.

I think https://issues.apache.org/jira/browse/CASSANDRA-4671 would
help if you upgrade to the latest 1.2,
but on your version, I think the workaround is to stop writes so that
nothing new gets flushed, then compact.
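
To make the purge rule concrete, here is a minimal sketch (not Cassandra's actual code; the class and method names are hypothetical) of the condition described above: a column tombstone can only be dropped during compaction if its row key does not appear in any SSTable outside the compaction set, which is why an SSTable flushed mid-compaction blocks the purge.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model, for illustration only: an SSTable is reduced to
// the set of row keys it contains.
public class TombstonePurgeSketch {

    // A tombstone for rowKey may be purged only if no SSTable outside
    // the compaction set also contains that key.
    static boolean canPurgeTombstone(String rowKey,
                                     List<Set<String>> compactingSSTables,
                                     List<Set<String>> allSSTables) {
        for (Set<String> sstable : allSSTables) {
            if (compactingSSTables.contains(sstable)) {
                continue; // part of the compaction set, will be rewritten
            }
            if (sstable.contains(rowKey)) {
                return false; // key lives elsewhere: tombstone must be kept
            }
        }
        return true; // key absent outside the compaction set: safe to purge
    }

    public static void main(String[] args) {
        Set<String> old1 = new HashSet<>(Arrays.asList("wide-row-A"));
        Set<String> old2 = new HashSet<>(Arrays.asList("wide-row-A", "wide-row-B"));
        // A memtable flushed while compaction runs creates a fresh SSTable
        // holding the same row key (plus newer columns).
        Set<String> flushedDuringCompaction =
                new HashSet<>(Arrays.asList("wide-row-A", "wide-row-C"));

        List<Set<String>> compacting = Arrays.asList(old1, old2);

        // Writes still arriving: the freshly flushed SSTable blocks the purge.
        List<Set<String>> withWrites =
                Arrays.asList(old1, old2, flushedDuringCompaction);
        System.out.println(
            canPurgeTombstone("wide-row-A", compacting, withWrites)); // false

        // Writes stopped before compaction: every copy of the key is
        // inside the compaction set, so the tombstone can be dropped.
        List<Set<String>> writesStopped = Arrays.asList(old1, old2);
        System.out.println(
            canPurgeTombstone("wide-row-A", compacting, writesStopped)); // true
    }
}
```

This matches the behavior Boris observed: with writes flowing, new SSTables keep re-introducing the row keys, so tombstones survive the major compaction; with writes stopped, the purge succeeds and disk usage drops.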

On Thu, May 16, 2013 at 3:07 AM, Boris Yen <yulin...@gmail.com> wrote:
> Hi All,
>
> Sorry for the wide distribution.
>
> Our cassandra is running on 1.0.10. Recently, we are facing a weird
> situation. We have a column family containing wide rows (each row might
> have a few million columns). We delete the columns on a daily basis, and
> we also run a major compaction every day to free up disk space (the
> gc_grace is set to 600 seconds).
>
> However, every time we run the major compaction, only 1 or 2GB disk space
> is freed. We tried to delete most of the data before running compaction,
> however, the result is pretty much the same.
>
> So, we tried to check the source code. It seems that the column tombstones
> can only be purged when the row key is not in other sstables. I know the
> major compaction should include all sstables; however, in our use case,
> columns get inserted rapidly. This makes cassandra flush the
> memtables to disk and create new sstables. The newly created sstables will
> have the same keys as the sstables that are being compacted (the compaction
> takes 2 or 3 hours to finish). My question is: could these newly
> created sstables be the reason most of the column tombstones are not being
> purged?
>
> p.s. We also did some other tests. We inserted data to the same CF with the
> same wide-row pattern and deleted most of the data. This time we stopped
> all the writes to cassandra and did the compaction. The disk usage
> decreased dramatically.
>
> Any suggestions, or is this a known issue?
>
> Thanks and Regards,
> Boris



-- 
Yuki Morishita
 t:yukim (http://twitter.com/yukim)
