Re: Major compaction does not seems to free the disk space a lot if wide rows are used.

2013-05-18 Thread Boris Yen
Thank you for the reply. It is really helpful. We will take a look at the patch to see if we can apply it to the 1.0 branch, or try to work around it by changing our application implementation. Regards, Boris On Thu, May 16, 2013 at 10:43 PM, Yuki Morishita mor.y...@gmail.com wrote: You are right

Major compaction does not seems to free the disk space a lot if wide rows are used.

2013-05-16 Thread Boris Yen
Hi All, Sorry for the wide distribution. Our Cassandra cluster is running 1.0.10. Recently, we have been facing a weird situation. We have a column family containing wide rows (each row might have a few million columns). We delete the columns on a daily basis and we also run major compaction on it

Re: Major compaction does not seems to free the disk space a lot if wide rows are used.

2013-05-16 Thread Yuki Morishita
You are right about the behavior of Cassandra compaction. It checks whether the key exists in other SSTable files that are not in the compaction set. I think https://issues.apache.org/jira/browse/CASSANDRA-4671 would help if you upgrade to the latest 1.2, but in your version, I think the workaround is

Re: Major compaction does not seems to free the disk space a lot if wide rows are used.

2013-05-16 Thread Edward Capriolo
This makes sense. Unless you are running major compaction, a delete could only happen if the bloom filters confirmed the row was not in the sstables not being compacted. If your rows are wide, the odds are that they are in most/all sstables, and then finally removing them would be tricky. On Thu,
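
The check described in the replies above can be sketched roughly as follows. This is a toy model in Python, not Cassandra code: the SSTable class, might_contain, and can_purge_tombstones are hypothetical names, a plain set stands in for the bloom filter (a real one can give false positives), and other conditions such as the tombstone grace period are ignored.

```python
from dataclasses import dataclass, field


@dataclass
class SSTable:
    """Toy stand-in for an SSTable: only the row keys it holds."""
    name: str
    row_keys: set = field(default_factory=set)

    def might_contain(self, key):
        # Assumption: exact set membership replaces the bloom filter
        # so that the sketch stays deterministic.
        return key in self.row_keys


def can_purge_tombstones(key, all_sstables, compaction_set):
    """Deleted data for `key` can be dropped only if no SSTable outside
    the compaction set might still hold that row."""
    compacting = {s.name for s in compaction_set}
    outside = [s for s in all_sstables if s.name not in compacting]
    return not any(s.might_contain(key) for s in outside)


sstables = [
    SSTable("a", {"wide_row", "small_row"}),
    SSTable("b", {"wide_row"}),
    SSTable("c", {"wide_row"}),
]

# Compacting only "a" and "b": the wide row also lives in "c".
minor = can_purge_tombstones("wide_row", sstables, sstables[:2])
# The small row lives only in "a", which is in the compaction set.
small = can_purge_tombstones("small_row", sstables, sstables[:2])
# Every SSTable in the model is in the compaction set.
major = can_purge_tombstones("wide_row", sstables, sstables)
```

In this model a wide row that is spread across most SSTables fails the check whenever at least one SSTable holding it is left out of the compaction set, which matches the point that such rows are hard to remove.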