Re: Efficient bulk range deletions without compactions by dropping SSTables.

2014-05-16 Thread graham sanderson
Just a few data points from our experience. One of our use cases involves storing a periodic full base state for millions of records, then fairly frequent delta updates to subsets of the records in between. C* is great for this because we can read the whole row (or up to the clustering

Re: Efficient bulk range deletions without compactions by dropping SSTables.

2014-05-16 Thread Paulo Ricardo Motta Gomes
Hello Kevin, in 2.0.X an SSTable is automatically dropped if it contains only tombstones: https://issues.apache.org/jira/browse/CASSANDRA-5228. However, this will most likely happen if you use LCS. STCS will create larger SSTables that will probably contain a mix of expired and unexpired data.
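
To illustrate the point above about whole-file drops, here is a toy model in Python. It is only a sketch under stated assumptions, not Cassandra's actual code: the `SSTable` class, the `expiry_times` field, and the `fully_expired` function are hypothetical names, and the model ignores details such as overlap checks against other SSTables. It shows why a file holding a mix of expired and unexpired data cannot be dropped as a unit.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SSTable:
    """Toy stand-in for an on-disk SSTable (hypothetical, not a Cassandra class)."""
    name: str
    expiry_times: List[int]  # per-cell expiry time, in seconds since the epoch


def fully_expired(sstable: SSTable, now: int, gc_grace_seconds: int) -> bool:
    """Return True when every cell expired before the GC cutoff.

    Assumption: a file is droppable as a unit only if its newest expiry
    time is older than now - gc_grace_seconds.
    """
    gc_before = now - gc_grace_seconds
    # An empty list means we know nothing about the file, so keep it.
    return bool(sstable.expiry_times) and max(sstable.expiry_times) < gc_before


now = 1_000_000
small_old = SSTable("small_old", [100, 200])        # everything long expired
large_mixed = SSTable("large_mixed", [100, 999_500])  # one cell still inside grace

for table in (small_old, large_mixed):
    print(table.name, fully_expired(table, now, gc_grace_seconds=1000))
```

In this model the large mixed file stays on disk until a compaction rewrites it, which matches the concern about STCS producing larger files with mixed contents.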

Re: Efficient bulk range deletions without compactions by dropping SSTables.

2014-05-15 Thread Kevin Burton
We basically do this same thing in one of our production clusters, but rather than dropping SSTables, we drop Column Families. We time-bucket our CFs, and when a CF has passed some time threshold (metadata or embedded in CF name), it is dropped. This means there is a home-grown system that

Re: Efficient bulk range deletions without compactions by dropping SSTables.

2014-05-14 Thread Jeremy Powell
Hi Kevin, C* version: 1.2.xx; Astyanax: 1.56.xx. We basically do this same thing in one of our production clusters, but rather than dropping SSTables, we drop Column Families. We time-bucket our CFs, and when a CF has passed some time threshold (metadata or embedded in CF name), it is dropped.
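
The time-bucketing scheme described above could be sketched roughly as follows in Python. This is not the poster's system: the `events_` prefix, the daily bucket size, the 30-day retention, the `logs` keyspace, and all function names are assumptions made for illustration. The sketch only builds statement strings and does not talk to a cluster; the `DROP TABLE` text is the CQL form, whereas a Thrift client such as Astyanax would use its own drop call.

```python
from datetime import datetime, timedelta, timezone

BUCKET_PREFIX = "events_"       # hypothetical CF name prefix
BUCKET_FORMAT = "%Y%m%d"        # assumed granularity: one CF per UTC day
BUCKET_LENGTH = timedelta(days=1)
RETENTION = timedelta(days=30)  # assumed time threshold


def bucket_name(ts: datetime) -> str:
    """Name of the CF that should receive a write made at time ts."""
    return BUCKET_PREFIX + ts.astimezone(timezone.utc).strftime(BUCKET_FORMAT)


def bucket_start(cf_name: str) -> datetime:
    """Recover the start of the bucket from the time embedded in the CF name."""
    stamp = cf_name[len(BUCKET_PREFIX):]
    return datetime.strptime(stamp, BUCKET_FORMAT).replace(tzinfo=timezone.utc)


def expired_buckets(cf_names, now, retention=RETENTION):
    """CFs whose whole bucket ended at or before now - retention."""
    cutoff = now - retention
    return sorted(
        name for name in cf_names
        if name.startswith(BUCKET_PREFIX)
        and bucket_start(name) + BUCKET_LENGTH <= cutoff
    )


def drop_statements(cf_names, now, keyspace="logs"):
    """Build (but do not execute) one drop statement per expired CF."""
    return [f"DROP TABLE {keyspace}.{name};" for name in expired_buckets(cf_names, now)]


now = datetime(2014, 5, 16, tzinfo=timezone.utc)
existing = ["events_20140401", "events_20140415", "events_20140416", "events_20140516"]
for statement in drop_statements(existing, now):
    print(statement)
```

Dropping a whole CF removes its data files as a unit, so no tombstones are written and no compaction is needed to reclaim the space; the cost is that readers and writers must route by bucket name.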

Efficient bulk range deletions without compactions by dropping SSTables.

2014-05-12 Thread Kevin Burton
We have a log-only data structure… everything is appended and nothing is ever updated. We should be totally fine with having lots of SSTables sitting on disk because, even if we did a major compaction, the data would still look the same. By 'lots' I mean maybe 1000 max, at maybe 1GB each. However,