Just a few data points from our experience
One of our use cases involves storing a periodic full base state for millions
of records, then fairly frequent delta updates to subsets of the records in
between. C* is great for this because we can read the whole row (or up to the
clustering
Hello Kevin,
In 2.0.X, an SSTable is dropped automatically if it contains only
tombstones: https://issues.apache.org/jira/browse/CASSANDRA-5228. However,
this is most likely to happen if you use LCS. STCS creates larger SSTables,
which will probably contain a mix of expired and unexpired data.
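To illustrate why larger SSTables are less likely to be dropped, here is a deliberately simplified Python model. It is only a sketch: an SSTable is represented as a plain list of cell expiry times, the function name is made up for illustration, and the checks Cassandra really performs (gc_grace, overlap with other SSTables) are ignored.

```python
def fully_expired(expiry_times, now):
    """Return True only if every cell in the (modelled) SSTable has expired."""
    return all(t <= now for t in expiry_times)


# Two small SSTables written at different times (hypothetical expiry values).
small_old = [100, 110, 120]
small_new = [500, 510]

# One large SSTable holding the same cells, as a bigger merge would produce.
merged = small_old + small_new

now = 200
print(fully_expired(small_old, now))  # the old small table can go as a unit
print(fully_expired(merged, now))     # the merged table still holds live data
```

In this model the small, old SSTable can be dropped as a whole, while the merged one has to wait until its newest cell expires.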
We basically do this same thing in one of our production clusters, but
rather than dropping SSTables, we drop Column Families. We time-bucket our
CFs, and when a CF has passed some time threshold (metadata or embedded in
CF name), it is dropped. This means there is a home-grown system that
Hi Kevin,
C* version: 1.2.xx
Astyanax: 1.56.xx
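The time-bucketing idea (one CF per time bucket, dropped once the bucket is older than some threshold) could be sketched roughly as follows. This is not our actual system: the `events_` prefix, the daily bucket size, and the function names are assumptions for illustration, and the sketch only builds the `DROP TABLE` statements rather than running them through a client.

```python
from datetime import date, datetime, timedelta

PREFIX = "events_"  # hypothetical CF name prefix; the date is embedded after it


def bucket_name(day):
    """CF name for the daily bucket that contains `day`."""
    return f"{PREFIX}{day:%Y%m%d}"


def expired_buckets(cf_names, today, retention_days):
    """Return the bucketed CFs whose embedded date is older than the cutoff."""
    cutoff = today - timedelta(days=retention_days)
    expired = []
    for name in cf_names:
        if not name.startswith(PREFIX):
            continue  # not one of the time-bucketed CFs
        try:
            day = datetime.strptime(name[len(PREFIX):], "%Y%m%d").date()
        except ValueError:
            continue  # suffix is not a date; leave the CF alone
        if day < cutoff:
            expired.append(name)
    return sorted(expired)


def drop_statements(cf_names, today, retention_days):
    """CQL statements that would drop every expired bucket."""
    return [f"DROP TABLE {name};"
            for name in expired_buckets(cf_names, today, retention_days)]
```

For example, with CFs `events_20140101`, `events_20140110` and `events_20140120`, a 7-day retention and "today" set to 2014-01-21, the first two would be selected for dropping. Writers would use something like `bucket_name()` to pick the CF for new data.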
We have a log-only data structure… everything is appended and nothing is
ever updated.
We should be totally fine with having lots of SSTables sitting on disk
because even if we did a major compaction the data would still look the
same.
By 'lots' I mean maybe 1000 max. Maybe 1GB each.
However,