On Wed, Jan 13, 2010 at 2:30 PM, Mark Robson <mar...@gmail.com> wrote:

> I also agree: Some mechanism to expire rolling data would be really good if
> we can incorporate it. Using the existing client interface, deleting old
> data is very cumbersome.
>
> We want to store lots of audit data in Cassandra, this will need to be
> expired eventually.
>
> Nodes should be able to do expiry locally without needing to talk to other
> nodes in the cluster. As we have a timestamp on everything anyway, can we
> not use that somehow?
>
> If we only ever append data rather than update it (or update it very
> rarely), can we somehow store timestamp ranges in each sstable file and then
> have the server know when it's time to expire one?
>
I personally like this last option of expiring entire sstables. It seems
significantly more efficient than scrubbing data. The granularity might be a
bit coarse, but expiring by ColumnFamily seems a reasonable trade-off in the
short run for an easier solution.

For apps that don't want to see the old data, a read could also ignore any
data whose timestamp is older than the expire time on the ColumnFamily. Then,
once everything in an sstable is older than that expire time, the whole
sstable can be truncated.
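
To make the idea concrete, here is a rough sketch in Python. It is only an
illustration of the proposal, not Cassandra code: the SSTable and
ColumnFamily classes, the expire_after setting, and the integer timestamps
are all hypothetical names and assumptions made up for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class SSTable:
    """Hypothetical immutable table mapping key -> (timestamp, value)."""
    rows: Dict[str, Tuple[int, str]]

    @property
    def max_timestamp(self) -> int:
        # Newest timestamp in the file; -1 for an empty table
        # (assumes timestamps are non-negative).
        return max((ts for ts, _ in self.rows.values()), default=-1)


@dataclass
class ColumnFamily:
    """Hypothetical column family with a single expiry setting."""
    expire_after: int  # assumed per-ColumnFamily expire time, in timestamp units
    sstables: List[SSTable] = field(default_factory=list)

    def read(self, key: str, now: int) -> Optional[str]:
        """Return the newest unexpired value for key, ignoring old data."""
        cutoff = now - self.expire_after
        best: Optional[Tuple[int, str]] = None
        for table in self.sstables:
            hit = table.rows.get(key)
            if hit is None or hit[0] < cutoff:
                continue  # missing, or older than the expire time
            if best is None or hit[0] > best[0]:
                best = hit
        return best[1] if best is not None else None

    def expire(self, now: int) -> int:
        """Drop every sstable whose newest entry is older than the cutoff.

        Purely local: only this node's files and clock are consulted.
        Returns the number of sstables dropped.
        """
        cutoff = now - self.expire_after
        kept = [t for t in self.sstables if t.max_timestamp >= cutoff]
        dropped = len(self.sstables) - len(kept)
        self.sstables = kept
        return dropped
```

Note that an sstable holding a mix of old and new entries is kept whole, and
the read path hides its old entries until the newest one has aged out as
well. That is the granularity trade-off mentioned above.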

Logs are a great example of this.

- August
