Are you thinking about storing the expiration time explicitly? Or, would it be reasonable to calculate it dynamically?
-Kelvin

On Wed, Jan 13, 2010 at 1:01 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
> I think that is more or less what Sylvain is proposing. The main
> downside is adding the extra 8 bytes for a long (or 4 for an int,
> which should actually be plenty of resolution for this use case) to
> each Column object.
>
> On Wed, Jan 13, 2010 at 4:57 PM, Kelvin Kakugawa <kakug...@gmail.com> wrote:
>> An alternative implementation that may be worth exploring would be to
>> modify IColumn's isMarkedForDelete() method to check TTL.
>>
>> It probably wouldn't be as performant as straight dropping SSTables.
>> You'd probably also need to periodically compact old tables to remove
>> expired rows. However, on the surface, it appears to be a more
>> seamless and fine-grained approach to this problem.
>>
>> -Kelvin
>>
>> A little more background:
>> db.IColumn is the shared interface that db.Column and db.SuperColumn
>> implement. db.Column's isMarkedForDelete() method only checks if a
>> flag has been set, right now. So, it would be relatively
>> straightforward to slip some logic into that method to check if its
>> timestamp has expired beyond some TTL.
>>
>> However, I suspect that there may be other methods that may need to be
>> slightly modified, as well. And, the compaction code would have to be
>> inspected to make sure that old tables are periodically compacted to
>> remove expired rows.
>>
>> On Wed, Jan 13, 2010 at 12:30 PM, Mark Robson <mar...@gmail.com> wrote:
>>> I also agree: Some mechanism to expire rolling data would be really good if
>>> we can incorporate it. Using the existing client interface, deleting old
>>> data is very cumbersome.
>>>
>>> We want to store lots of audit data in Cassandra, this will need to be
>>> expired eventually.
>>>
>>> Nodes should be able to do expiry locally without needing to talk to other
>>> nodes in the cluster. As we have a timestamp on everything anyway, can we
>>> not use that somehow?
>>>
>>> If we only ever append data rather than update it (or update it very
>>> rarely), can we somehow store timestamp ranges in each sstable file and then
>>> have the server know when it's time to expire one?
>>>
>>> I'm guessing here from my limited understanding of how Cassandra works.
>>>
>>> Mark
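
To make the explicit-versus-dynamic question concrete, here is a rough Java sketch of the two variants of the isMarkedForDelete() check discussed above. It is only an illustration, not Cassandra's actual code: ColumnSketch, its fields, and both method names are hypothetical, and it assumes the column timestamp is wall-clock milliseconds (client-supplied timestamps may not be).

```java
// Hypothetical stand-in for db.Column; names and fields are illustrative only.
final class ColumnSketch {
    final long timestampMillis;  // write timestamp, ASSUMED to be wall-clock milliseconds
    final boolean deleteFlag;    // the flag that isMarkedForDelete() checks today
    final int expiresAtSeconds;  // explicit expiration in epoch seconds; 0 means "no expiry"

    ColumnSketch(long timestampMillis, boolean deleteFlag, int expiresAtSeconds) {
        this.timestampMillis = timestampMillis;
        this.deleteFlag = deleteFlag;
        this.expiresAtSeconds = expiresAtSeconds;
    }

    // Dynamic variant: no extra bytes per column. Expiry is derived from the
    // existing timestamp plus a TTL supplied from elsewhere (for example, a
    // per-column-family setting, which is an assumption of this sketch).
    boolean isMarkedForDeleteDynamic(long nowMillis, long ttlMillis) {
        return deleteFlag || nowMillis - timestampMillis >= ttlMillis;
    }

    // Explicit variant: costs 4 extra bytes per column (the int field above),
    // but allows a different expiration for each column.
    boolean isMarkedForDeleteExplicit(long nowMillis) {
        return deleteFlag
            || (expiresAtSeconds != 0 && nowMillis / 1000 >= expiresAtSeconds);
    }
}
```

Either way, an expired column would only be hidden from reads by this check; as noted above, compaction would still need to drop it from disk.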