Are you thinking about storing the expiration time explicitly?  Or,
would it be reasonable to calculate it dynamically?

-Kelvin

On Wed, Jan 13, 2010 at 1:01 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
> I think that is more or less what Sylvain is proposing.  The main
> downside is adding the extra 8 bytes for a long (or 4 for an int,
> which should actually be plenty of resolution for this use case) to
> each Column object.
>
> On Wed, Jan 13, 2010 at 4:57 PM, Kelvin Kakugawa <kakug...@gmail.com> wrote:
>> An alternative implementation that may be worth exploring would be to
>> modify IColumn's isMarkedForDelete() method to check TTL.
>>
>> It probably wouldn't be as performant as straight dropping SSTables.
>> You'd probably also need to periodically compact old tables to remove
>> expired rows.  However, on the surface, it appears to be a more
>> seamless and fine-grained approach to this problem.
>>
>> -Kelvin
>>
>> A little more background:
>> db.IColumn is the shared interface that db.Column and db.SuperColumn
>> implement.  db.Column's isMarkedForDelete() method only checks if a
>> flag has been set, right now.  So, it would be relatively
>> straightforward to slip some logic into that method to check if its
>> timestamp has expired beyond some TTL.
>>
>> However, I suspect that other methods may need to be slightly
>> modified as well.  And, the compaction code would have to be
>> inspected to make sure that old tables are periodically compacted to
>> remove expired rows.
>>
>> On Wed, Jan 13, 2010 at 12:30 PM, Mark Robson <mar...@gmail.com> wrote:
>>> I also agree: Some mechanism to expire rolling data would be really good if
>>> we can incorporate it. Using the existing client interface, deleting old
>>> data is very cumbersome.
>>>
>>> We want to store lots of audit data in Cassandra, and it will need to be
>>> expired eventually.
>>>
>>> Nodes should be able to do expiry locally without needing to talk to other
>>> nodes in the cluster. As we have a timestamp on everything anyway, can we
>>> not use that somehow?
>>>
>>> If we only ever append data rather than update it (or update it very
>>> rarely), can we somehow store timestamp ranges in each sstable file and then
>>> have the server know when it's time to expire one?
>>>
>>> I'm guessing here from my limited understanding of how Cassandra works.
>>>
>>> Mark
>>>
>>
>
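
The two options raised in this thread (storing an explicit expiration time
on each column versus calculating expiry dynamically from the existing
timestamp) can be contrasted with a minimal sketch. Everything in it is
hypothetical: SketchColumn is a simplified stand-in and not Cassandra's
db.Column, the method names are invented for illustration, and it assumes
column timestamps are epoch milliseconds, which the thread does not confirm.

```java
// Hypothetical sketch only; these are not Cassandra's actual classes.
public class TtlSketch {

    /** Simplified stand-in for a column with a write timestamp and a delete flag. */
    public static final class SketchColumn {
        final long timestampMillis;       // write time, ASSUMED to be epoch milliseconds
        final boolean deleteFlag;         // the existing "marked for delete" flag
        final int explicitExpirySeconds;  // explicit option: expiry in epoch seconds, 0 = never

        public SketchColumn(long timestampMillis, boolean deleteFlag, int explicitExpirySeconds) {
            this.timestampMillis = timestampMillis;
            this.deleteFlag = deleteFlag;
            this.explicitExpirySeconds = explicitExpirySeconds;
        }

        /** Dynamic option: no extra bytes per column; expiry is derived from the timestamp. */
        public boolean isMarkedForDeleteDynamic(long nowMillis, long ttlMillis) {
            return deleteFlag || nowMillis - timestampMillis >= ttlMillis;
        }

        /** Explicit option: costs 4 extra bytes per column, but each column can expire on its own schedule. */
        public boolean isMarkedForDeleteExplicit(long nowMillis) {
            return deleteFlag
                || (explicitExpirySeconds != 0 && nowMillis / 1000 >= explicitExpirySeconds);
        }
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long oneHourMillis = 60L * 60L * 1000L;

        // Written two hours ago, with an explicit expiry one hour in the future.
        SketchColumn column = new SketchColumn(
            now - 2 * oneHourMillis, false, (int) ((now + oneHourMillis) / 1000));

        System.out.println("dynamic, 1h TTL: " + column.isMarkedForDeleteDynamic(now, oneHourMillis));
        System.out.println("explicit expiry: " + column.isMarkedForDeleteExplicit(now));
    }
}
```

In this sketch the dynamic variant needs the TTL to come from somewhere
outside the column (for example a per-column-family setting, which is also
an assumption), while the explicit variant carries the extra int discussed
above. Either way, expired columns would still occupy disk until compaction
removes them.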
