Re: Cassandra and TTL

Jonathan Ellis Wed, 13 Jan 2010 15:20:47 -0800

If he needs column-level granularity then I don't see any other option.

If he needs CF-level granularity then truncate will work fine. :)


On Wed, Jan 13, 2010 at 5:16 PM, Kelvin Kakugawa <kakug...@gmail.com> wrote:
> Are you thinking about storing the expiration time explicitly?  Or,
> would it be reasonable to calculate it dynamically?
>
> -Kelvin
>
> On Wed, Jan 13, 2010 at 1:01 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
>> I think that is more or less what Sylvain is proposing.  The main
>> downside is adding the extra 8 bytes for a long (or 4 for an int,
>> which should actually be plenty of resolution for this use case) to
>> each Column object.
>>
>> On Wed, Jan 13, 2010 at 4:57 PM, Kelvin Kakugawa <kakug...@gmail.com> wrote:
>>> An alternative implementation that may be worth exploring would be to
>>> modify IColumn's isMarkedForDelete() method to check TTL.
>>>
>>> It probably wouldn't be as performant as straight dropping SSTables.
>>> You'd probably also need to periodically compact old tables to remove
>>> expired rows.  However, on the surface, it appears to be a more
>>> seamless and fine-grained approach to this problem.
>>>
>>> -Kelvin
>>>
>>> A little more background:
>>> db.IColumn is the shared interface that db.Column and db.SuperColumn
>>> implement.  db.Column's isMarkedForDelete() method only checks if a
>>> flag has been set, right now.  So, it would be relatively
>>> straightforward to slip some logic into that method to check if its
>>> timestamp has expired beyond some TTL.
>>>
>>> However, I suspect that there may be other methods that may need to be
>>> slightly modified, as well.  And, the compaction code would have to be
>>> inspected to make sure that old tables are periodically compacted to
>>> remove expired rows.
>>>
>>> On Wed, Jan 13, 2010 at 12:30 PM, Mark Robson <mar...@gmail.com> wrote:
>>>> I also agree: Some mechanism to expire rolling data would be really good if
>>>> we can incorporate it. Using the existing client interface, deleting old
>>>> data is very cumbersome.
>>>>
>>>> We want to store lots of audit data in Cassandra, and this will need
>>>> to be expired eventually.
>>>>
>>>> Nodes should be able to do expiry locally without needing to talk to other
>>>> nodes in the cluster. As we have a timestamp on everything anyway, can we
>>>> not use that somehow?
>>>>
>>>> If we only ever append data rather than update it (or update it very
>>>> rarely), can we somehow store timestamp ranges in each sstable file and
>>>> then have the server know when it's time to expire one?
>>>>
>>>> I'm guessing here from my limited understanding of how Cassandra works.
>>>>
>>>> Mark
>>>>
>>>
>>
>
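
For illustration, the column-level idea quoted above (having isMarkedForDelete() also check a TTL) might look roughly like the sketch below. This is not Cassandra's actual code: TtlColumn is a hypothetical, simplified stand-in for db.Column, the ttlSeconds field is the extra 4-byte int mentioned in the thread, and timestamps are assumed to be in milliseconds. Expiry is computed dynamically from the write timestamp rather than stored as an explicit expiration time.

```java
// Hypothetical, simplified stand-in for db.Column (not the real class).
final class TtlColumn {
    private final String name;
    private final long timestampMillis; // write timestamp; milliseconds is an assumption
    private final int ttlSeconds;       // assumed extra field; 0 or less means "never expires"
    private final boolean deleted;      // the existing "marked for delete" flag

    TtlColumn(String name, long timestampMillis, int ttlSeconds, boolean deleted) {
        this.name = name;
        this.timestampMillis = timestampMillis;
        this.ttlSeconds = ttlSeconds;
        this.deleted = deleted;
    }

    String name() {
        return name;
    }

    // Treats a column as deleted if the flag is set OR its TTL has elapsed.
    // "now" is passed in so each node can evaluate expiry locally.
    boolean isMarkedForDelete(long nowMillis) {
        if (deleted) {
            return true;
        }
        if (ttlSeconds <= 0) {
            return false; // no TTL configured for this column
        }
        return nowMillis - timestampMillis >= ttlSeconds * 1000L;
    }
}
```

Expired columns would still occupy disk space until a compaction rewrites the SSTable that holds them, which is why the thread notes that old tables would need to be compacted periodically.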

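The coarser alternative raised in the thread, storing a timestamp range per sstable so a node can decide locally when a whole file has expired, could be sketched as follows. SSTableTimestampRange is a hypothetical helper, not an existing Cassandra class, and the sketch assumes millisecond timestamps and append-only data (if older data can be updated later, dropping a whole file this way would not be safe).

```java
// Hypothetical per-sstable metadata: the range of column timestamps it contains.
final class SSTableTimestampRange {
    private long minTimestampMillis = Long.MAX_VALUE;
    private long maxTimestampMillis = Long.MIN_VALUE;

    // Called once for every column written into the sstable.
    void record(long timestampMillis) {
        minTimestampMillis = Math.min(minTimestampMillis, timestampMillis);
        maxTimestampMillis = Math.max(maxTimestampMillis, timestampMillis);
    }

    boolean isEmpty() {
        return maxTimestampMillis == Long.MIN_VALUE;
    }

    // The whole file can be dropped only when even its newest column
    // is older than the TTL; no other node needs to be consulted.
    boolean isFullyExpired(long nowMillis, long ttlMillis) {
        if (isEmpty()) {
            return false; // nothing recorded, so nothing to expire
        }
        return nowMillis - maxTimestampMillis >= ttlMillis;
    }
}
```

This gives only file-level granularity: one recent column keeps the entire sstable alive, which is the trade-off against the column-level check discussed earlier in the thread.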