Re: Insert with both TTL and timestamp behavior

Jeff Jirsa Fri, 30 Dec 2016 20:37:06 -0800

Your last sentence is correct - TWCS and DTCS add meaning (date/timestamp) to 
the long writetime that the rest of Cassandra ignores. If you're trying to 
backload data, you'll need to calculate the TTL yourself per write, just as 
you calculate the writetime.


The TTL behavior doesn't consider the client-provided writetime at all; it's 
based on a delta from system time at the time of write.
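
A minimal sketch of that per-write calculation, in Python. The keyspace, 
table, and column names in the statement are placeholders, the helper names 
are made up for illustration, and the retention period is an assumption about 
the import job; none of them come from this thread. The idea is that the TTL 
you pass is whatever remains of the retention period measured from now, while 
the writetime comes straight from the record's created-on value.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Hypothetical statement: keyspace, table, and columns are placeholders.
INSERT_CQL = (
    "INSERT INTO my_ks.events (id, created_on, payload) "
    "VALUES (?, ?, ?) USING TTL ? AND TIMESTAMP ?"
)


def writetime_micros(created_on: datetime) -> int:
    """Convert a timezone-aware datetime to microseconds since the epoch,
    the unit that USING TIMESTAMP expects."""
    return (created_on - EPOCH) // timedelta(microseconds=1)


def remaining_ttl_seconds(
    created_on: datetime,
    retention: timedelta,
    now: Optional[datetime] = None,
) -> Optional[int]:
    """Seconds left until created_on + retention, measured from `now`.

    Returns None when the record would already be expired, so the caller
    can skip it. A TTL of 0 is not returned, because 0 means "no expiry".
    """
    if now is None:
        now = datetime.now(timezone.utc)
    remaining = int((created_on + retention - now).total_seconds())
    return remaining if remaining > 0 else None
```

For each historical record you would bind remaining_ttl_seconds(...) to the 
TTL marker and writetime_micros(...) to the TIMESTAMP marker, and skip the 
insert entirely when the TTL comes back as None.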

The fact that Cassandra doesn't try to force meaning on writetime actually 
enables some great (but dangerous) data models where you use the writetime as 
the value for leaderboards and similar. They are dangerous, so don't try this 
unless you actually understand why it works.


-- 
Jeff Jirsa


> On Dec 28, 2016, at 1:15 PM, Voytek Jarnot <voytek.jar...@gmail.com> wrote:
> 
> >It's not clear to me why for your use case you would want to manipulate the 
> >timestamps as you're loading the records unless you're concerned about 
> >conflicting writes getting applied in the correct order.
> 
> Simple use-case: want to load historical data, want to use TWCS, want to use 
> TTL.
> 
> Scenario:
> Importing data using standard write path (inserts)
> Using timestamp to give TWCS something to work with (import records contain a 
> created-on timestamp from which I populate "using timestamp")
> Need records to expire according to TTL
> Don't want to calculate TTL for every insert individually (obviously what I 
> want and what I get differ)
> I'm importing in chrono order, so TWCS should be able to keep things from 
> getting out of hand.
> 
> >I think in general timestamp manipulation is caveat utilitor.
> 
> Yeah; although I'd probably choose stronger words. TWCS (and perhaps DTCS?) 
> appears to treat writetimes as timestamps; the rest of Cassandra appears to 
> treat them as integers.
> 
> 
>> On Wed, Dec 28, 2016 at 2:50 PM, Eric Stevens <migh...@gmail.com> wrote:
>> The purpose of timestamps is to guarantee out-of-order conflicting writes 
>> are resolved as last-write-wins.  Cassandra doesn't really expect you to be 
>> writing timestamps with wide variations from record to record.  Indeed, if 
>> you're doing this, it'll violate some of the assumptions in places such as 
>> time windowed / date tiered compaction.  It's possible to dodge those 
>> landmines but it would be hard to know if you got it wrong.
>> 
>> I think in general timestamp manipulation is caveat utilitor.  It's not 
>> clear to me why for your use case you would want to manipulate the 
>> timestamps as you're loading the records unless you're concerned about 
>> conflicting writes getting applied in the correct order. 
>> 
>> Probably worth a footnote in the documentation indicating that if you're 
>> doing both USING TTL and USING TIMESTAMP, those don't relate to each 
>> other.  At rest, TTL'd records get written with an expiration timestamp, 
>> not a delta from the writetime.
>> 
>>> On Wed, Dec 28, 2016 at 9:38 AM Voytek Jarnot <voytek.jar...@gmail.com> 
>>> wrote:
>>> It appears as though, when inserting with "using ttl [foo] and timestamp 
>>> [bar]" that the TTL does not take the provided timestamp into account.
>>> 
>>> In other words, the TTL starts at insert time, not at the time specified by 
>>> the timestamp.
>>> 
>>> Similarly, if inserting with just "using timestamp [bar]" and relying on 
>>> the table's default_time_to_live property, the timestamp is again ignored 
>>> in terms of TTL expiration.
>>> 
>>> Seems like a bug to me, but I'm guessing this is intended behavior?
>>> 
>>> Use-case is importing data (some of it historical) and setting the 
>>> timestamp manually (based on a timestamp within the data itself). Anyone 
>>> familiar with any work-arounds that don't rely on calculating a TTL 
>>> client-side for each record?
>
