Re: Nondeterministic outcome based on cell TTL and major compaction event order

Michael Segel Sat, 18 Apr 2015 05:05:02 -0700

I said barring max versions… (in an earlier post on the thread.) 

> On Apr 17, 2015, at 6:52 PM, Sean Busbey <[email protected]> wrote:
> 
> If you have max versions set to 1 (the default), then c1 should be removed
> at compaction time if c2 still exists then.
> 
> -- 
> Sean
> On Apr 17, 2015 6:41 PM, "Michael Segel" <[email protected]> wrote:
> 
>> Ok,
>> So then if you have a previous cell (c1) and you insert a new cell c2 that
>> has a TTL of let’s say 5 mins, then c1 should always exist?
>> That is my understanding but from Cosmin’s post, he’s saying it’s
>> different.  And that’s why I don’t understand.  You couldn’t lose the cell
>> c1 at all.
>> Compaction or no compaction.
>> 
>> That’s why I’m confused.  Current behavior doesn’t match the expected
>> contract.
>> 
>> -Mike
>> 
>>> On Apr 17, 2015, at 4:37 PM, Andrew Purtell <[email protected]> wrote:
>>> 
>>> The way TTLs work today is they define the interval of time a cell
>>> exists - exactly as that. There is no tombstone laid like a normal
>>> delete. Once the TTL elapses the cell just ceases to exist to normal
>>> scanners. The interaction of expired cells, multiple versions, minimum
>>> versions, raw scanners, etc. can be confusing. We can absolutely
>>> revisit this.
>>> 
>>> A cell with an expired TTL could be treated as the combination of
>>> tombstone and the most recent value it lays over. This is not how the
>>> implementation works today, but could be changed for an upcoming major
>>> version like 2.0 if there's consensus to do it.
>>> 
>>> 
>>>> On Apr 10, 2015, at 7:26 AM, Cosmin Lehene <[email protected]> wrote:
>>>> 
>>>> I was initially puzzled by this, although I realize it's likely
>>>> as designed.
>>>> 
>>>> 
>>>> The relative order of cell TTL expiration and compaction events can lead
>>>> to either some (the older) data being left or no data at all for a
>>>> particular (row, family, qualifier, ts) coordinate.
>>>> 
>>>> 
>>>> 
>>>> Write (r1, f1, q1, v1, 1)
>>>> 
>>>> Write (r1, f1, q1, v1, 2) - TTL=1 minute
>>>> 
>>>> 
>>>> Scenario 1:
>>>> 
>>>> 
>>>> If a major compaction happens within a minute
>>>> 
>>>> 
>>>> it will remove (r1, f1, q1, v1, 1)
>>>> 
>>>> then after a minute (r1, f1, q1, v1, 2) will expire
>>>> 
>>>> no data left
>>>> 
>>>> 
>>>> Scenario 2:
>>>> 
>>>> 
>>>> A minute passes
>>>> 
>>>> (r1, f1, q1, v1, 2) expires
>>>> 
>>>> Compaction runs..
>>>> 
>>>> (r1, f1, q1, v1, 1) remains
>>>> 
>>>> 
>>>> 
>>>> This seems, by and large, expected behavior, but it still seems
>>>> "uncomfortable" that the (overall) outcome is not decided by me, but by a
>>>> chance of event ordering.
>>>> 
>>>> 
>>>> I wonder if we'd want this to behave differently (perhaps it has been
>>>> discussed already), but if not, it's worth more detailed documentation in
>>>> the book.
>>>> 
>>>> 
>>>> What do you think?
>>>> 
>>>> 
>>>> Cosmin
>>>> 
>>>> 
>>>> 
>>>> 
>>> 
>>> --
>>> Best regards,
>>> 
>>>  - Andy
>>> 
>>> Problems worthy of attack prove their worth by hitting back. - Piet
>>> Hein (via Tom White)
>>> 
>> 
>> The opinions expressed here are mine, while they may reflect a cognitive
>> thought, that is purely accidental.
>> Use at your own risk.
>> Michael Segel
>> michael_segel (AT) hotmail.com
>> 
>> 
>> 
>> 
>> 
>>


The opinions expressed here are mine, while they may reflect a cognitive 
thought, that is purely accidental. 
Use at your own risk. 
Michael Segel
michael_segel (AT) hotmail.com
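
For readers following the thread, the two scenarios from Cosmin's original post can be reproduced with a toy model. This is a minimal sketch and not HBase code: `Cell`, `major_compact`, and `scan` are hypothetical names, max versions is assumed to be 1 (the default Sean mentions), and expiry is measured from an assumed wall-clock write time in seconds rather than from the cell timestamp.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cell:
    ts: int                      # version timestamp (1 and 2 in the thread)
    value: str
    ttl: Optional[float] = None  # seconds; None means the cell never expires
    written_at: float = 0.0      # assumed wall-clock write time, in seconds

    def expired(self, now: float) -> bool:
        return self.ttl is not None and now >= self.written_at + self.ttl


def major_compact(cells: List[Cell], now: float, max_versions: int = 1) -> List[Cell]:
    """Drop expired cells, then keep only the newest max_versions cells."""
    live = [c for c in cells if not c.expired(now)]
    live.sort(key=lambda c: c.ts, reverse=True)
    return live[:max_versions]


def scan(cells: List[Cell], now: float) -> Optional[str]:
    """Return the newest non-expired value, as a normal scanner would see it."""
    live = [c for c in cells if not c.expired(now)]
    return max(live, key=lambda c: c.ts).value if live else None


def initial_cells() -> List[Cell]:
    # Write (r1, f1, q1, v1, 1) and Write (r1, f1, q1, v1, 2) with TTL = 1 minute.
    return [Cell(ts=1, value="v1"), Cell(ts=2, value="v1", ttl=60.0)]


def scenario_1() -> Optional[str]:
    # Major compaction runs within the minute, then the TTL elapses.
    cells = major_compact(initial_cells(), now=30.0)
    return scan(cells, now=90.0)


def scenario_2() -> Optional[str]:
    # The TTL elapses first, then major compaction runs.
    cells = major_compact(initial_cells(), now=90.0)
    return scan(cells, now=90.0)
```

In this model `scenario_1()` finds no data, because compaction discards the older cell while the newer one is still live, and `scenario_2()` still sees the older value, because the expired cell is dropped before versions are counted. No tombstone is involved in either case, which matches Andrew's description of how TTLs work today.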
