Re: Documentation about TTL and tombstones

2024-03-18 Thread Sebastian Marsching

> It's actually correct to do it how it is today.
> Insertion date does not matter, what matters is the time after tombstones are 
> supposed to be deleted.
> If the delete got to all nodes, sure, no problem, but if any of the nodes 
> didn't get the delete, and you would get rid of the tombstones before running 
> a repair, you might have nodes that still have that data.
> Then following a repair, that data will be copied to other replicas, and that 
> data you thought you deleted, will be brought back to life.

Sure, for regular data that does not have a TTL, this makes sense. But I claim 
that data with a TTL is deleted when it is inserted. It’s just that this delete 
only becomes effective at some future date.

In order to understand whether data might reappear, we have to consider six 
cases. Let us first consider the three cases where the INSERT / UPDATE did not 
overwrite any existing data that would have lived longer than the new data:

1. Let us assume that the data is successfully written to all nodes and no 
repair is run. After the TTL expires, the data turns into a tombstone, but 
because the data was present on all nodes, the tombstone is present on all 
nodes, so there is no risk of data reappearing.

2. Let us assume that this data is not written to all nodes but a repair is run 
within the TTL. After that, we effectively have the first situation, so there 
is no risk of data reappearing.

3. Let us assume that this data is not written to all nodes and no repair is 
run within the TTL. After the TTL has passed, the data expires on the nodes 
where it has been written. Now, we have tombstones on these nodes. If we get 
rid of the tombstones, there is no risk of the data reappearing, because there 
are no nodes that have the data, so even if we run a repair in the future, 
there is no risk that the data magically reappears.

Now, let us consider the cases where the INSERT / UPDATE overwrote existing 
data that either had no TTL or would have expired later than the newly 
inserted data. Again, there are three possible scenarios:

4. Let us assume that the data is successfully written to all nodes and no 
repair is run. After the TTL expires, the data turns into a tombstone, but 
because the data was present on all nodes, the tombstone is present on all 
nodes, so there is no risk of data reappearing.

5. Let us assume that this data is not written to all nodes but a repair is run 
within the TTL. After that, we effectively have scenario 4, so there is no risk 
of data reappearing.

6. Let us assume that this data is not written to all nodes and no repair is 
run within the TTL. After the TTL has passed, the data expires on the nodes 
where it has been written. Now, we have tombstones on these nodes. If we get 
rid of the tombstones, there is the risk of the data reappearing, because the 
older data that was overwritten by the INSERT / UPDATE might still exist on 
some nodes, and as the data with the TTL never made it to these nodes, there is 
no tombstone on these nodes and thus the older data can reappear.

So, we only have to worry about the last scenario. In this scenario, we have to 
ensure that either the inserted data with the TTL is repaired (which brings us 
back to scenario 5), or that the tombstones are repaired before they are 
discarded.

This is why I claim that for data with a TTL, gc_grace_seconds should 
effectively start when the data is inserted, not when it is converted into a 
tombstone: It does not matter whether the data with the TTL is repaired or the 
tombstone is repaired. As long as either of these things happens between the 
data with the TTL being inserted and the tombstone being reclaimed, there is no 
risk of deleted or overwritten data reappearing.
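
The scenarios above can be played through with a toy model. The following 
Python sketch is only an illustration under simplifying assumptions: two 
replicas, a single key, last-write-wins by write timestamp, a "repair" that 
simply copies the newest cell to every replica, and a purge that drops expired 
cells right away (as if gc_grace_seconds had already passed). All names (Cell, 
repair, purge_expired, run_scenario) are made up for this example and are not 
Cassandra APIs.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Cell:
    value: str
    write_time: int            # write timestamp in arbitrary time units
    ttl: Optional[int] = None  # None means the cell never expires

    def expired(self, now: int) -> bool:
        return self.ttl is not None and now >= self.write_time + self.ttl


Replica = Dict[str, Cell]  # key -> newest cell this replica knows about


def write(replica: Replica, key: str, cell: Cell) -> None:
    """Last write wins, decided by the write timestamp."""
    current = replica.get(key)
    if current is None or cell.write_time > current.write_time:
        replica[key] = cell


def repair(replicas: List[Replica]) -> None:
    """Toy repair: every replica receives the newest cell for each key."""
    for key in {k for r in replicas for k in r}:
        newest = max((r[key] for r in replicas if key in r),
                     key=lambda c: c.write_time)
        for r in replicas:
            r[key] = newest


def purge_expired(replica: Replica, now: int) -> None:
    """Drop expired cells entirely, i.e. discard their tombstones."""
    for key in [k for k, c in replica.items() if c.expired(now)]:
        del replica[key]


def read(replica: Replica, key: str, now: int) -> Optional[str]:
    cell = replica.get(key)
    return None if cell is None or cell.expired(now) else cell.value


def run_scenario(older_data: bool, repair_before_purge: bool) -> Optional[str]:
    a: Replica = {}
    b: Replica = {}
    if older_data:
        # Older data without a TTL exists on both replicas.
        for r in (a, b):
            write(r, "k", Cell("old", write_time=0))
    # The write with a TTL only reaches replica a.
    write(a, "k", Cell("new", write_time=10, ttl=5))
    now = 20  # the TTL has expired by now
    if repair_before_purge:
        repair([a, b])
    purge_expired(a, now)
    purge_expired(b, now)
    repair([a, b])  # a later repair, after the tombstones are gone
    return read(a, "k", now)


print(run_scenario(older_data=False, repair_before_purge=False))  # scenario 3
print(run_scenario(older_data=True, repair_before_purge=False))   # scenario 6
print(run_scenario(older_data=True, repair_before_purge=True))    # repaired first
```

In this model only the second call returns the old value again. Whether the 
repair before the purge happens while the data is still live or after it has 
expired makes no difference to the outcome, which is the point made above.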





Re: Documentation about TTL and tombstones

2024-03-17 Thread Gil Ganz
It's actually correct to do it how it is today.
Insertion date does not matter, what matters is the time after tombstones
are supposed to be deleted.
If the delete got to all nodes, sure, no problem, but if any of the nodes
didn't get the delete, and you would get rid of the tombstones before
running a repair, you might have nodes that still have that data.
Then following a repair, that data will be copied to other replicas, and
that data you thought you deleted, will be brought back to life.

On Sat, Mar 16, 2024 at 5:39 PM Sebastian Marsching wrote:

> > That's not how gc_grace_seconds works.
> > gc_grace_seconds controls how long a tombstone must be kept *after* it is
> > created before it can actually be deleted, in order to give you enough
> > time to run repairs.
> >
> > Say you have data that is about to expire on March 16 8am, and
> > gc_grace_seconds is 10 days.
> > After Mar 16 8am that data will be a tombstone, and only after March 26
> > 8am, a compaction *might* remove it, if all other conditions are met.
>
> You are right. I do not understand why it is implemented this way, but you
> are 100 % correct that it works this way.
>
> I thought that gc_grace_seconds is all about being able to repair the
> table before tombstones are removed, so that deleted data cannot reappear.
> But when the data has a TTL, it should not matter whether the original data
> or the tombstone is synchronized as part of the repair process. After all,
> the original data should turn into a tombstone, so if it was present on all
> nodes, there is no risk of deleted data reappearing. Therefore, I think it
> would make more sense to start gc_grace_seconds when the data is inserted /
> updated. I don’t know why it was not implemented this way.
>
>


Re: Documentation about TTL and tombstones

2024-03-16 Thread Sebastian Marsching

> That's not how gc_grace_seconds works.
> gc_grace_seconds controls how long a tombstone must be kept *after* it is 
> created before it can actually be deleted, in order to give you enough time 
> to run repairs.
>
> Say you have data that is about to expire on March 16 8am, and 
> gc_grace_seconds is 10 days.
> After Mar 16 8am that data will be a tombstone, and only after March 26 8am, 
> a compaction *might* remove it, if all other conditions are met.

You are right. I do not understand why it is implemented this way, but you are 
100 % correct that it works this way.

I thought that gc_grace_seconds is all about being able to repair the table 
before tombstones are removed, so that deleted data cannot reappear. But when 
the data has a TTL, it should not matter whether the original data or the 
tombstone is synchronized as part of the repair process. After all, the 
original data should turn into a tombstone, so if it was present on all nodes, 
there is no risk of deleted data reappearing. Therefore, I think it would make 
more sense to start gc_grace_seconds when the data is inserted / updated. I 
don’t know why it was not implemented this way.
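
To make the difference concrete, here is a small Python sketch comparing the 
earliest time at which the tombstone could be purged under the behavior 
described in the quoted message and under the alternative suggested above. The 
function names are hypothetical, and the second formula is only one possible 
reading of "start gc_grace_seconds when the data is inserted" (the data must 
still have expired before anything can be purged); it is not something 
Cassandra implements.

```python
DAY = 24 * 60 * 60  # seconds per day


def purge_time_current(insert_time: int, ttl: int, gc_grace_seconds: int) -> int:
    """Grace period starts when the TTL expires (behavior described above)."""
    return insert_time + ttl + gc_grace_seconds


def purge_time_proposed(insert_time: int, ttl: int, gc_grace_seconds: int) -> int:
    """Assumed alternative: grace period starts at insertion time, but the
    data can never be purged before its TTL has expired."""
    return max(insert_time + ttl, insert_time + gc_grace_seconds)


# TTL of 3 days, gc_grace_seconds of 10 days, inserted at time 0.
print(purge_time_current(0, 3 * DAY, 10 * DAY) // DAY)   # 13
print(purge_time_proposed(0, 3 * DAY, 10 * DAY) // DAY)  # 10
```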





Re: Documentation about TTL and tombstones

2024-03-16 Thread Gil Ganz
That's not how gc_grace_seconds works.
gc_grace_seconds controls how long a tombstone must be kept *after* it is
created before it can actually be deleted, in order to give you enough time
to run repairs.

Say you have data that is about to expire on March 16 8am, and
gc_grace_seconds is 10 days.
After Mar 16 8am that data will be a tombstone, and only after March 26
8am, a compaction *might* remove it, if all other conditions are met.
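
The arithmetic of this example can be written down as a short Python sketch. 
The function name is made up for illustration, and the year 2024 is assumed 
from the date of this thread. It only computes the earliest point in time at 
which a compaction could drop the tombstone, not when that actually happens.

```python
from datetime import datetime, timedelta


def earliest_purge_time(expires_at: datetime, gc_grace_seconds: int) -> datetime:
    """Earliest time at which a compaction may drop the tombstone."""
    return expires_at + timedelta(seconds=gc_grace_seconds)


expires_at = datetime(2024, 3, 16, 8, 0)  # the TTL expires on March 16, 8am
gc_grace_seconds = 10 * 24 * 60 * 60      # 10 days, expressed in seconds

print(earliest_purge_time(expires_at, gc_grace_seconds))  # 2024-03-26 08:00:00
```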
gil


On Fri, Mar 15, 2024 at 12:58 AM Sebastian Marsching <
sebast...@marsching.com> wrote:

> > While reading the documentation about TTL
> > https://cassandra.apache.org/doc/4.1/cassandra/operating/compaction/index.html#ttl
> > I noticed that it mentions that a tombstone is created when data expires.
> > How is that possible without writing a tombstone to the table? I thought
> > TTL doesn't create tombstones, since the TTL is kept together with the
> > write timestamp at the row level.
>
>
> If you read carefully, you will notice that no tombstone is created and
> instead the data is *converted* into a tombstone. So, after the TTL has
> expired, the inserted data effectively acts as a tombstone. This is needed,
> because the now expired data might hide older data that has not expired
> yet. If the newer data was simply dropped after the TTL expired, older data
> might reappear.
>
> If I understand it correctly, you can avoid data with a TTL being
> converted into a tombstone by choosing a TTL that is greater than
> gc_grace_seconds. Technically, the data is still going to be converted into
> a tombstone when the TTL expires, but this tombstone will immediately be
> eligible for garbage collection.
>
>


Re: Documentation about TTL and tombstones

2024-03-14 Thread Sebastian Marsching

> While reading the documentation about TTL
> https://cassandra.apache.org/doc/4.1/cassandra/operating/compaction/index.html#ttl
> I noticed that it mentions that a tombstone is created when data expires. 
> How is that possible without writing a tombstone to the table? I thought 
> TTL doesn't create tombstones, since the TTL is kept together with the 
> write timestamp at the row level.

If you read carefully, you will notice that no tombstone is created and instead 
the data is *converted* into a tombstone. So, after the TTL has expired, the 
inserted data effectively acts as a tombstone. This is needed, because the now 
expired data might hide older data that has not expired yet. If the newer data 
was simply dropped after the TTL expired, older data might reappear.
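
As a rough illustration of why the expired data has to keep acting as a 
tombstone, consider the following Python sketch. It is a deliberately 
simplified model (cells for a single key, reconciled by write timestamp only), 
and the names Cell, read_newest and read_after_dropping_expired are invented 
for this example rather than taken from Cassandra.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cell:
    value: str
    write_time: int            # write timestamp in arbitrary time units
    ttl: Optional[int] = None  # None means the cell never expires

    def expired(self, now: int) -> bool:
        return self.ttl is not None and now >= self.write_time + self.ttl


def read_newest(cells: List[Cell], now: int) -> Optional[str]:
    """The newest cell wins; if it has expired, it hides everything older."""
    if not cells:
        return None
    newest = max(cells, key=lambda c: c.write_time)
    return None if newest.expired(now) else newest.value


def read_after_dropping_expired(cells: List[Cell], now: int) -> Optional[str]:
    """What would happen if expired cells were simply thrown away."""
    live = [c for c in cells if not c.expired(now)]
    return read_newest(live, now)


older = Cell("old", write_time=0, ttl=100)  # expires at time 100
newer = Cell("new", write_time=10, ttl=5)   # expires at time 15

print(read_newest([older, newer], now=20))                  # None
print(read_after_dropping_expired([older, newer], now=20))  # old
```

In the second call the older value becomes visible again, although it was 
overwritten at time 10.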

If I understand it correctly, you can avoid data with a TTL being converted 
into a tombstone by choosing a TTL that is greater than gc_grace_seconds. 
Technically, the data is still going to be converted into a tombstone when the 
TTL expires, but this tombstone will immediately be eligible for garbage 
collection.





Documentation about TTL and tombstones

2024-03-14 Thread Jean Carlo
Hello community,

While reading the documentation about TTL
https://cassandra.apache.org/doc/4.1/cassandra/operating/compaction/index.html#ttl
I noticed that it mentions that a tombstone is created when data expires.
How is that possible without writing a tombstone to the table? I thought
TTL doesn't create tombstones, since the TTL is kept together with the
write timestamp at the row level.
Greetings

Jean Carlo

"The best way to predict the future is to invent it" Alan Kay