[ 
https://issues.apache.org/jira/browse/CASSANDRA-13819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16164463#comment-16164463
 ] 

Kurt Greaves commented on CASSANDRA-13819:
------------------------------------------

This is certainly the intended behaviour, but you're right: if it's not 
explicitly documented already, it should be. I've seen people misled by this 
before as well.
It would probably be worthwhile to note the behaviour and also link to a more 
in-depth run-through of timestamps/USING TIMESTAMP in general, probably with 
some examples.
Is this something you would be willing to contribute?

> Surprising under-documented behavior with DELETE...USING TIMESTAMP
> ------------------------------------------------------------------
>
>                 Key: CASSANDRA-13819
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13819
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Eric Wolak
>            Priority: Minor
>
> While investigating differences between various Bigtable derivatives, I’ve 
> run into an odd behavior of Cassandra. I’m guessing this is intended 
> behavior, but it's surprising enough to me that I think it should be 
> explicitly documented.
> Let’s say I have a sensor device reporting data with timestamps. It has a 
> great clock, so I use its timestamps in a USING TIMESTAMP clause in my INSERT 
> statements. One day Jeff realizes that we had a hardware bug with the sensor, 
> and data before timestamp T is incorrect. He issues a DELETE...USING 
> TIMESTAMP T to remove the old data. In the meantime, Sam figures out a way to 
> backfill the data, and she writes a job to insert corrected data into the 
> same table. In keeping with the schema, her job issues INSERT...USING 
> TIMESTAMP statements, with timestamps before T (because that’s the time the 
> data points correspond to). When testing her job, Sam discovers that the 
> backfilled data isn’t appearing in the database! In fact, there’s no way for 
> her to insert data with a TIMESTAMP <= T, because the tombstone written by 
> Jeff several days ago is masking them. How can Sam backfill the corrected 
> data?
> This behavior seems to match the HBase “Current Limitation” that Deletes Mask 
> Puts, documented at http://hbase.apache.org/book.html#_deletes_mask_puts. 
> Should the Cassandra docs also explicitly call out this behavior?
> Related:
> http://thelastpickle.com/blog/2016/07/27/about-deletes-and-tombstones.html
> https://docs.datastax.com/en/cassandra/3.0/cassandra/dml/dmlAboutDeletes.html
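
To illustrate the masking described in the issue above, here is a minimal
sketch in Python. It is a toy model of last-write-wins reconciliation, not
Cassandra code: the Row class, its fields, and the timestamp values are all
hypothetical, and it assumes (as the description states) that a tombstone at
timestamp T hides any write with a timestamp <= T.

```python
# Toy model of timestamp-based reconciliation; NOT Cassandra's implementation.
from dataclasses import dataclass
from typing import Optional


@dataclass
class Row:
    """Hypothetical single-row store: one value and one tombstone."""
    value: Optional[str] = None
    write_ts: int = -1       # timestamp of the newest write kept so far
    tombstone_ts: int = -1   # timestamp of the newest delete seen so far

    def insert(self, value: str, ts: int) -> None:
        # Keep only the write with the highest timestamp (ties are ignored here).
        if ts > self.write_ts:
            self.value, self.write_ts = value, ts

    def delete(self, ts: int) -> None:
        # A delete records a tombstone; it does not erase data immediately.
        self.tombstone_ts = max(self.tombstone_ts, ts)

    def read(self) -> Optional[str]:
        # A write is visible only if it is strictly newer than the tombstone.
        if self.value is not None and self.write_ts > self.tombstone_ts:
            return self.value
        return None


T = 1000  # assumed cut-off timestamp, standing in for T in the description

row = Row()
row.insert("bad reading", ts=900)          # original sensor data
row.delete(ts=T)                           # Jeff: DELETE ... USING TIMESTAMP T
row.insert("corrected reading", ts=900)    # Sam: backfill at the original time
masked = row.read()                        # None: the tombstone still wins

row.insert("corrected reading", ts=T + 1)  # a write newer than the tombstone
visible = row.read()                       # "corrected reading"
```

In this model the backfill only becomes visible once it carries a timestamp
greater than T, which is why reusing the sensor's original timestamps cannot
work while the tombstone exists.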


