Hi Donald,

I reported the ticket you mentioned, so I kind of feel like I should answer this :-)
> I presume the point is that GCable tombstones can still do work
> (preventing spurious writing from nodes that were down) but only until the
> data is flushed to disk.

I am not sure I understand this correctly. Could you rephrase that sentence?

> If the effective TTL exceeds gc_grace_seconds then the tombstone will be
> deleted anyway.

It's not even written (since CASSANDRA-4917), so there is no tombstone to delete in that case.

> It occurred to me that if you never update the TTL of a column, then
> there should be no need for tombstones at all: any replicas will have the
> same TTL. So there'd be no risk of missed deletes. You wouldn't even need
> GCable tombstones

I think so too. There should be no need for a tombstone at all if the following conditions hold:

- column was not deleted manually, but timed out by itself
- column was not updated in the last gc_grace days

If I am not mistaken, the second point would even be necessary for CASSANDRA-4917 to be able to handle changing TTLs correctly: I think the current implementation might break if a column gets updated with a smaller TTL, or to be more precise when

(old.creationdate + old.ttl) < (new.creationdate + new.ttl) && new.ttl < gc_grace

Imho, for any further tombstone optimization to work, compaction would have to be smarter: I think it should be able to track max(old.creationdate + old.ttl, new.creationdate + new.ttl) when merging columns. I have no idea if that is possible though.

> So, if - and it's a big if - a table disallowed updates to TTL, then you
> could really optimize deletion of TTLed columns: you could do away with
> tombstones entirely. If a table allows updates to TTL then it's possible
> a different node will have the row without the TTL and the tombstone would
> be needed.

I am not sure I understand this. My "thrift" understanding of Cassandra is that you cannot update the TTL; you can only update an entire column. Also, each column has its own TTL.
There is no TTL on the row.

cheers,
Christian
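
P.S. To make the max() idea a bit more concrete, here is a rough Python sketch of what I mean by tracking the latest expiry when two versions of a column are merged. This is only an illustration, not how Cassandra's compaction actually works: the Column class, the max_expiry field and the merge() function are names I made up, and timestamps are plain seconds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """Hypothetical stand-in for an expiring column (not Cassandra's class)."""
    value: str
    creation: int        # write timestamp, in seconds
    ttl: int             # time to live, in seconds
    max_expiry: int = 0  # assumed extra field: latest expiry seen so far

    def expiry(self) -> int:
        return self.creation + self.ttl


def merge(old: Column, new: Column) -> Column:
    """Keep the most recently written version, but remember the latest
    expiry of any version that took part in the merge."""
    winner = new if new.creation >= old.creation else old
    latest = max(old.expiry(), old.max_expiry, new.expiry(), new.max_expiry)
    return Column(winner.value, winner.creation, winner.ttl, latest)


# An old version with a long TTL is overwritten by one with a short TTL.
old = Column("a", creation=0, ttl=1000)
new = Column("b", creation=100, ttl=50)
merged = merge(old, new)
```

The merged column carries the value and TTL of the newer write, while max_expiry still records that some replica may hold a version that lives longer. Whether compaction could afford to carry such a field around is exactly the part I am unsure about.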