Arjun Ashok created CASSANDRA-21356:
---------------------------------------
Summary: CursorBasedCompaction: ReusableLivenessInfo.isExpiring()
incorrectly returns true for tombstone cells, corrupting cursor-compacted
SSTable format and cell reconciliation
Key: CASSANDRA-21356
URL: https://issues.apache.org/jira/browse/CASSANDRA-21356
Project: Apache Cassandra
Issue Type: Bug
Components: Local/Compaction, Local/SSTable
Reporter: Arjun Ashok

When the cursor compaction path encounters a tombstone cell (a column deleted
via INSERT ... null or DELETE), it incorrectly sets IS_EXPIRING_MASK in
addition to IS_DELETED_MASK. This causes an extra TTL delta byte to be written
to Data.db for every tombstone cell. Any SSTable reader relying on the format
invariant that IS_DELETED_MASK and IS_EXPIRING_MASK are mutually exclusive will
misalign on all bytes following a tombstone cell in a cursor-compacted SSTable.
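
The effect on the cell layout can be sketched as follows. This is a minimal illustration, not Cassandra's serializer: the class name, the bit values assigned to the two masks, and the one-byte placeholders standing in for the variable-length header fields are all assumptions made for the example. It only shows that setting both flags on a tombstone adds one field the format does not expect.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical sketch; names other than the two masks are not from the report.
final class TombstoneFlagSketch {
    // Bit values are assumed for illustration.
    static final int IS_DELETED_MASK = 0x01;
    static final int IS_EXPIRING_MASK = 0x02;

    /** Writes a simplified cell header; every field is a one-byte placeholder. */
    static byte[] writeCellHeader(boolean isDeleted, boolean isExpiring) {
        int flags = 0;
        if (isDeleted) flags |= IS_DELETED_MASK;
        if (isExpiring) flags |= IS_EXPIRING_MASK;

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(flags);      // flags byte
        out.write(0x10);       // timestamp delta (placeholder)
        if (isDeleted || isExpiring)
            out.write(0x20);   // local deletion/expiration time delta (placeholder)
        if (isExpiring)
            out.write(0x30);   // TTL delta (placeholder), unexpected for a tombstone
        return out.toByteArray();
    }

    /** The invariant described above: the two flags must never be set together. */
    static boolean flagsAreValid(int flags) {
        return (flags & IS_DELETED_MASK) == 0 || (flags & IS_EXPIRING_MASK) == 0;
    }
}
```

In this sketch a correctly flagged tombstone occupies three bytes, while one that also carries IS_EXPIRING_MASK occupies four, so a reader that assumes the invariant would start parsing the next cell one byte early.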

The same root cause also corrupts cell reconciliation during merge. When two
SSTables hold the same column at the same timestamp, one as a tombstone and one
as an expiring cell, `resolveRegular()` uses `!isExpiring()` to identify
tombstones. Because of the bug, isExpiring() returned true for the tombstone as
well, so `!isExpiring()` was false for both cells. Both cells therefore
appeared identical to the tie-breaking logic, which fell through to comparing
localExpirationTime values. Since an expiring cell's localExpirationTime is a
future timestamp and a tombstone's is a past deletion timestamp, the expiring
cell always won, resurrecting an explicitly deleted column after compaction.
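
The tie-break described above can be sketched as follows. This is a simplified model, not the actual resolveRegular() code: the Cell class, the resolve() helper, the two predicates, and the constant values are hypothetical, and the sketch assumes both inputs are either tombstones or expiring cells.

```java
import java.util.function.Predicate;

// Hypothetical sketch of the tie-break; not Cassandra's implementation.
final class ReconcileSketch {
    // Constant values are assumed for illustration.
    static final int NO_TTL = 0;
    static final long NO_EXPIRATION_TIME = Long.MAX_VALUE;

    static final class Cell {
        final String kind;
        final long timestamp;
        final int ttl;
        final long localExpirationTime;

        Cell(String kind, long timestamp, int ttl, long localExpirationTime) {
            this.kind = kind;
            this.timestamp = timestamp;
            this.ttl = ttl;
            this.localExpirationTime = localExpirationTime;
        }
    }

    // The check as the report describes it before and after the fix.
    static final Predicate<Cell> BUGGY = c -> c.localExpirationTime != NO_EXPIRATION_TIME;
    static final Predicate<Cell> FIXED = c -> c.ttl != NO_TTL;

    /** Picks the winning cell; assumes both cells are tombstones or expiring. */
    static Cell resolve(Cell left, Cell right, Predicate<Cell> isExpiring) {
        if (left.timestamp != right.timestamp)
            return left.timestamp > right.timestamp ? left : right;

        // At equal timestamps a tombstone should beat an expiring cell.
        boolean leftIsTombstone = !isExpiring.test(left);
        boolean rightIsTombstone = !isExpiring.test(right);
        if (leftIsTombstone != rightIsTombstone)
            return leftIsTombstone ? left : right;

        // Otherwise fall through to the larger localExpirationTime.
        return left.localExpirationTime >= right.localExpirationTime ? left : right;
    }
}
```

With a tombstone deleted at time 1000 and an expiring cell due at time 5000, both written at the same timestamp, resolve() with BUGGY falls through to the last comparison and returns the expiring cell, while resolve() with FIXED returns the tombstone.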

The root cause is ReusableLivenessInfo.isExpiring(), which checked
localExpirationTime != NO_EXPIRATION_TIME instead of ttl != NO_TTL. Both
tombstone cells and expiring cells have a non-default localExpirationTime
(tombstones store the deletion timestamp there), so the check returned true for
both. This violates the LivenessInfo interface contract, which defines
isExpiring() as "whether the info has a ttl", and diverges from the canonical
implementation in AbstractCell.isExpiring(), which correctly checks ttl() !=
NO_TTL.
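
The difference between the two checks can be sketched as follows. The class and method names are hypothetical and the constant values are assumed for illustration; only the two comparisons themselves come from the description above.

```java
// Hypothetical sketch contrasting the two checks; not the Cassandra source.
final class IsExpiringSketch {
    // Constant values are assumed for illustration.
    static final int NO_TTL = 0;
    static final long NO_EXPIRATION_TIME = Long.MAX_VALUE;

    /** The check as described before the fix: true for any non-default expiration time. */
    static boolean isExpiringBuggy(int ttl, long localExpirationTime) {
        return localExpirationTime != NO_EXPIRATION_TIME;
    }

    /** The check matching the interface contract: true only when a TTL is set. */
    static boolean isExpiringFixed(int ttl, long localExpirationTime) {
        return ttl != NO_TTL;
    }
}
```

For a live cell without TTL and for an expiring cell the two methods agree. They differ only for a tombstone, which has no TTL but a non-default localExpirationTime: the first returns true and the second returns false.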