ZhaoYang created CASSANDRA-15640:
------------------------------------

             Summary: digest may not match when single partition named queries 
skip older sstables
                 Key: CASSANDRA-15640
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15640
             Project: Cassandra
          Issue Type: Bug
          Components: Legacy/Local Write-Read Paths
            Reporter: ZhaoYang


Name queries (i.e. single-partition queries that specify the full clustering 
key) read sstables sequentially in recency order, in the hope that the most 
recent sstables contain the most recent data, so that reading older sstables 
can be avoided in {{SinglePartitionReadCommand#reduceFilter}}.

Unfortunately, this optimization may cause a digest mismatch if the older 
sstables contain a range tombstone or a row deletion with a lower timestamp. [Test 
Code|https://github.com/jasonstack/cassandra/commit/3dfa29bb34bc237ab2b68f849906c09569c5cc94]
{code}
Table with (pk, ck1, ck2)

Node1:
* delete row (pk=1, ck1=1) with ts=10
* insert row (pk=1, ck1=1, ck2=1) with ts=11

Node2:
* delete row (pk=1, ck1=1) with ts=10
* flush into sstable1
* insert row (pk=1, ck1=1, ck2=1) with ts=11
* flush into sstable2

Query with pk=1 and ck1=1 and ck2=1
* node1 returns: RT open marker, row, RT close marker
* node2 returns: row  (because sstable1 is skipped)

Note: similar mismatch can happen with row deletion as well.
{code}
In the above example, is it safe for node1 to ignore the RT or row deletion 
for named queries when the row liveness has a higher timestamp?
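
The scenario above can be sketched as a small, self-contained toy model. This 
is only an illustration under stated assumptions: {{Item}}, {{read}}, 
{{digest}} and {{digestsMatch}} are hypothetical names, not Cassandra classes 
or methods. Markers are collapsed into a single "RT" item, and MD5 over a 
string form stands in for the real digest computation.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical toy model of the scenario; none of these types are Cassandra classes.
class DigestMismatchSketch {
    // One stored item for the queried clustering key: a live row ("ROW")
    // or a range tombstone ("RT") covering it.
    static final class Item {
        final String kind;
        final long timestamp;
        Item(String kind, long timestamp) { this.kind = kind; this.timestamp = timestamp; }
        @Override public String toString() { return kind + "@" + timestamp; }
    }

    static long maxTimestamp(List<Item> source) {
        long max = Long.MIN_VALUE;
        for (Item i : source) max = Math.max(max, i.timestamp);
        return max;
    }

    // Reads sources (memtable/sstables) newest-first. With skipOlder=true, stops
    // as soon as a row has been found that is newer than the next source.
    static List<Item> read(List<List<Item>> newestFirst, boolean skipOlder) {
        List<Item> result = new ArrayList<>();
        long newestRow = Long.MIN_VALUE;
        for (List<Item> source : newestFirst) {
            if (skipOlder && newestRow > maxTimestamp(source)) break;
            for (Item i : source) {
                result.add(i);
                if (i.kind.equals("ROW")) newestRow = Math.max(newestRow, i.timestamp);
            }
        }
        // Deterministic order so that equal contents always hash equally.
        result.sort(Comparator.comparing((Item i) -> i.kind).thenComparingLong(i -> i.timestamp));
        return result;
    }

    // Assumption: MD5 over the string form, purely for illustration.
    static byte[] digest(List<Item> response) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            for (Item i : response) md.update(i.toString().getBytes(StandardCharsets.UTF_8));
            return md.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    static boolean digestsMatch(boolean skipOlder) {
        // Node1: deletion (ts=10) and insert (ts=11) live in the same source.
        List<Item> node1Data = Arrays.asList(new Item("RT", 10), new Item("ROW", 11));
        // Node2: the same writes, flushed into two separate sstables.
        List<Item> sstable1 = Arrays.asList(new Item("RT", 10));
        List<Item> sstable2 = Arrays.asList(new Item("ROW", 11));
        List<Item> node1 = read(Arrays.asList(node1Data), skipOlder);
        List<Item> node2 = read(Arrays.asList(sstable2, sstable1), skipOlder);
        return Arrays.equals(digest(node1), digest(node2));
    }

    public static void main(String[] args) {
        System.out.println("skipping older sstables: match = " + digestsMatch(true));
        System.out.println("reading all sstables:    match = " + digestsMatch(false));
    }
}
```

In this model the two nodes hold the same data, yet the digests differ only 
when the older sstable is skipped, because node2 then omits the tombstone that 
node1 still returns.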



