[
https://issues.apache.org/jira/browse/CASSANDRA-8272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16030870#comment-16030870
]
Sylvain Lebresne commented on CASSANDRA-8272:
---------------------------------------------
bq. So, if the query limit requires n rows, we should return not more than
{{n}} rows satisfying the filter, and *all* the rows not satisfying the index
but pointed to by a deleted index entry.
No, I don't think we have to return *all* the rows not satisfying the index. I
believe returning only those that come *before* the {{n}}th "valid" entry is
enough. This is no different from how we handle tombstones here: we
don't return all tombstones, just the ones before the {{n}}th live result.
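
To make the replica-side rule concrete, here is a minimal sketch in Python. It is not Cassandra code: {{replica_response}} and the {{(key, is_valid)}} pair representation are hypothetical names chosen for illustration, and index hits are assumed to arrive already in index order.

```python
def replica_response(index_hits, n):
    """Return what a replica would send for a query with limit n.

    index_hits: iterable of (key, is_valid) pairs in index order, where
    is_valid is False for a row pointed to by a deleted index entry.
    """
    out = []
    valid = 0
    for key, is_valid in index_hits:
        if valid == n:
            # The n-th valid entry was already emitted: anything after it,
            # valid or not, is not needed by the coordinator.
            break
        out.append((key, is_valid))
        if is_valid:
            valid += 1
    return out


hits = [(1, False), (2, True), (3, False), (4, True), (5, False), (6, True)]
response = replica_response(hits, 2)
```

With a limit of 2, the invalid entries for keys 1 and 3 are included because they precede the second valid entry (key 4), while keys 5 and 6 are left out.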
Note that with both these new "invalid" entries and tombstones, it's
possible that post-resolution on the coordinator we end up short on
results. That is, a "valid" result from A is canceled by a tombstone/"invalid"
result from B and vice versa, and we end up with fewer results than requested. But
that's where the short-read protection from {{DataResolver}} kicks in.
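
A rough sketch of how such a short read can arise, again in Python and under strong simplifying assumptions: {{resolve}} is a hypothetical helper, not the {{DataResolver}} API, and it assumes that any "invalid" marker cancels the key, whereas real resolution compares timestamps.

```python
def resolve(responses, n):
    """Merge per-replica responses and report whether the result is short.

    responses: list of lists of (key, is_valid) pairs, one list per replica.
    Returns (live_keys, is_short).
    """
    seen = {}
    for response in responses:
        for key, is_valid in response:
            # Simplification: one "invalid" marker (or tombstone) from any
            # replica is enough to cancel the key.
            seen[key] = seen.get(key, True) and is_valid
    live = sorted(key for key, ok in seen.items() if ok)
    return live[:n], len(live) < n


# Each replica returned 2 valid entries for a limit of 2.
from_a = [(1, True), (2, False), (3, True)]
from_b = [(1, False), (2, True), (3, True)]
result = resolve([from_a, from_b], 2)
```

Here key 1 from A is canceled by B and key 2 from B is canceled by A, so only key 3 survives and the coordinator is one row short of the limit. That is the case where it has to ask the replicas for more rows.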
> 2ndary indexes can return stale data
> ------------------------------------
>
> Key: CASSANDRA-8272
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8272
> Project: Cassandra
> Issue Type: Bug
> Reporter: Sylvain Lebresne
> Assignee: Andrés de la Peña
> Fix For: 3.0.x
>
>
> When replicas return 2ndary index results, it's possible for a single replica
> to return a stale result and that result will be sent back to the user,
> potentially failing the CL contract.
> For instance, consider 3 replicas A, B and C, and the following situation:
> {noformat}
> CREATE TABLE test (k int PRIMARY KEY, v text);
> CREATE INDEX ON test(v);
> INSERT INTO test(k, v) VALUES (0, 'foo');
> {noformat}
> with every replica up to date. Now, suppose that the following queries are
> done at {{QUORUM}}:
> {noformat}
> UPDATE test SET v = 'bar' WHERE k = 0;
> SELECT * FROM test WHERE v = 'foo';
> {noformat}
> then, if A and B acknowledge the update but C responds to the read before
> having applied the update, the now stale result will be returned (since
> C will return it and A or B will return nothing).
> A potential solution would be that when we read a tombstone in the index (and
> provided we make the index inherit the gcGrace of its parent CF), instead of
> skipping that tombstone, we'd insert in the result a corresponding range
> tombstone.
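
The scenario in the description above can be simulated with a small Python sketch. Everything here is an assumption made for illustration: {{merge}} is a hypothetical last-write-wins merge, responses are {{(key, value, timestamp)}} tuples, and a value of {{None}} stands in for the deletion marker a replica would return instead of silently skipping a deleted index entry.

```python
def merge(responses):
    """Keep the newest cell per key; drop keys whose newest cell is a deletion."""
    newest = {}
    for response in responses:
        for key, value, ts in response:
            if key not in newest or ts > newest[key][1]:
                newest[key] = (value, ts)
    return {key: value for key, (value, ts) in newest.items() if value is not None}


# C is stale: it still indexes k=0 under 'foo' (written at ts=1).
from_c = [(0, "foo", 1)]
# B applied the update at ts=2, so its lookup for v='foo' finds nothing live.
from_b_current = []
# Proposed behaviour: B surfaces the deleted index entry as a marker.
from_b_proposed = [(0, None, 2)]

stale = merge([from_b_current, from_c])
fixed = merge([from_b_proposed, from_c])
```

With the current behaviour the coordinator has nothing from B to override C's row, so {{stale}} contains the outdated {{'foo'}} row. With the marker, the newer deletion wins and {{fixed}} is empty.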