[ 
https://issues.apache.org/jira/browse/CASSANDRA-8272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16029457#comment-16029457
 ] 

Andrés de la Peña commented on CASSANDRA-8272:
----------------------------------------------

bq. For the row-filter aware counter however, I don't think we can afford to 
have it not be a stopping transformation: we very much rely on that stopping to 
not OOM nodes (and, more generally, to not read the whole database on a read), 
whether it be for user limits or paging. I'm not sure I understand why stopping 
is a concern in this case however?

If I understand it right, the idea is to have a {{DataLimits}} (associated with 
a {{RowFilter}}) that neither filters nor counts rows that don't satisfy the 
filter. Any deleted index entry from one replica could be required to discard 
the possibly stale results of another replica, so those entries shouldn't be 
filtered by {{DataLimits}}. So, if the query limit requires {{n}} rows, we 
should return no more than {{n}} rows satisfying the filter, plus *all* the 
rows that don't satisfy the filter but are pointed to by a deleted index entry. 
Is this correct? If so, we can't stop reading once we have {{n}} rows satisfying 
the filter: we have to keep reading all the remaining rows pointed to by 
deleted index entries, regardless of the limit and with the consequent 
impact on performance.
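
To make the concern concrete, here is a minimal sketch of the two counting strategies as I read them. It is a toy model in Python, not Cassandra code: {{Row}}, {{filter_aware_limit}} and {{stopping_limit}} are hypothetical names, and the model assumes that every row in the stream is pointed to by an index entry, so that a row not matching the filter is one reached through a deleted index entry.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Row:
    """Hypothetical stand-in for a row reached through an index entry."""
    key: int
    matches_filter: bool  # False: only reachable via a deleted index entry


def filter_aware_limit(rows: Iterable[Row], limit: int) -> Tuple[List[Row], int]:
    """Keep at most `limit` matching rows plus ALL non-matching rows.

    Returns (rows kept, number of rows read). Non-matching rows are never
    counted against the limit and never dropped, so the scan cannot stop
    early: it always reads the whole input.
    """
    kept: List[Row] = []
    counted = 0
    read = 0
    for row in rows:
        read += 1
        if row.matches_filter:
            if counted < limit:
                kept.append(row)
                counted += 1
            # Past the limit: drop the row, but keep scanning.
        else:
            # Needed by the coordinator to discard stale results elsewhere.
            kept.append(row)
    return kept, read


def stopping_limit(rows: Iterable[Row], limit: int) -> Tuple[List[Row], int]:
    """Classic stopping behaviour: stop reading once `limit` rows matched."""
    kept: List[Row] = []
    read = 0
    for row in rows:
        if len(kept) >= limit:
            break
        read += 1
        if row.matches_filter:
            kept.append(row)
    return kept, read


# Five indexed rows; rows 1 and 4 are only pointed to by deleted entries.
demo = [Row(0, True), Row(1, False), Row(2, True), Row(3, True), Row(4, False)]
```

With a limit of 2, {{stopping_limit(demo, 2)}} reads 3 rows and keeps keys 0 and 2, while {{filter_aware_limit(demo, 2)}} reads all 5 rows and keeps keys 0, 1, 2 and 4. The gap between the two read counts is the performance impact in question, and it grows with the number of index entries rather than with the limit.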


> 2ndary indexes can return stale data
> ------------------------------------
>
>                 Key: CASSANDRA-8272
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8272
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Sylvain Lebresne
>            Assignee: Andrés de la Peña
>             Fix For: 3.0.x
>
>
> When replicas return 2ndary index results, it's possible for a single replica 
> to return a stale result and that result will be sent back to the user, 
> potentially failing the CL contract.
> For instance, consider 3 replicas A, B and C, and the following situation:
> {noformat}
> CREATE TABLE test (k int PRIMARY KEY, v text);
> CREATE INDEX ON test(v);
> INSERT INTO test(k, v) VALUES (0, 'foo');
> {noformat}
> with every replica up to date. Now, suppose that the following queries are 
> done at {{QUORUM}}:
> {noformat}
> UPDATE test SET v = 'bar' WHERE k = 0;
> SELECT * FROM test WHERE v = 'foo';
> {noformat}
> then, if A and B acknowledge the update but C responds to the read before 
> having applied the update, then the now stale result will be returned (since 
> C will return it and A or B will return nothing).
> A potential solution would be that when we read a tombstone in the index (and 
> provided we make the index inherit the gcGrace of its parent CF), instead of 
> skipping that tombstone, we'd insert in the result a corresponding range 
> tombstone.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
