[ 
https://issues.apache.org/jira/browse/CASSANDRA-8272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16010354#comment-16010354
 ] 

Andrés de la Peña commented on CASSANDRA-8272:
----------------------------------------------

[~sbtourist], the proposed steps look good to me. 

[~slebresne], I now see that ignoring the limit before post-processing has 
implications that I didn't take into account and that make it a bad idea, so 
I retract it. The reasoning behind trying to discard the "tombstone" rows using 
the post processor instead of the expression was that there could be 
implementations where using the expression has a negative impact on performance, 
especially if they are already using the post processor for sorting or other 
work. On reflection, though, these implementations can rely on other 
methods to mitigate the performance cost of re-evaluating the whole expression if 
the only requirement is to discard tombstones.

Regarding making {{RowFilter.CustomExpression#isSatisfiedBy()}} abstract, we 
could provide a new {{Index#customExpressionFor(CFMetaData, ByteBuffer)}} 
method to let the index provide the custom expression implementation. This new 
method could have a default implementation returning a new 
{{RowFilter.CustomExpression}} with the same behaviour that we currently have, 
that is, an {{#isSatisfiedBy()}} implementation that always returns {{true}}. 
This way, for 3.x we'll keep compatibility while allowing custom index 
implementors that don't require incremental upgrades to implement the 
coordinator side of this bugfix. Then, for trunk, we could either remove the 
default {{Index#customExpressionFor(CFMetaData cfm, ByteBuffer value)}} 
implementation or keep it as a convenience for index implementations that are 
not interested in, or not capable of, implementing the coordinator side of this. 
[Here|https://github.com/adelapena/cassandra/commit/34d3c7d0759c253d2b780b80e140930dc05cd591]
 is a draft patch showing the approach.
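
To make the idea concrete, here is a minimal, self-contained sketch of the default-method approach. It is not the draft patch: {{Row}}, {{LegacyIndex}} and {{StrictIndex}} are hypothetical stand-ins, a plain {{String}} table name replaces {{CFMetaData}}, and a row is reduced to a single indexed value where {{null}} models a deleted ("tombstone") row.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical, simplified stand-in for a row: only the indexed value is kept.
// A null value models a deleted ("tombstone") row.
final class Row {
    final String value;
    Row(String value) { this.value = value; }
}

// Simplified stand-in for RowFilter.CustomExpression with an abstract isSatisfiedBy().
abstract class CustomExpression {
    protected final ByteBuffer value;
    CustomExpression(ByteBuffer value) { this.value = value; }
    abstract boolean isSatisfiedBy(Row row);
}

// Simplified stand-in for Index. The default method keeps the current behaviour:
// the returned expression accepts every row.
interface Index {
    default CustomExpression customExpressionFor(String table, ByteBuffer value) {
        return new CustomExpression(value) {
            @Override
            boolean isSatisfiedBy(Row row) { return true; }
        };
    }
}

// An index that does not implement the coordinator side: it inherits the default.
final class LegacyIndex implements Index {}

// An index that opts in: its expression rejects tombstones and non-matching rows.
final class StrictIndex implements Index {
    @Override
    public CustomExpression customExpressionFor(String table, ByteBuffer value) {
        // duplicate() so decoding does not move the caller's buffer position
        String expected = StandardCharsets.UTF_8.decode(value.duplicate()).toString();
        return new CustomExpression(value) {
            @Override
            boolean isSatisfiedBy(Row row) {
                return row.value != null && row.value.equals(expected);
            }
        };
    }
}

final class CustomExpressionSketch {
    public static void main(String[] args) {
        ByteBuffer foo = ByteBuffer.wrap("foo".getBytes(StandardCharsets.UTF_8));
        Row stale = new Row("bar"); // row whose value no longer matches v = 'foo'
        System.out.println(new LegacyIndex().customExpressionFor("test", foo).isSatisfiedBy(stale));
        System.out.println(new StrictIndex().customExpressionFor("test", foo).isSatisfiedBy(stale));
    }
}
```

In this sketch the legacy index still accepts the stale row, which is today's behaviour, while the index that overrides the method filters it out. Existing implementations would therefore compile unchanged and could adopt the stricter check later.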

What do you think?

> 2ndary indexes can return stale data
> ------------------------------------
>
>                 Key: CASSANDRA-8272
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8272
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Sylvain Lebresne
>            Assignee: Andrés de la Peña
>             Fix For: 3.0.x
>
>
> When replicas return 2ndary index results, it's possible for a single replica 
> to return a stale result and that result will be sent back to the user, 
> potentially failing the CL contract.
> For instance, consider 3 replicas A, B and C, and the following situation:
> {noformat}
> CREATE TABLE test (k int PRIMARY KEY, v text);
> CREATE INDEX ON test(v);
> INSERT INTO test(k, v) VALUES (0, 'foo');
> {noformat}
> with every replica up to date. Now, suppose that the following queries are 
> done at {{QUORUM}}:
> {noformat}
> UPDATE test SET v = 'bar' WHERE k = 0;
> SELECT * FROM test WHERE v = 'foo';
> {noformat}
> then, if A and B acknowledge the insert but C responds to the read before 
> having applied the insert, then the now stale result will be returned (since 
> C will return it and A or B will return nothing).
> A potential solution would be that when we read a tombstone in the index (and 
> provided we make the index inherit the gcGrace of its parent CF), instead of 
> skipping that tombstone, we'd insert in the result a corresponding range 
> tombstone.  



