[ 
https://issues.apache.org/jira/browse/CASSANDRA-8933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne resolved CASSANDRA-8933.
-----------------------------------------
    Resolution: Not A Problem

As said in my previous comment, it appears this is actually not a problem.

> Short reads can return deleted results
> --------------------------------------
>
>                 Key: CASSANDRA-8933
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8933
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Sylvain Lebresne
>            Assignee: Sylvain Lebresne
>
> The current code for short reads protection does not handle all cases.  
> Currently, we retry only if a node had returned the requested number of 
> results, but we have less results than that post-reconciliation, because this 
> means the node in question may have more results it hadn't sent due to the 
> limit.
> Consider however 3 nodes A, B, C (RF=3), and following sequence of operations 
> (all done at QUORUM):
> # we write 1 and 2 in a partition: all nodes get it.
> # we delete 1: only A and C get it.
> # we delete 2: only B and C get it.
> # we read the first row in the partition (so with a LIMIT 1) and A and B 
> answer first.
> At the last step, A will return the tombstone for 1 and the value 2, while B 
> will return just 1. So post reconciliation, we'll return 2 (since A returned 
> it and we have no tombstone for it), while we should return nothing. This is 
> a short read situation: B stopped at 1 because it was asked only 1 result, 
> but that result didn't made it in the result and we need further results from 
> it.  However, Because 1 results is requested and we have 1 result 
> post-reconciliation, the short read retry won't kick in.
> In practice, the short read check should be generalized: if any node X 
> returns the requested number of results but any of those results gets skipped 
> post-reconciliation, we might have a short read. Basically, enforcing the 
> limit replica-side is optimistic and assumes that all results of that replica 
> will be used, and as soon as that assumption fails we should get back more 
> results.
> Implementing that generalized condition can probably be done in 
> RowDataResolver.scheduleRepairs by using the repair to know if a node has had 
> some of results skipped by reconciliation but we want to know if a full CQL 
> row has been skipped or not so this will probably force us to add some 
> recounting.
> I'll note that I've fixed this problem on my branch for CASSANDRA-8099 (where 
> this is both simpler and somewhat more efficient since short reads don't 
> retry full queries there), so if decide this is too risky to fix in 2.1, we 
> can possibly just mark this as duplicate of CASSANDRA-8099.
> Lastly, it shouldn't be too hard to extends our current short read dtests to 
> test for that case, but I haven't taken the time to do so yet 
> ([~philipthompson] do you think you can have a look at adding such test at 
> some point?).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to