[ 
https://issues.apache.org/jira/browse/CASSANDRA-2552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13025692#comment-13025692
 ] 

Sylvain Lebresne commented on CASSANDRA-2552:
---------------------------------------------

I am no expert on the Java Memory Model, but I can't find anything that
precludes this behavior in the CHM (ConcurrentHashMap) docs either (there
really is not much on the size function). So I would have liked the CHM
solution if we could be sure it always fixes the problem (I would have liked
it because it was a one line change and I think maps are here to be "abused"),
but as far as I can tell, it may well only make the bug much less frequent or
fix it only on some architectures (the code of CHM seems to indicate it is safe
but it's complicated enough that I wouldn't bet my life on it).

Note that if that's true, LBQ (LinkedBlockingQueue) too could well allow for a
race here without breaking its specification (it seems to use an AtomicInteger
for the size internally so it is trivially ok, but if the spec doesn't force
anything, I suppose that could change).

So I suppose if we want to do right by the spec, we should probably update both 
AbstractRowResolver and RangeSliceResponseResolver (note that using an 
AtomicInteger to count the number of responses could be slightly simpler, but 
I'm fine with an AtomicReferenceArray). 
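
To make the AtomicInteger idea concrete, here is a minimal sketch. It is not
the attached patch and not the actual resolver code: the class name
ResponseCounter, the String payload, and the blockFor parameter are
hypothetical stand-ins. The only point is that the response is stored first and
counted second, and that the decision uses the value returned by
incrementAndGet() rather than a separate call to size().

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a resolver; names are illustrative only.
final class ResponseCounter {
    private final Queue<String> responses = new ConcurrentLinkedQueue<String>();
    private final AtomicInteger count = new AtomicInteger(0);
    private final int blockFor; // number of responses needed, e.g. 2 for this QUORUM

    ResponseCounter(int blockFor) {
        this.blockFor = blockFor;
    }

    // Returns true for exactly one caller: the one whose response
    // brings the count to blockFor.
    boolean append(String response) {
        // Store first, count second. The atomic increment orders this add
        // before the reads of any thread that later observes a higher count.
        responses.add(response);
        int n = count.incrementAndGet();
        return n == blockFor;
    }

    int received() {
        return count.get();
    }
}
```

With this shape, the thread for which append() returns true is guaranteed to
see at least blockFor stored responses, so the "is data present" check can be
run by that thread over the queue before signalling the condition.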

> ReadResponseResolver Race
> -------------------------
>
>                 Key: CASSANDRA-2552
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2552
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>             Fix For: 0.8.0
>
>         Attachments: 0001-Move-Resolvers-to-atomic-append-count.txt, 
> ResolveRaceTest.java
>
>
> When receiving a response, ReadResponseResolver uses a 3 step process to 
> decide whether to trigger the condition that enough responses have arrived:
> # Add new response
> # Check response set size
> # Check that data is present
> I think that these steps must have been reordered by the compiler in some 
> cases, because I was able to reproduce a case for a QUORUM read where the 
> condition is not properly triggered:
> {noformat}
> INFO [RequestResponseStage:15] 2011-04-25 00:26:53,514 
> ReadResponseResolver.java (line 87) post append for 1087367065: hasData=false 
> in 2 messages
> INFO [RequestResponseStage:8] 2011-04-25 00:26:53,514 
> ReadResponseResolver.java (line 87) post append for 1087367065: hasData=true 
> in 1 messages
> INFO [pool-1-thread-54] 2011-04-25 00:27:03,516 StorageProxy.java (line 623) 
> Read timeout: java.util.concurrent.TimeoutException: 
> ReadResponseResolver@1087367065(/10.34.131.109=false,/10.34.132.122=true,)
> {noformat}
> The last line shows that both results were present, and that one of them was 
> holding data.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
