[ 
https://issues.apache.org/jira/browse/CASSANDRA-2610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13095403#comment-13095403
 ] 

Sylvain Lebresne commented on CASSANDRA-2610:
---------------------------------------------

bq. use concurrent structures instead of synchronized

I think this doesn't work, or at least needs modifications elsewhere. The 
synchronized was making sure that, for a given job, there would be one and only 
one call to addTree that returns 0 (otherwise we'd generate the differencers 
twice in that case). The synchronized was there to make the 'remove then return 
length' atomic, more than to protect access to the structure. 
completedSynchronization has the same problem, even though in that case the 
consequence is benign: we'll just print the sync message twice (which is still 
worse than with synchronized).
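
To illustrate the atomicity point, here is a minimal sketch. The class name 
PendingTreesSketch and the String endpoints are hypothetical stand-ins, not the 
actual code from the patch. With only a concurrent set, two threads could each 
remove their own endpoint and then both read a size of 0; holding one lock 
across the remove and the size read rules that out.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a repair job's bookkeeping; not Cassandra's actual class.
class PendingTreesSketch
{
    // Endpoints whose merkle tree has not been received yet.
    private final Set<String> remaining;

    PendingTreesSketch(Set<String> endpoints)
    {
        this.remaining = new HashSet<String>(endpoints);
    }

    // The removal and the size read happen under the same lock, so exactly
    // one caller observes 0 once every endpoint has reported its tree.
    synchronized int addTree(String endpoint)
    {
        remaining.remove(endpoint);
        return remaining.size();
    }
}
```

In this sketch, the caller that gets 0 back is the only one that should go on to 
create the differencers.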

Also, without having run it, I'm not sure the change to the test works: since 
REMOTE won't ever send a merkle tree, the differencers won't ever be generated. 
But to be honest, this test is really not very useful, and I would be fine with 
just removing it instead of faking things to the point where it doesn't really 
test anything.



> Have the repair of a range repair *all* the replica for that range
> ------------------------------------------------------------------
>
>                 Key: CASSANDRA-2610
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2610
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.8 beta 1
>            Reporter: Sylvain Lebresne
>            Assignee: Sylvain Lebresne
>            Priority: Minor
>             Fix For: 1.0
>
>         Attachments: 0001-Make-repair-repair-all-hosts-v2.patch, 
> 0001-Make-repair-repair-all-hosts.patch, 0002-Cleanup-log-messages-v2.patch, 
> 0003-cleanup-and-fix-private-reference.patch
>
>   Original Estimate: 8h
>  Remaining Estimate: 8h
>
> Say you have a range R whose replicas are A, B and C. If you run repair on 
> node A for that range R, when the repair ends you only know that A is fully 
> repaired. B and C are not. That is, B and C are up to date with A before the 
> repair, but are not up to date with one another.
> It makes it a pain to schedule "optimal" cluster repairs, that is, repairing 
> a full cluster without doing work twice (because you would still have to run 
> a repair on B or C, which will make A, B and C redo a validation compaction 
> on R, and with more replicas it's even more annoying).
> However, it is fairly easy during the first repair on A to have it compare 
> all the merkle trees, i.e. the ones for B and C, and ask B or C to stream 
> between them whatever differences they have. 
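
One way to picture "compare all the merkle trees" is to enumerate every 
unordered pair of replicas, each pair standing for one tree comparison. The 
sketch below is only an illustration under that assumption: AllPairsSketch, 
pairs and the String endpoints are hypothetical names, not the patch's code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; endpoints are plain strings for illustration only.
class AllPairsSketch
{
    // Returns every unordered pair of endpoints. For replicas A, B and C
    // this yields (A,B), (A,C) and (B,C), so B and C are also compared
    // with each other rather than only with A.
    static List<String[]> pairs(List<String> endpoints)
    {
        List<String[]> result = new ArrayList<String[]>();
        for (int i = 0; i < endpoints.size(); i++)
            for (int j = i + 1; j < endpoints.size(); j++)
                result.add(new String[]{ endpoints.get(i), endpoints.get(j) });
        return result;
    }
}
```

With n replicas this gives n(n-1)/2 comparisons from a single round of 
validation compactions.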

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
