[ 
https://issues.apache.org/jira/browse/CASSANDRA-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191245#comment-13191245
 ] 

Vijay commented on CASSANDRA-3721:
----------------------------------

>>>  But it should be doable with 2 lines in RepairJob.addTree(), and maybe a 
>>> few more lines to send the snapshot commands
The problem is that we would have to implement the same thing that 
DistributedJob (in the attached patch) already does, because we have to wait 
for the job to complete on the remote server. So we would either wait on a 
SimpleCondition, creating one condition for every request sent, or have the 
callback start the next job (a special case for snapshot repair).
+ We have to do for the Differencing what we did for sendTree, because it has 
performStreamingRepair(). 
+ We also have to clear the snapshot if it fails.
+ I thought of implementing CASSANDRA-3486 after this, which will benefit from 
this refactor too.
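
To illustrate the "one condition per batch of requests" option above, here is 
a rough sketch. It is not code from the patch: SnapshotSender and snapshotAll 
are hypothetical names, and java.util.concurrent.CountDownLatch stands in for 
the condition we would wait on.

```java
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

class SnapshotWaitSketch {
    /** Hypothetical stand-in for sending a snapshot command to one endpoint. */
    interface SnapshotSender {
        void send(String endpoint, Runnable onSuccess, Runnable onFailure);
    }

    /**
     * Sends a snapshot request to every endpoint and blocks until each one has
     * answered or the timeout expires. Returns true only if all of them
     * answered and none reported a failure.
     */
    static boolean snapshotAll(List<String> endpoints, SnapshotSender sender,
                               long timeout, TimeUnit unit) {
        CountDownLatch pending = new CountDownLatch(endpoints.size());
        AtomicBoolean failed = new AtomicBoolean(false);
        for (String endpoint : endpoints) {
            // Each callback counts the latch down exactly once, success or not.
            sender.send(endpoint,
                        pending::countDown,
                        () -> { failed.set(true); pending.countDown(); });
        }
        try {
            boolean allAnswered = pending.await(timeout, unit);
            return allAnswered && !failed.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }
}
```

A false return is the point where the caller would clear the snapshots that 
were already taken.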

Do you think it is worth doing a simple patch along the lines of what you have 
mentioned for 1.1, and keeping the refactor for 1.2?

>>> I spotted 2 changes that seems gratuitous
Those were unintentional; I should have checked before submitting. I will fix 
that.
                
> Staggering repair
> -----------------
>
>                 Key: CASSANDRA-3721
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3721
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 1.1
>            Reporter: Vijay
>            Assignee: Vijay
>            Priority: Minor
>             Fix For: 1.1
>
>         Attachments: 0001-staggering-repair-with-snapshot.patch
>
>
> Currently repair runs on all the nodes at once, causing the range of data 
> to be hot (higher latency on reads).
> Sequence:
> 1) Send a repair request to all of the nodes so we can hold the references of 
> the SSTables (the point at which repair was initiated).
> 2) Send a validation request to one node at a time (once completed, the node 
> releases its references).
> 3) Hold the reference of the tree in the requesting node and, once everything 
> is complete, start the diff.
> We can also serialize the streaming part so that not more than 1 node is 
> involved in the streaming.
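
The three-step sequence in the description can be sketched roughly as below. 
This is only an illustration under stated assumptions: the Replica interface, 
the fixed() helper and the use of a plain string digest in place of a merkle 
tree are all hypothetical and are not Cassandra APIs.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class StaggeredRepairSketch {
    /** Hypothetical view of a replica; none of these methods are Cassandra APIs. */
    interface Replica {
        String name();
        void snapshot();        // step 1: hold references to the current SSTables
        String validate();      // step 2: build a tree (here just a digest)
        void releaseSnapshot(); // drop the references again
    }

    /** Test helper: a replica whose validation always yields the given digest. */
    static Replica fixed(final String name, final String digest) {
        return new Replica() {
            public String name() { return name; }
            public void snapshot() { }
            public String validate() { return digest; }
            public void releaseSnapshot() { }
        };
    }

    /** Runs the three steps and returns the replica pairs whose trees differ. */
    static List<String> repair(List<Replica> replicas) {
        Map<String, String> trees = new LinkedHashMap<>();
        List<Replica> held = new ArrayList<>();
        try {
            // Step 1: all replicas snapshot first, so the trees share one point in time.
            for (Replica r : replicas) { r.snapshot(); held.add(r); }
            // Step 2: validate one replica at a time, releasing as each one finishes.
            for (Replica r : replicas) {
                trees.put(r.name(), r.validate());
                r.releaseSnapshot();
                held.remove(r);
            }
        } finally {
            // On failure, clear whatever snapshots are still held.
            for (Replica r : held) r.releaseSnapshot();
        }
        // Step 3: the requesting node holds all trees and diffs them pairwise.
        List<String> names = new ArrayList<>(trees.keySet());
        List<String> mismatches = new ArrayList<>();
        for (int i = 0; i < names.size(); i++)
            for (int j = i + 1; j < names.size(); j++)
                if (!trees.get(names.get(i)).equals(trees.get(names.get(j))))
                    mismatches.add(names.get(i) + "<->" + names.get(j));
        return mismatches;
    }
}
```

Each mismatching pair is where streaming would be scheduled, which could 
itself be run one pair at a time.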


        
