[ 
https://issues.apache.org/jira/browse/CASSANDRA-17069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17442937#comment-17442937
 ] 

David Capwell commented on CASSANDRA-17069:
-------------------------------------------

Looking closer at the code, I think it's a race condition, which also fits 
with there being no throttling in the test that times out.  If streaming 
starts and completes within the gap between one nodetool netstats result and 
the next query, the test never sees the stream and keeps looping, waiting.

Going to test this by grepping for "[Stream (.*)?] All sessions completed" when 
no stream has been seen yet in nodetool.  If it matches, we know we hit the 
edge case and can mark the test as a success (the stream passed, which is all 
the test is checking).
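
A rough sketch of that check, to make the idea concrete.  This is not the 
local patch; the function name and the idea of passing in log lines as a list 
are assumptions made for illustration, and only the pattern itself comes from 
the grep above (with the literal brackets escaped so it works as a regex).

```python
import re

# Pattern from the grep above; "[" and "]" are escaped because they are
# literal characters in the log line, not a character class.
STREAM_DONE = re.compile(r"\[Stream (.*)?\] All sessions completed")


def missed_stream_completed(seen_in_netstats, log_lines):
    """Hypothetical helper: detect the race described above.

    Returns True only when netstats never showed a stream but the log
    says all sessions completed, i.e. streaming started and finished
    between two netstats polls.
    """
    if seen_in_netstats:
        # The normal path already observed the stream; nothing was missed.
        return False
    return any(STREAM_DONE.search(line) for line in log_lines)
```

The polling loop would call this each iteration and stop with success when it 
returns True, instead of waiting for a stream that already came and went.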

I am marking this as "not this JIRA's issue" and will file a new issue for it.  
I have a patch locally that detects this race condition so the test can make 
progress, but it's best to get the original author to take a look and make 
sure it smells OK.

> Refactor normal/preview/IR repair to standardize repair cleanup and error 
> handling of failed RepairJobs
> -------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-17069
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17069
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Consistency/Repair
>            Reporter: David Capwell
>            Assignee: David Capwell
>            Priority: Normal
>             Fix For: 4.x
>
>
> Right now we have 3 different implementations of repair: normal, preview, and 
> incremental (IR); all 3 handle RepairJob failures differently and offer 
> different state cleanup.  To make sure that we handle errors and clean up 
> consistently, we should move these responsibilities out of the repair task 
> itself and into common APIs, and move some logic into the repair 
> coordination itself.
> This work relates to CASSANDRA-15399, as special handling in each task makes 
> the work more complex.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
