Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10540 )

Change subject: [tools] extra integration tests for the rebalancer
......................................................................


Patch Set 11:

Thank you for the review.

 > (4 comments)
 >
 > What did you learn after looping the new tests?

I learned that the tool needs some additional handling of transient errors 
related to RaftConsensus, e.g. failures related to CAS, connection 
refused (when a tablet server shuts down), and backpressure-related errors.  
Otherwise, it's necessary to restart the tool once such an error happens, but 
it would be nice if the tool handled that automatically and exited only when 
the cluster is balanced or some non-transient error happens.
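
A rough sketch of the retry behavior described above, written in Python for 
brevity; it is not the tool's actual code. The names TransientError, 
run_until_balanced and step are hypothetical and stand in for whatever the 
tool would use to classify errors and to perform one round of replica moves.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for errors expected to clear up on their own,
    e.g. a failed CAS, connection refused, or backpressure."""


def run_until_balanced(step, max_retries=5, backoff_sec=1.0):
    """Call step() repeatedly until it reports that the cluster is balanced.

    step() is a hypothetical callable: it returns True once the cluster is
    balanced and False if more moves remain. It raises TransientError for
    retriable failures; any other exception is treated as non-transient
    and propagates to the caller right away.
    """
    retries = 0
    while True:
        try:
            if step():
                return True
            # A round completed without error, so reset the retry budget.
            retries = 0
        except TransientError:
            retries += 1
            if retries > max_retries:
                # Too many consecutive transient failures: give up.
                raise
            # Linear backoff before trying the same round again.
            time.sleep(backoff_sec * retries)
```

The retry budget here counts only consecutive transient failures, so a long 
rebalancing run is not aborted because of a few unrelated hiccups spread 
over time. That choice is an assumption of the sketch.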

The absence of handling of those transient errors leads to some flakiness in 
particular scenarios when running with --stress-cpu-threads=16, but the 
failure rate is low (1/50 or so).

Not a big deal, but I'm thinking of adding corresponding code for that.


--
To view, visit http://gerrit.cloudera.org:8080/10540
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I78b3dcea71ed303f6ecd199604b2385796d05da8
Gerrit-Change-Number: 10540
Gerrit-PatchSet: 11
Gerrit-Owner: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Will Berkeley <[email protected]>
Gerrit-Comment-Date: Tue, 03 Jul 2018 22:02:24 +0000
Gerrit-HasComments: No
