Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/10540 )
Change subject: [tools] extra integration tests for the rebalancer
......................................................................


Patch Set 11:

Thank you for the review.

> (4 comments)
> 
> What did you learn after looping the new tests?

I learned that the tool needs some additional handling of transient errors related to RaftConsensus, e.g. failures related to CAS, connection refused (when a tablet server shuts down), and backpressure-related errors. Otherwise, it's necessary to restart the tool once such an error happens, but it would be nice if the tool handled those errors automatically and exited only when the cluster is balanced or some non-transient error happens.

The absence of handling for those transient errors leads to some flakiness in particular scenarios when running with --stress-cpu-threads=16, but the rate of failure is low (1/50 or so). Not a big deal, but I'm thinking of adding corresponding code for that.


-- 
To view, visit http://gerrit.cloudera.org:8080/10540
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I78b3dcea71ed303f6ecd199604b2385796d05da8
Gerrit-Change-Number: 10540
Gerrit-PatchSet: 11
Gerrit-Owner: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Will Berkeley <[email protected]>
Gerrit-Comment-Date: Tue, 03 Jul 2018 22:02:24 +0000
Gerrit-HasComments: No
