Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/11107 )

Change subject: [tools] run rebalancer during 'election storm'
......................................................................


Patch Set 1:

(4 comments)

I'm guessing you already did, but if you haven't, can you run this a bunch of 
times with dist-test to check that it's not flaky, and add some stats about 
that to the commit msg?

http://gerrit.cloudera.org:8080/#/c/11107/1/src/kudu/tools/kudu-admin-test.cc
File src/kudu/tools/kudu-admin-test.cc:

http://gerrit.cloudera.org:8080/#/c/11107/1/src/kudu/tools/kudu-admin-test.cc@2086
PS1, Line 2086: #if defined(ADDRESS_SANITIZER) || defined(THREAD_SANITIZER)
              :   const auto timeout = MonoDelta::FromSeconds(5);
              : #else
              :   const auto timeout = MonoDelta::FromSeconds(10);
I think the timeouts are swapped here: the sanitizer builds should have the 
longer timeout.
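
To make the intent concrete, here is a minimal standalone sketch of the 
selection I'd expect. It uses std::chrono in place of MonoDelta so it compiles 
on its own; kSanitizerBuild and TimeoutFor() are hypothetical names for 
illustration, not something in the patch.

```cpp
#include <chrono>

// True when built under ASAN or TSAN (same macros as the patch).
#if defined(ADDRESS_SANITIZER) || defined(THREAD_SANITIZER)
constexpr bool kSanitizerBuild = true;
#else
constexpr bool kSanitizerBuild = false;
#endif

// Hypothetical helper: sanitizer builds run slower, so they get the longer
// of the two timeouts from the patch (10 seconds instead of 5).
constexpr std::chrono::seconds TimeoutFor(bool sanitizer_build) {
  return sanitizer_build ? std::chrono::seconds(10) : std::chrono::seconds(5);
}

// The value the test would use for the current build type.
constexpr std::chrono::seconds kTimeout = TimeoutFor(kSanitizerBuild);
```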


http://gerrit.cloudera.org:8080/#/c/11107/1/src/kudu/tools/kudu-admin-test.cc@2093
PS1, Line 2093: auto max_sleep_ms = 2.0;
Can you explain where this number comes from? It's obviously very low compared 
to the roughly 1.25 seconds it normally takes for an election to start after a 
leader stepdown in an idle cluster, but the Raft heartbeat times are changed 
for the tests. A couple of sentences explaining the initial setting and the 
backoff would be good.
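
For a sense of scale, here is a small sketch of the arithmetic I have in mind. 
It assumes a plain multiplicative backoff, which may not be what the test 
actually does; StepsToReach() is a hypothetical helper, and the factor of 2 in 
the usage note is my assumption, not a value from the patch.

```cpp
// Hypothetical helper: number of times a sleep bound starting at initial_ms
// must be multiplied by factor before it reaches target_ms.
// Returns -1 for inputs that would never reach the target.
int StepsToReach(double initial_ms, double factor, double target_ms) {
  if (initial_ms <= 0.0 || factor <= 1.0) {
    return -1;
  }
  int steps = 0;
  for (double ms = initial_ms; ms < target_ms; ms *= factor) {
    ++steps;
  }
  return steps;
}
```

With an initial bound of 2.0 ms and an assumed doubling on each step, 
StepsToReach(2.0, 2.0, 1250.0) gives 10, i.e. the bound only gets to the 
1.25 second range after ten increases.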


http://gerrit.cloudera.org:8080/#/c/11107/1/src/kudu/tools/kudu-admin-test.cc@2109
PS1, Line 2109: for (const auto& tablet : tablets) {
With RF=3, running over each TS and replica will cause 3 elections to be called 
on each tablet. Is that what you want? Or do you want one per tablet per cycle 
through all the tablets?
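
Here is a small standalone sketch of the difference, in case it helps. It only 
counts election requests; the map of tablet server to hosted tablet ids and 
both function names are hypothetical, not the data structures the test uses.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical layout: tablet server id -> ids of the tablets it hosts.
using TabletsByServer = std::map<std::string, std::vector<std::string>>;

// Visits every replica on every server: a tablet gets one election request
// per replica, so RF requests per cycle.
std::map<std::string, int> CountPerReplica(const TabletsByServer& servers) {
  std::map<std::string, int> counts;
  for (const auto& entry : servers) {
    for (const auto& tablet : entry.second) {
      ++counts[tablet];
    }
  }
  return counts;
}

// Remembers which tablets were already handled in this cycle, so each tablet
// gets exactly one election request per cycle.
std::map<std::string, int> CountOncePerTablet(const TabletsByServer& servers) {
  std::map<std::string, int> counts;
  std::set<std::string> seen;
  for (const auto& entry : servers) {
    for (const auto& tablet : entry.second) {
      if (seen.insert(tablet).second) {
        ++counts[tablet];
      }
    }
  }
  return counts;
}
```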


http://gerrit.cloudera.org:8080/#/c/11107/1/src/kudu/tools/kudu-admin-test.cc@2164
PS1, Line 2164: usually happens because GetConsensusState requests are dropped due to
              :     // backpressure
As you mentioned to me in person, we should fix ksck to be resilient to this up 
to some timeout.
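
Roughly the shape I'd imagine for that, as a generic standalone sketch. This 
is not ksck code: RetryUntil() and its parameters are hypothetical, and 
std::chrono stands in for MonoTime/MonoDelta.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper: calls attempt() until it returns true or until another
// pause would run past the deadline. Returns whether an attempt succeeded.
bool RetryUntil(const std::function<bool()>& attempt,
                std::chrono::milliseconds timeout,
                std::chrono::milliseconds pause) {
  const auto deadline = std::chrono::steady_clock::now() + timeout;
  while (true) {
    if (attempt()) {
      return true;
    }
    if (std::chrono::steady_clock::now() + pause >= deadline) {
      return false;
    }
    std::this_thread::sleep_for(pause);
  }
}
```

The idea is that a request dropped due to backpressure becomes a failed 
attempt that is retried, rather than an immediate failure.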



--
To view, visit http://gerrit.cloudera.org:8080/11107
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic98684dbe55049bbc411513faa0b6bbaef20f434
Gerrit-Change-Number: 11107
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Will Berkeley <[email protected]>
Gerrit-Comment-Date: Thu, 02 Aug 2018 18:15:39 +0000
Gerrit-HasComments: Yes
