Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11981 )

Change subject: [raft_consensus-itest] fix flake in Test_KUDU_1735
......................................................................


Patch Set 2:

(9 comments)

Thank you for the review!

http://gerrit.cloudera.org:8080/#/c/11981/2/src/kudu/integration-tests/cluster_itest_util.h
File src/kudu/integration-tests/cluster_itest_util.h:

http://gerrit.cloudera.org:8080/#/c/11981/2/src/kudu/integration-tests/cluster_itest_util.h@138
PS2, Line 138: consensus::OpIdType op_id_type = consensus::RECEIVED_OPID
> nit: doc for this?
Done


http://gerrit.cloudera.org:8080/#/c/11981/2/src/kudu/integration-tests/raft_consensus-itest.cc
File src/kudu/integration-tests/raft_consensus-itest.cc:

http://gerrit.cloudera.org:8080/#/c/11981/2/src/kudu/integration-tests/raft_consensus-itest.cc@2468
PS2, Line 2468:   ASSERT_EQ(3, tablet_servers_.size());
> Not a question about this patch specifically, but what's the point of testi
The point is that the 3-4-3 scheme have to work with 3 servers as well: there 
isn't a way currently to enforce users to have more than 3 tablet servers.  
Also, in case of 3 servers the 3-4-3 scheme successfully replaces replicas that 
fail with FAILED_UNRECOVERABLE state.


http://gerrit.cloudera.org:8080/#/c/11981/2/src/kudu/integration-tests/raft_consensus-itest.cc@2477
PS2, Line 2477: // Wait for at least one operation  committed in current term 
by the leader.
              :   // Otherwise, the RemoveServer() below would end up with 
IllegalState error:
              :   // 'Leader has not yet committed an operation in its own 
term'.
> nit: this is helpful insofar as it describes what was failing before, but I
Done


http://gerrit.cloudera.org:8080/#/c/11981/2/src/kudu/integration-tests/raft_consensus-itest.cc@2500
PS2, Line 2500: affected_servers
> nit: "affected" is pretty vague, so it's not as obvious as it could be in d
Done


http://gerrit.cloudera.org:8080/#/c/11981/2/src/kudu/integration-tests/raft_consensus-itest.cc@2512
PS2, Line 2512:
> nit: not the fault of this patch, but could you take this extra space out?
Done


http://gerrit.cloudera.org:8080/#/c/11981/2/src/kudu/integration-tests/raft_consensus-itest.cc@2517
PS2, Line 2517: In case of the 3-4-3 scheme, the catalog manager adds the 
evicted replica
              :     // back as a non-voter, and there might be two possible 
outcomes.
              :     // Particularly, after the tablet's Raft configuration has 
been updated,
              :     // the catalog manager sends DeleteTablet() request to the 
corresponding
              :     // tablet server upon receiving updated tablet report from 
leader replica.
> I think the ordering of these sentences is a bit awkward. I was expecting t
Indeed -- it seems after multiple updates it became messy.  I updated this 
piece as you suggested and added a bit more information which I think brings 
even more clarity.


http://gerrit.cloudera.org:8080/#/c/11981/2/src/kudu/integration-tests/raft_consensus-itest.cc@2522
PS2, Line 2522: 1. If the DeleteTablet() request succeeds before the leader 
replica
              :     //      receives ADD_PEER configuration follow-up update 
initiated by the
              :     //      catalog manager and re-discovers the evicted 
replica as a new peer
              :     //      with the LMP_MISMATCH status, the configuration 
update to add
              :     //      a new non-voter replica will stay pending since 
only one voter
              :     //      replica (the leader replica itself) is alive.
> Is this LMP_MISMATCH information relevant to the outcome itself? Maybe this
Right, the LMP_MISMATCH is not relevant in this case.  The comment has been 
updated, thanks!


http://gerrit.cloudera.org:8080/#/c/11981/2/src/kudu/integration-tests/raft_consensus-itest.cc@2536
PS2, Line 2536: MonoDelta::FromSeconds(5)
> Why not kTimeout?
Because it's not necessary to wait for that long: the operations are being 
committed pretty soon, and in case of the 'outcome 2' there would be 5 extra 
seconds if using 10 seconds instead of 5 here.


http://gerrit.cloudera.org:8080/#/c/11981/2/src/kudu/integration-tests/raft_consensus-itest.cc@2549
PS2, Line 2549: decltype(leader_tserver) srvs[] =
> Is it possible to declare this as vector<decltype(leader_tserver)>? If so,
Done



--
To view, visit http://gerrit.cloudera.org:8080/11981
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If44ad0e8363a3aead409484cff68843f1e5d6b6d
Gerrit-Change-Number: 11981
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mpe...@apache.org>
Gerrit-Comment-Date: Mon, 26 Nov 2018 05:02:34 +0000
Gerrit-HasComments: Yes

Reply via email to