Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15012 )

Change subject: KUDU-3011 p5: transfer leadership when quiescing
......................................................................


Patch Set 4:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/15012/3/src/kudu/consensus/consensus_peers-test.cc
File src/kudu/consensus/consensus_peers-test.cc:

http://gerrit.cloudera.org:8080/#/c/15012/3/src/kudu/consensus/consensus_peers-test.cc@106
PS3, Line 106: /*server_quiescing*/nullptr,
> Nit: inline comment to annotate this?
Done


http://gerrit.cloudera.org:8080/#/c/15012/3/src/kudu/consensus/consensus_queue-test.cc
File src/kudu/consensus/consensus_queue-test.cc:

http://gerrit.cloudera.org:8080/#/c/15012/3/src/kudu/consensus/consensus_queue-test.cc@76
PS3, Line 76: 
> Would it be useful to add a targeted "unit" test here to check the conditio
Not sure how useful it actually is, but done.


http://gerrit.cloudera.org:8080/#/c/15012/3/src/kudu/consensus/consensus_queue-test.cc@112
PS3, Line 112: metric_entity_,
> Same.
Done


http://gerrit.cloudera.org:8080/#/c/15012/3/src/kudu/consensus/consensus_queue.h
File src/kudu/consensus/consensus_queue.h:

http://gerrit.cloudera.org:8080/#/c/15012/3/src/kudu/consensus/consensus_queue.h@552
PS3, Line 552: const std::atomic<bool>* server_quie
> nit: can this be a const pointer?
Done


http://gerrit.cloudera.org:8080/#/c/15012/3/src/kudu/integration-tests/tablet_server_quiescing-itest.cc
File src/kudu/integration-tests/tablet_server_quiescing-itest.cc:

http://gerrit.cloudera.org:8080/#/c/15012/3/src/kudu/integration-tests/tablet_server_quiescing-itest.cc@314
PS3, Line 314: const auto& ts_uuid = ts_and_details.first;
             : if (ts_uuid != leader_uuid) {
             :   const auto* ts_details = ts_and_details.second;
             :   ASSERT_OK(DeleteTablet(ts_details, tablet_id,
             :                          tablet::TabletDataState::TABLET_DATA_TOMBSTONED,
             :                          kTimeout));
             :   ASSERT_EVENTUALLY([&] {
             :     vector<string> running_tablets;
             :     ASSERT_OK(ListRunningTabletIds(ts_details, kTimeout, &running_tablets));
             :     ASSERT_EQ(0, running_tablets.size());
             :   });
             : }
> Is this stable enough to avoid situations when 2 out of 3 replicas are remo
Right, tombstoned voting helps us out here so we can get a majority even though we have two replicas "down".


http://gerrit.cloudera.org:8080/#/c/15012/3/src/kudu/integration-tests/tablet_server_quiescing-itest.cc@346
PS3, Line 346: ASSERT_EVENTUALLY([&] {
             :   ASSERT_EQ(0, leader_ts->server()->num_raft_leaders()->value());
             :   TServerDetails* new_leader_details;
             :   ASSERT_OK(FindTabletLeader(ts_map_, tablet_id, kTimeout, &new_leader_details));
             :   ASSERT_NE(leader_uuid, new_leader_details->uuid()
> Why a sleep followed by a hard ASSERT? Why not an ASSERT_EVENTUALLY for the
Done


http://gerrit.cloudera.org:8080/#/c/15012/3/src/kudu/integration-tests/tablet_server_quiescing-itest.cc@512
PS3, Line 512: TServerDetails* leader_details;
> How do we know that all 3 tablet servers are not in some sort of transition
We chatted about this a bit, though I think this comment was left with some misunderstanding of how quiescing works. We don't step down abruptly, so there is no transitional period or anything.


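For context on the Line 346 change above (a sleep followed by a hard ASSERT, replaced with ASSERT_EVENTUALLY), here is a rough, self-contained sketch of the retry-until-deadline idea. It is not Kudu's ASSERT_EVENTUALLY macro and does not use Kudu's test utilities; `EventuallyTrue`, `LeadersDrainAfterPolls`, and the timeouts are hypothetical names and values chosen only for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper (an assumption, not Kudu's macro): polls `condition`
// until it returns true or `timeout` elapses. Returns whether the condition
// was ever observed to be true. The condition is always checked at least once.
bool EventuallyTrue(const std::function<bool()>& condition,
                    std::chrono::milliseconds timeout,
                    std::chrono::milliseconds poll_interval) {
  assert(poll_interval.count() > 0);
  const auto deadline = std::chrono::steady_clock::now() + timeout;
  while (true) {
    if (condition()) return true;
    if (std::chrono::steady_clock::now() >= deadline) return false;
    std::this_thread::sleep_for(poll_interval);
  }
}

// Stand-in for "the leader count reaches zero a few polls after quiescing
// starts". The first `polls_until_drained` checks fail; the next one passes.
bool LeadersDrainAfterPolls(int polls_until_drained) {
  int polls = 0;
  return EventuallyTrue(
      [&] { return ++polls > polls_until_drained; },
      std::chrono::milliseconds(2000), std::chrono::milliseconds(5));
}
```

Compared with a fixed sleep and a single check, the polling form returns as soon as the condition holds and only fails once the whole timeout has passed, which tends to make a test both faster and less flaky.
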
-- 
To view, visit http://gerrit.cloudera.org:8080/15012
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idbf0716f5c9455f83ff5f6f601b0f5042f77d078
Gerrit-Change-Number: 15012
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Wong <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Wed, 15 Jan 2020 05:02:11 +0000
Gerrit-HasComments: Yes
