Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16830 )
Change subject: [master][consensus] Procedure for copying system catalog
......................................................................

Patch Set 4:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/16830/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16830/4//COMMIT_MSG@28
PS4, Line 28: One functional change
nit: it seems this patch should focus more on this aspect, rather than being touted as a test change, since this won't be a test-only thing if we expect to use the new flag in a multi-master recovery. Or is the idea to get the test working and then iron out the approach for status messages?

http://gerrit.cloudera.org:8080/#/c/16830/4/src/kudu/consensus/consensus_queue.cc
File src/kudu/consensus/consensus_queue.cc:

http://gerrit.cloudera.org:8080/#/c/16830/4/src/kudu/consensus/consensus_queue.cc@74
PS4, Line 74:
: DEFINE_bool(consensus_allow_status_msg_for_failed_peer, false,
:     "Allows status only Raft messages to be sent to a peer in FAILED_UNRECOVERABLE state.");
: TAG_FLAG(consensus_allow_status_msg_for_failed_peer, hidden);
: TAG_FLAG(consensus_allow_status_msg_for_failed_peer, unsafe);
Do we actually intend to use this flag for our documented procedure to recover masters? If so, it shouldn't be unsafe. Also mark it runtime? Alternatively, would it make sense to pass some option to the Raft implementation from the master that enables this?

http://gerrit.cloudera.org:8080/#/c/16830/4/src/kudu/master/dynamic_multi_master-test.cc
File src/kudu/master/dynamic_multi_master-test.cc:

http://gerrit.cloudera.org:8080/#/c/16830/4/src/kudu/master/dynamic_multi_master-test.cc@139
PS4, Line 139: void StartClusterWithSysCatalogGCed(vector<HostPort>* master_hps,
I seem to recall discussing the need to GC in order to have the replica _not_ be catch-up-able through the WALs. But I don't recall why not just use an internal mini cluster and call the GC functions directly?
I suppose there might be issues with --master_support_change_config, but it isn't an obvious blocker to using an IMC? I don't feel strongly enough about it to request a rewrite, but I'm curious what the rationale is, since it seems like it might simplify the tests.

http://gerrit.cloudera.org:8080/#/c/16830/4/src/kudu/master/dynamic_multi_master-test.cc@154
PS4, Line 154: CHECK_OK
nit: ASSERT_OK?

http://gerrit.cloudera.org:8080/#/c/16830/4/src/kudu/master/dynamic_multi_master-test.cc@171
PS4, Line 171: int64_t time_left_to_sleep_msecs = 2000;
nit: more idiomatic to use a MonoTime as a deadline and compare against MonoTime::Now()

http://gerrit.cloudera.org:8080/#/c/16830/4/src/kudu/master/dynamic_multi_master-test.cc@177
PS4, Line 177: ASSERT_GT(time_left_to_sleep_msecs, 0)
:     << "Timed out waiting for system catalog WAL to be GC'ed";
nit: how about asserting on the GC count instead? As long as we GC'ed, it's not a big deal if we just missed the timeout.

http://gerrit.cloudera.org:8080/#/c/16830/4/src/kudu/master/dynamic_multi_master-test.cc@343
PS4, Line 343: ASSERT_EVENTUALLY([&] {
nit: I typically find it cleaner to have callers do the ASSERT_EVENTUALLY() and have test methods like this be single-try functions. We could even reuse the single-try variant if we do have stronger guarantees. As is, it imposes assumptions about how the method will be used (i.e. that it's likely going to take multiple tries).

http://gerrit.cloudera.org:8080/#/c/16830/4/src/kudu/master/dynamic_multi_master-test.cc@416
PS4, Line 416: LOG(INFO) << "Verifying the first table";
: // Verify the table created before adding the new master.
: shared_ptr<KuduTable> table;
: ASSERT_OK(client->OpenTable(kTableName, &table));
: KuduScanner scanner1(table.get());
: size_t row_count = 0;
: ASSERT_OK(client::CountRowsWithRetries(&scanner1, &row_count));
: ASSERT_EQ(0, row_count);
:
: LOG(INFO) << "Creating and verifying the second table";
: // Perform an operation that requires replication to masters.
: ASSERT_OK(CreateTable(&migrated_cluster, "second_table"));
: ASSERT_OK(client->OpenTable("second_table", &table));
: KuduScanner scanner2(table.get());
: ASSERT_OK(client::CountRowsWithRetries(&scanner2, &row_count));
: ASSERT_EQ(0, row_count);
We're still not necessarily testing that the new master is caught up with the rest of the masters here, since the OpenTable() and GetTableLocations() requests go through whichever master is elected leader, which may or may not be the new master.

Another thought is to use a ClusterVerifier, which runs ksck and runs some scans, which seems like a superset of what's done here.

http://gerrit.cloudera.org:8080/#/c/16830/4/src/kudu/master/dynamic_multi_master-test.cc@433
PS4, Line 433: LOG(INFO) << "Pausing and resuming individual masters";
nit: move this into the if block?

http://gerrit.cloudera.org:8080/#/c/16830/4/src/kudu/master/dynamic_multi_master-test.cc@588
PS4, Line 588:
: SKIP_IF_SLOW_NOT_ALLOWED();
:
: vector<HostPort> master_hps;
: NO_FATALS(StartClusterWithSysCatalogGCed(&master_hps,
:     {"--consensus_allow_status_msg_for_failed_peer"}));
: ASSERT_OK(CreateTable(cluster_.get(), kTableName));
: // Bring up the new master and add to the cluster.
: master_hps.emplace_back(reserved_hp_);
: NO_FATALS(StartNewMaster(master_hps));
: ASSERT_OK(AddMasterToCluster(reserved_hp_));
:
: // Newly added master will be added to the master Raft config but won't be caught up
: // from the WAL and hence remain as a NON_VOTER.
: ListMastersResponsePB list_resp;
: NO_FATALS(RunListMasters(&list_resp));
: ASSERT_EQ(orig_num_masters_ + 1, list_resp.masters_size());
:
: for (const auto& master : list_resp.masters()) {
:   ASSERT_EQ(1, master.registration().rpc_addresses_size());
:   auto hp = HostPortFromPB(master.registration().rpc_addresses(0));
:   if (hp == reserved_hp_) {
:     ASSERT_EQ(consensus::RaftPeerPB::NON_VOTER, master.member_type());
:     ASSERT_TRUE(master.role() == consensus::RaftPeerPB::LEARNER);
:   }
: }
:
: // Double check by directly contacting the new master.
: NO_FATALS(VerifyNewMasterDirectly({ consensus::RaftPeerPB::LEARNER },
:     consensus::RaftPeerPB::NON_VOTER));
:
: // Verify new master is in FAILED_UNRECOVERABLE state.
: NO_FATALS(VerifyNewMasterInFailedUnrecoverableState());
If this is for the most part a copy of TestAddMasterCatchupFromWALNotPossible, why not just have this test supersede it? We could even set --consensus_allow_status_msg_for_failed_peer at runtime if the idea is to test the --consensus_allow_status_msg_for_failed_peer=false case.

--
To view, visit http://gerrit.cloudera.org:8080/16830
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I142c1dec442ec72c38c5be9d62cdf270e441d6e3
Gerrit-Change-Number: 16830
Gerrit-PatchSet: 4
Gerrit-Owner: Bankim Bhavsar <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Sat, 12 Dec 2020 01:28:35 +0000
Gerrit-HasComments: Yes
