YangSong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14654 )
Change subject: KUDU-2975: Spread WAL across multiple directories ...................................................................... Patch Set 22: (23 comments) http://gerrit.cloudera.org:8080/#/c/14654/22/src/kudu/fs/fs_manager.cc File src/kudu/fs/fs_manager.cc: http://gerrit.cloudera.org:8080/#/c/14654/22/src/kudu/fs/fs_manager.cc@84 PS22, Line 84: : DEFINE_string(fs_wal_dir, "", : "Directory with write-ahead logs. If this is not specified, the " : "program will not start. May be the same as fs_data_dirs"); : TAG_FLAG(fs_wal_dir, stable); : DEFINE_string(fs_wal_dirs, "", : "Directory with write-ahead logs. If this is not specified, the " : "program will not start. May be the same as fs_data_dirs"); : TAG_FLAG(fs_wal_dirs, stable); > Can you add a group flag validator like ValidateUpdateTabletStatsInterval i I don't know how to define this place. I try to define it as: bool ValidateFsWalDir() { bool has_a = !FLAGS_fs_wal_dir.empty(); bool has_b = !FLAGS_fs_wal_dirs.empty(); if (has_a == has_b) { LOG(ERROR) << "--fs_wal_dir and --fs_wal_dirs only one can be set."; return false; } return true; } GROUP_FLAG_VALIDATOR(fs_wal_dir, ValidateFsWalDir); but I found It's not the concept "group", they are mutually exclusive. If I define like this, the many tests cannot run, such as fs_manager-test. http://gerrit.cloudera.org:8080/#/c/14654/22/src/kudu/fs/fs_manager.cc@735 PS22, Line 735: Status FsManager::GetOrphanedWalDir(const string& tablet_id, string* wal_dir) { : if (wal_dir == nullptr) { : return Status::InvalidArgument("wal_dir cannot be nullptr"); : } : Status s = wd_manager_->FindTabletDirFromDisk(tablet_id, "", wal_dir); : if (!s.ok()) { : LOG(WARNING) << "cannot find tablet:" << tablet_id << " wal directory." << : s.message().ToString(); : } : return s; : } : : Status FsManager::GetOrphanedWalRecoveryDir(const string& tablet_id, : string* wal_recovery_dir) { : if (wal_recovery_dir == nullptr) { : return Status::InvalidArgument("wal_recovery_dir cannot be nullptr"); : } : Status s = wd_manager_->FindTabletDirFromDisk(tablet_id, kWalsRecoveryDirSuffix, : wal_recovery_dir); : if (!s.ok()) { : LOG(WARNING) << "cannot find tablet:" << tablet_id << " wal directory." << : s.message().ToString(); : } : return s; : } : > As written, I think these mean that: Yes, because we delete metadata first then delete log when delete tablet, so maybe have some orphaned WAL dir. We should find this orphaned WAL dir then delete them at the next time it starts. When we determine if there is tablet WAL data, we look at the tablet's metadata first then look it up on the WAL dirs. http://gerrit.cloudera.org:8080/#/c/14654/22/src/kudu/fs/wal_dirs.cc File src/kudu/fs/wal_dirs.cc: http://gerrit.cloudera.org:8080/#/c/14654/22/src/kudu/fs/wal_dirs.cc@75 PS22, Line 75: DEFINE_bool(fs_wal_dirs_consider_available_space, true, : "Whether to consider available space when selecting a wal " : "directory during tablet or wal block creation."); : TAG_FLAG(fs_wal_dirs_consider_available_space, runtime); : TAG_FLAG(fs_wal_dirs_consider_available_space, evolving); > This isn't used, is it? Yes. http://gerrit.cloudera.org:8080/#/c/14654/22/src/kudu/fs/wal_dirs.cc@605 PS22, Line 605: r *wal_ > nit: move the * so it's next to the type name, i.e. OK http://gerrit.cloudera.org:8080/#/c/14654/22/src/kudu/fs/wal_dirs.cc@614 PS22, Line 614: if (other != nullptr) { > nit: this can be written as: OK http://gerrit.cloudera.org:8080/#/c/14654/22/src/kudu/fs/wal_dirs.cc@616 PS22, Line 616: "tried to load directory for tablet $0 but one is already registered", : tablet_id)); > nit: maybe also log the existing directory? OK http://gerrit.cloudera.org:8080/#/c/14654/22/src/kudu/fs/wal_dirs.cc@664 PS22, Line 664: postfix > nit: we usually call this a "suffix" OK http://gerrit.cloudera.org:8080/#/c/14654/22/src/kudu/fs/wal_dirs.cc@675 PS22, Line 675: if (!postfix.empty()) { : tablet_path.append(postfix); : } > We don't need to condition on postfix.empty(). If it's empty, tablet_path.a OK http://gerrit.cloudera.org:8080/#/c/14654/22/src/kudu/fs/wal_dirs.cc@710 PS22, Line 710: if (tablets_by_uuid.second.size() < min_tablets_num) { : min_tablets_num = tablets_by_uuid.second.size(); : target_uuid = tablets_by_uuid.first; : } > The fact that we're always using the least-populated directory is different Yes, The current algorithm is relatively simple, just make sure that each WAL directory has the same number of tablets. http://gerrit.cloudera.org:8080/#/c/14654/22/src/kudu/fs/wal_dirs.cc@808 PS22, Line 808: : bool WalDirManager::IsWalDirFailed(const string& uuid) const { : shared_lock<rw_spinlock> lock(dir_lock_.get_lock()); : return ContainsKey(failed_wal_dirs_, uuid); : } > This isn't used, right? Same with MarkWalDirFailedByUuid yes.MarkWalDirFailedByUuid is not used right now, may be used in the future in case of disk failure. http://gerrit.cloudera.org:8080/#/c/14654/22/src/kudu/tablet/tablet_metadata-test.cc File src/kudu/tablet/tablet_metadata-test.cc: http://gerrit.cloudera.org:8080/#/c/14654/22/src/kudu/tablet/tablet_metadata-test.cc@47 PS22, Line 47: class TestTabletMetadata : public KuduTabletTest { : public: : TestTabletMetadata() : : KuduTabletTest(GetSimpleTestSchema()) { : } : : virtual void SetUp() OVERRIDE { : KuduTabletTest::SetUp(); : writer_.reset(new LocalTabletWriter(harness_->tablet().get(), : &client_schema_)); : } : : void BuildPartialRow(int key, int intval, const char* strval, : gscoped_ptr<KuduPartialRow>* row); : : protected: : gscoped_ptr<LocalTabletWriter> writer_; : }; > TODO(awong): can we also add a test that, if we load a superblock that _doe OK http://gerrit.cloudera.org:8080/#/c/14654/22/src/kudu/tablet/tablet_metadata.cc File src/kudu/tablet/tablet_metadata.cc: http://gerrit.cloudera.org:8080/#/c/14654/22/src/kudu/tablet/tablet_metadata.cc@463 PS22, Line 463: // An error loading the WAL dir is non-fatal, it just means the : // tablet will fail to bootstrap later. > TODO(awong): make sure this is true I will add some test. http://gerrit.cloudera.org:8080/#/c/14654/22/src/kudu/tools/kudu-tool-test.cc File src/kudu/tools/kudu-tool-test.cc: http://gerrit.cloudera.org:8080/#/c/14654/22/src/kudu/tools/kudu-tool-test.cc@5219 PS22, Line 5219: // master use "fs_wal_dir", tserver use "fs_wal_dirs". > Is there a good reason to do this here? I know that in a real deployment, w In the ExternalMiniCluster, master keep using "fs_wal_dir", because master has only one tablet, using "fs_wal_dirs" is pointless. And tserver change to use "fs_wal_dirs". I will change the ExternalMiniCluster to always use fs_wal_dirs. http://gerrit.cloudera.org:8080/#/c/14654/22/src/kudu/tools/tool_action_fs.cc File src/kudu/tools/tool_action_fs.cc: http://gerrit.cloudera.org:8080/#/c/14654/22/src/kudu/tools/tool_action_fs.cc@342 PS22, Line 342: // data dir group for the tablet, and the staged directory config won't : // affect this tablet. : DCHECK(s1.IsNotFound() || s2.IsNotFound()) << s1.ToString(); : continue; : } : RETURN_NOT_OK_PREPEND(s1, "at least one tablet is configured to use removed data directory. " : "Retry with --force to override this"); : RETURN_NOT_OK_PREPEND(s2, "at least one tablet is configured to use removed WAL directory. " : "Retry with --force to override this"); : } > TODO(awong): is this tested I tested it by hand. I will add some test code. http://gerrit.cloudera.org:8080/#/c/14654/22/src/kudu/tools/tool_action_local_replica.cc File src/kudu/tools/tool_action_local_replica.cc: http://gerrit.cloudera.org:8080/#/c/14654/22/src/kudu/tools/tool_action_local_replica.cc@367 PS22, Line 367: // For new version, need load metadata, find directory from metadata. But we just find from : // wal directory first. : RETURN_NOT_OK(fs_manager.wd_manager()->FindAndRegisterWalDir(tablet_id)); > Doesn't this already happen in TabletMetadata::Load()? Yes. http://gerrit.cloudera.org:8080/#/c/14654/22/src/kudu/tools/tool_action_local_replica.cc@524 PS22, Line 524: // For new version, need load metadata, find directory from metadata. : // But we just find the wal dir for tablet from disk first. : RETURN_NOT_OK(fs_manager->wd_manager()->FindAndRegisterWalDir(tablet_id)); > Should we call TabletMetadata::Load() instead? Yes. http://gerrit.cloudera.org:8080/#/c/14654/22/src/kudu/tserver/mini_tablet_server-test.cc File src/kudu/tserver/mini_tablet_server-test.cc: http://gerrit.cloudera.org:8080/#/c/14654/22/src/kudu/tserver/mini_tablet_server-test.cc@41 PS22, Line 41: : TEST_F(MiniTabletServerTest, TestMultiDataDirServer) { : // Specifying the number of data directories will create subdirectories under the test root. : unique_ptr<MiniTabletServer> mini_server; : FsManager* fs_manager; : : int kNumDataDirs = 3; : mini_server.reset(new MiniTabletServer(GetTestPath("TServer"), : HostPort("127.0.0.1", 0), kNumDataDirs)); : ASSERT_OK(mini_server->Start()); : fs_manager = mini_server->server()->fs_manager(); : ASSERT_EQ(1, fs_manager->GetWalRootDirs().size()); : ASSERT_STR_CONTAINS(DirName(fs_manager->GetWalRootDirs()[0]), "wal"); : ASSERT_EQ(kNumDataDirs, fs_manager->GetDataRootDirs().size()); : } : : TEST_F(MiniTabletServerTest, TestMultiWalDirServer) { : // Specifying the number of data directories will create subdirectories under the test root. : unique_ptr<MiniTabletServer> mini_server; : FsManager* fs_manager; : : int kNumWalDirs = 3; : mini_server.reset(new MiniTabletServer(GetTestPath("TServer"), : HostPort("127.0.0.1", 0), 1, kNumWalDirs)); : ASSERT_OK(mini_server->Start()); : fs_manager = mini_server->server()->fs_manager(); : ASSERT_EQ(1, fs_manager->GetDataRootDirs().size()); : ASSERT_STR_CONTAINS(DirName(fs_manager->GetDataRootDirs()[0]), "data"); : ASSERT_EQ(kNumWalDirs, fs_manager->GetWalRootDirs().size()); : } > nit: maybe combine these tests so we have a single test that has multiple W OK http://gerrit.cloudera.org:8080/#/c/14654/22/src/kudu/tserver/mini_tablet_server.cc File src/kudu/tserver/mini_tablet_server.cc: http://gerrit.cloudera.org:8080/#/c/14654/22/src/kudu/tserver/mini_tablet_server.cc@76 PS22, Line 76: // This place needs to be consistent with the original logic. : // Otherwise, some test cases will not pass. : // If there just one wal dir and data dir, we just use fs_root. > nit: Thanks for the note! This comment should be removed before we merge th OK http://gerrit.cloudera.org:8080/#/c/14654/22/src/kudu/tserver/tablet_copy_client-test.cc File src/kudu/tserver/tablet_copy_client-test.cc: http://gerrit.cloudera.org:8080/#/c/14654/22/src/kudu/tserver/tablet_copy_client-test.cc@507 PS22, Line 507: fs_manager_->wd_manager()->CreateWalDir(GetTabletId()); > Why do we need this? Shouldn't this have already happened when we setup the We need prepare a WAL dir for tablet. when we setup the test, we just init the fs_manager. this is done in StartCopy(). This case has not used "StartCopy()", so we need do it here. http://gerrit.cloudera.org:8080/#/c/14654/22/src/kudu/tserver/tablet_server-stress-test.cc File src/kudu/tserver/tablet_server-stress-test.cc: PS22: > Instead of copying the existing test, how about making this a parameterized OK http://gerrit.cloudera.org:8080/#/c/14654/22/src/kudu/tserver/tablet_server-test.cc File src/kudu/tserver/tablet_server-test.cc: http://gerrit.cloudera.org:8080/#/c/14654/22/src/kudu/tserver/tablet_server-test.cc@206 PS22, Line 206: /*num_data_dirs=*/1, /*num_wal_dirs=*/3 > Why not leave it using the default of 1 data directory and 1 WAL directory? Here I just want to test multiple WAL dirs. I can change back if it's not necessary. http://gerrit.cloudera.org:8080/#/c/14654/22/src/kudu/tserver/tablet_server-test.cc@3735 PS22, Line 3735: string tablet_meta_path = JoinPathSegments(JoinPathSegments( : GetTestPath("TabletServerTest-fsroot"), "wal-0"), "tablet-meta"); > I'm kind of surprised by this. Why did the metadata directory change? Because there are 3 WAL directories, in function MiniTabletServer::MiniTabletServer(), we named the wal directory name as "wal-0" "wal-1" "wal-2", in fs_manager, the metadata directory use the first wal directory, so here the metadata directory name change to "wal-0". The original code had only one wal directory and one data directory, so it use "TabletServerTest-fsroot" as the wal and metadata directory. http://gerrit.cloudera.org:8080/#/c/14654/22/src/kudu/tserver/ts_tablet_manager.cc File src/kudu/tserver/ts_tablet_manager.cc: http://gerrit.cloudera.org:8080/#/c/14654/22/src/kudu/tserver/ts_tablet_manager.cc@1460 PS22, Line 1460: meta->fs_manager()->GetTabletWalDir(tablet_id, &wal_dir); : meta->fs_manager()->GetTabletWalRecoveryDir(tablet_id, &wal_recovery_dir); > Should we WARN_NOT_OK here? Or maybe we should check to look if the status yes, we should WARN_NOT_OK. there is a case, if metadata deletion succeed but log deletion failed, we cannot get the tablet's dir, but Log::DeleteOnDiskData() also can delete it by finding the tablet's dir from disk. So I think here just need a warning. -- To view, visit http://gerrit.cloudera.org:8080/14654 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ied496804421d91ff1fa63d49979fde971071506e Gerrit-Change-Number: 14654 Gerrit-PatchSet: 22 Gerrit-Owner: YangSong <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: YangSong <[email protected]> Gerrit-Comment-Date: Thu, 12 Dec 2019 03:20:45 +0000 Gerrit-HasComments: Yes
