YangSong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14654 )

Change subject: KUDU-2975: Spread WAL across multiple directories
......................................................................


Patch Set 13:

(14 comments)

http://gerrit.cloudera.org:8080/#/c/14654/13/src/kudu/fs/fs.proto
File src/kudu/fs/fs.proto:

http://gerrit.cloudera.org:8080/#/c/14654/13/src/kudu/fs/fs.proto@58
PS13, Line 58: // A filesystem instance can contain multiple paths. One of 
these structures
             : // is persisted in each path when the filesystem instance is 
created.
             : message PathInstanceMetadataPB {
             :   // Describes this path instance as well as all of the other 
path instances
             :   // that, taken together, describe a complete set.
             :   required PathSetPB path_set = 1;
             :
             :   // Textual representation of the block manager that formatted 
this path.
             :   required string block_manager_type = 2;
             :
             :   // Block size on the filesystem where this instance was 
created. If the
             :   // instance (and its data) are ever copied to another 
location, the block
             :   // size in that location must be the same.
             :   required uint64 filesystem_block_size_bytes = 3;
             : }
> I'm not convinced we have to reuse this either, especially if we end up not
If we don't neet to do the consistency check for WAL directories, we don't need 
to record the UUIDs of all WAL directories. If the "wal_manager_instance" file 
is not accessible, we can't know the UUID of this directory any more. We must 
make a new UUID for it. Maybe we just need to restore the file with the 
original UUID.


http://gerrit.cloudera.org:8080/#/c/14654/13/src/kudu/fs/fs_manager.h
File src/kudu/fs/fs_manager.h:

http://gerrit.cloudera.org:8080/#/c/14654/13/src/kudu/fs/fs_manager.h@98
PS13, Line 98:   // The directory root where WALs will be stored. Cannot be 
empty.
             :   std::vector<std::string> wal_roots;
> Rather than keeping track of two sets of WAL roots (wal_root _and_ wal_root
Yes, we can put them in "wal_roots". Just make sure that only one gflag is used.


http://gerrit.cloudera.org:8080/#/c/14654/13/src/kudu/fs/fs_manager.cc
File src/kudu/fs/fs_manager.cc:

http://gerrit.cloudera.org:8080/#/c/14654/13/src/kudu/fs/fs_manager.cc@195
PS13, Line 195:   if (!opts_.wal_root.empty() && !opts_.wal_roots.empty()) {
              :     return Status::IOError("Write-ahead log directory 
(fs_wal_dir and fs_wal_dirs)"
              :         " provided conflictly");
              :   }
> nit: I think this would be nicer as a gflag validator.
I don't know how to use gflag validator. Here use "FLAGS_fs_wal_dir" and 
"FLAGS_fs_wal_dirs" instead of "opts_.wal_root" and "opts_.wal_roots"?


http://gerrit.cloudera.org:8080/#/c/14654/13/src/kudu/fs/fs_manager.cc@392
PS13, Line 392:   if (opts_.consistency_check != 
ConsistencyCheckBehavior::UPDATE_ON_DISK) {
> Why this check?
If we use "fs_wal_dirs" instead of "fs_wal_dir", maybe we add a WAL dir, the 
new dir has not a subdir named "wals".  The following code will report an error 
when we do "kudu fs update_dirs".


http://gerrit.cloudera.org:8080/#/c/14654/13/src/kudu/fs/fs_manager.cc@474
PS13, Line 474:     report->wal_dir = canonicalized_wal_fs_roots_[0].path;
> We should probably allow the FsReport to accept multiple WAL directories to
Yes


http://gerrit.cloudera.org:8080/#/c/14654/13/src/kudu/fs/fs_manager.cc@534
PS13, Line 534:   vector<string> ancillary_dirs = JoinPathSegmentsV(
              :       WalDirManager::GetRootNames(canonicalized_wal_fs_roots_), 
kWalDirName);
> How about let's have the WalDirManager manage the creation of WAL directori
Yes, it's required.


http://gerrit.cloudera.org:8080/#/c/14654/13/src/kudu/fs/fs_manager.cc@735
PS13, Line 735:
              : string FsManager::GetTabletWalDir(const std::string& tablet_id) 
const {
              :   string dir;
              :   Status s = wd_manager_->FindWalDirByTabletId(tablet_id, &dir);
              :   if (!s.ok()) {
              :     LOG(WARNING) << "cannot find tablet:" << tablet_id << " wal 
directory." <<
              :                  s.message().ToString();
              :     return "";
              :   }
              :   return JoinPathSegments(dir, tablet_id);
              : }
> I think we'll eventually want to have this return a Status and have a strin
Yes


http://gerrit.cloudera.org:8080/#/c/14654/13/src/kudu/fs/wal_dirs.cc
File src/kudu/fs/wal_dirs.cc:

http://gerrit.cloudera.org:8080/#/c/14654/13/src/kudu/fs/wal_dirs.cc@436
PS13, Line 436: consistency_check
> I'm not convinced that we need the consistency checking behavior for WALs.
this is for "kudu fs update_dirs" tool.


http://gerrit.cloudera.org:8080/#/c/14654/13/src/kudu/fs/wal_dirs.cc@605
PS13, Line 605:   std::lock_guard<percpu_rwlock> write_lock(dir_lock_);
> We only need to hold the lock in read mode for ContainsKey(), and we don't
OK


http://gerrit.cloudera.org:8080/#/c/14654/13/src/kudu/tablet/tablet_bootstrap-test.cc
File src/kudu/tablet/tablet_bootstrap-test.cc:

http://gerrit.cloudera.org:8080/#/c/14654/13/src/kudu/tablet/tablet_bootstrap-test.cc@121
PS13, Line 121:     fs_manager_->wd_manager()->DeleteWalDir(log::kTestTablet);
> Why did we have to do this?
This class is based on "LogTestBase", in "LogTestBase" SetUp() function, I add 
a line code like:
ASSERT_OK(fs_manager_->wd_manager()->CreateWalDir(kTestTablet));
so here I first delete the tablet's WAL dir message in WalDirManager. Let it 
created or loaded by "TabletMetadata::LoadOrCreate()".


http://gerrit.cloudera.org:8080/#/c/14654/13/src/kudu/tools/tool_action_local_replica.cc
File src/kudu/tools/tool_action_local_replica.cc:

http://gerrit.cloudera.org:8080/#/c/14654/13/src/kudu/tools/tool_action_local_replica.cc@524
PS13, Line 524: CreateWalDir
> Aren't we operating on an existing tablet replica? Why are we creating a br
Yes, There's a problem here. I should load the tablet's metadata, and get the 
WAL dir from metedata, if cannot found, I should use "TryCreateWalDir". I will 
modify it later. Above is the same.


http://gerrit.cloudera.org:8080/#/c/14654/13/src/kudu/tserver/tablet_copy_client-test.cc
File src/kudu/tserver/tablet_copy_client-test.cc:

http://gerrit.cloudera.org:8080/#/c/14654/13/src/kudu/tserver/tablet_copy_client-test.cc@298
PS13, Line 298:   fs_manager_->wd_manager()->CreateWalDir(GetTabletId());
> Why are we creating a WAL directory from scratch here? Shouldn't the WAL di
Yes, I will modify it later.


http://gerrit.cloudera.org:8080/#/c/14654/13/src/kudu/tserver/tablet_server-test.cc
File src/kudu/tserver/tablet_server-test.cc:

http://gerrit.cloudera.org:8080/#/c/14654/13/src/kudu/tserver/tablet_server-test.cc@2453
PS13, Line 2453:     const int kLimit = rand() % (kMaxLimit + 1) + 1;
> This isn't related to this patch, right? If not, could you put it in a sepa
Yes, not related to this patch.


http://gerrit.cloudera.org:8080/#/c/14654/13/src/kudu/util/env_util.cc
File src/kudu/util/env_util.cc:

http://gerrit.cloudera.org:8080/#/c/14654/13/src/kudu/util/env_util.cc@362
PS13, Line 362: Status ListDirsInDir(Env* env,
              :                      const string& path,
              :                      vector<string>* entries) {
              :   RETURN_NOT_OK(env->GetChildren(path, entries));
              :   auto iter = entries->begin();
              :   while (iter != entries->end()) {
              :     if (*iter == "." || *iter == ".." || iter->find(kTmpInfix) 
!= string::npos) {
              :       iter = entries->erase(iter);
              :       continue;
              :     }
              :     bool is_dir = false;
              :     Status s = env->IsDirectory(JoinPathSegments(path, *iter), 
&is_dir);
              :     if (!s.ok() || !is_dir) {
              :       iter = entries->erase(iter);
              :       continue;
              :     }
              :     ++iter;
              :   }
              :   return Status::OK();
              : }
              :
> We don't need this anymore, right?
It's used in "mini_cluster_fs_inspector.cc". Go through all the files under the 
"wals" directory to get all tablets in old version, but new version we should 
go through the dirs, because we add a "wal_manager_instance" file under wals 
dir.



--
To view, visit http://gerrit.cloudera.org:8080/14654
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ied496804421d91ff1fa63d49979fde971071506e
Gerrit-Change-Number: 14654
Gerrit-PatchSet: 13
Gerrit-Owner: YangSong <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: YangSong <[email protected]>
Gerrit-Comment-Date: Fri, 15 Nov 2019 09:28:43 +0000
Gerrit-HasComments: Yes

Reply via email to