YangSong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14654 )

Change subject: [KUDU-2975]Spread WAL across multiple directories
......................................................................


Patch Set 3:

(2 comments)

This modification is general. So I haven't written test code yet. And there are 
still these problems.
1. during switch "fs_wal_dir" to "fs_wal_dirs", not compatible.
2. WAL disk failure not support.
3. Tool about wal not modify.
4. Lack tests.
I agree that the implementation for WAL manager is much simpler than data 
manager. They do almost same things in initialization, make dir and create 
'block_manager_instance'. This is their biggest commonality. And I find some 
thing wrong in DataDirManager::Create().
Having a separate class for WAL manager is my original idea. But now I'm not 
sure which method is better.

http://gerrit.cloudera.org:8080/#/c/14654/3/src/kudu/tserver/tablet_server-test.cc
File src/kudu/tserver/tablet_server-test.cc:

http://gerrit.cloudera.org:8080/#/c/14654/3/src/kudu/tserver/tablet_server-test.cc@2453
PS3, Line 2453:     const int kLimit = rand() % (kMaxLimit + 1) + 1;
> Why did this change?
Here 'kMaxLimit' may be 0, then an exception reported like "Floating point 
exception".


http://gerrit.cloudera.org:8080/#/c/14654/3/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

http://gerrit.cloudera.org:8080/#/c/14654/3/src/kudu/tserver/ts_tablet_manager.cc@1508
PS3, Line 1508: void TSTabletManager::FailTabletsInDataDir(const string& uuid) {
              :   DataDirManager* dir_manager = nullptr;
              :   int uuid_idx;
              :   if (fs_manager_->dd_manager()->FindUuidIndexByUuid(uuid, 
&uuid_idx)) {
              :     dir_manager = fs_manager_->dd_manager();
              :   } else if 
(fs_manager_->wd_manager()->FindUuidIndexByUuid(uuid, &uuid_idx)) {
              :     dir_manager = fs_manager_->wd_manager();
              :   } else {
              :     LOG(FATAL) << Substitute("No directory found with UUID $0", 
uuid);
              :   }
              :   if (dir_manager->IsDataDirFailed(uuid_idx)) {
              :     LOG(WARNING) << "Data directory is already marked failed.";
              :     return;
              :   }
              :   // Fail the directory to prevent other tablets from being 
placed in it.
              :   dir_manager->MarkDataDirFailed(uuid_idx);
              :   set<string> tablets = 
dir_manager->FindTabletsByDataDirUuidIdx(uuid_idx);
              :   LOG(INFO) << Substitute("Data dir $0 has $1 tablets", uuid, 
tablets.size());
              :   for (const string& tablet_id : 
dir_manager->FindTabletsByDataDirUuidIdx(uuid_idx)) {
              :     FailTabletAndScheduleShutdown(tablet_id);
              :   }
              : }
> Rather than reusing this exactly, I'd prefer we NOT tightly couple the hand
Here I still don't understand how to keep a single FsErrorManager. In the 
initialization we have two FsErrorManager for disk error, the error handle is 
MarkDataDirFailedByUuid. After starting we have just one FsErrorManager, the 
error handle is FailTabletsInDataDir. Here uuid maybe in two managers, WAL 
manager and data manager.



--
To view, visit http://gerrit.cloudera.org:8080/14654
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ied496804421d91ff1fa63d49979fde971071506e
Gerrit-Change-Number: 14654
Gerrit-PatchSet: 3
Gerrit-Owner: YangSong <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: YangSong <[email protected]>
Gerrit-Comment-Date: Fri, 08 Nov 2019 03:54:51 +0000
Gerrit-HasComments: Yes

Reply via email to