Yingchun Lai has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19327 )

Change subject: [Tool] KUDU-3318 Compact container metadata file manually
......................................................................


Patch Set 8:

(21 comments)

http://gerrit.cloudera.org:8080/#/c/19327/8/src/kudu/fs/log_block_manager.h
File src/kudu/fs/log_block_manager.h:

http://gerrit.cloudera.org:8080/#/c/19327/8/src/kudu/fs/log_block_manager.h@215
PS8, Line 215:   static bool ValidateContainerFile(Env* env, const std::string& 
container_path,
Add some comments for this function.


http://gerrit.cloudera.org:8080/#/c/19327/8/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

http://gerrit.cloudera.org:8080/#/c/19327/8/src/kudu/fs/log_block_manager.cc@957
PS8, Line 957: Status LogBlockContainer::ProcessRecordsAndGetOffset(
It's similar to LogBlockContainerNativeMeta::ProcessRecords(), how about reuse 
it?


http://gerrit.cloudera.org:8080/#/c/19327/8/src/kudu/fs/log_block_manager.cc@2789
PS8, Line 2789: total_blocks == 0 ? 1 :
Just skip the compact procedure if compact_metadata is 0.


http://gerrit.cloudera.org:8080/#/c/19327/8/src/kudu/fs/log_block_manager.cc@2791
PS8, Line 2791: FLAGS_log_container_live_metadata_before_compact_ratio
Add this parameter to compact_metadata tool, then --help will print it.


http://gerrit.cloudera.org:8080/#/c/19327/8/src/kudu/fs/log_block_manager.cc@2801
PS8, Line 2801:     RETURN_NOT_OK(GenerateOffsetFile(env, dir, container_name, 
offset));
I guess the .compacted is needed to be deleted if the .offset file generate 
failed, right?


http://gerrit.cloudera.org:8080/#/c/19327/8/src/kudu/fs/log_block_manager.cc@2930
PS8, Line 2930:   const string& backup_metadata_file = StrCat(metadata_file, 
kMetadataBackupFileSuffix);
              :   // Delete exsited metadata backup file.
              :   if (env->FileExists(backup_metadata_file)) {
              :     RETURN_NOT_OK_PREPEND(env->DeleteFile(backup_metadata_file),
              :                           "delete metadatda backup file 
failed");
              :   }
              :
              :   const string& offset_file = 
StrCat(JoinPathSegments(dir->dir(), container_name),
              :                                      kMetadataOffsetFileSuffix);
              :   // Delete exsited offset file.
              :   if (env->FileExists(offset_file)) {
              :     RETURN_NOT_OK_PREPEND(env->DeleteFile(offset_file), "delete 
.offset file failed");
              :   }
              :
              :   const string& compacted_file = 
StrCat(JoinPathSegments(dir->dir(), container_name),
              :                                        
kMetadataCompactedFileSuffix);
              :   // Delete exsited compacted file.
              :   if (env->FileExists(compacted_file)) {
              :     RETURN_NOT_OK_PREPEND(env->DeleteFile(compacted_file), 
"delete .compacted file failed");
              :   }
There are a bit of duplicate, how about adding the suffixes to a list, and then 
iterate the list to delete the files?


http://gerrit.cloudera.org:8080/#/c/19327/8/src/kudu/fs/log_block_manager.cc@2987
PS8, Line 2987: Status LogBlockManager::GenerateCompactedFile(Env* env, Dir* 
dir, const string& container_name,
How about do a small refactor for LogBlockManager::RewriteMetadataFile() and 
reuse it?


http://gerrit.cloudera.org:8080/#/c/19327/8/src/kudu/fs/log_block_manager.cc@3028
PS8, Line 3028: GetContainerNames
There is a similar code piece in LogBlockManager::OpenDataDir(), is it possible 
to reuse it?


http://gerrit.cloudera.org:8080/#/c/19327/8/src/kudu/tools/tool_action_fs.cc
File src/kudu/tools/tool_action_fs.cc:

http://gerrit.cloudera.org:8080/#/c/19327/8/src/kudu/tools/tool_action_fs.cc@151
PS8, Line 151: value == "print" || value == "compact" ||
             :         value == "merge" || value == "clear"
nit: use a std::set/std::unordered_set for fast lookup, and JoinStrings can be 
used to construct message below.


http://gerrit.cloudera.org:8080/#/c/19327/8/src/kudu/tools/tool_action_fs.cc@1036
PS8, Line 1036: 1
Why it's 1 at the beginning? The same to cur_finished_containers_count and 
finished_containers_count.


http://gerrit.cloudera.org:8080/#/c/19327/8/src/kudu/tools/tool_action_fs.cc@1033
PS8, Line 1033: atomic<uint64_t> total_blocks = 0;
              :   atomic<uint64_t> total_live_blocks = 0;
              :   atomic<bool> seen_fatal_error = false;
              :   uint64_t finished_directories_count = 1;
              :   uint64_t total_directories = 
fs_manager.dd_manager()->dirs().size();
nit: Move them closer to the place where use it.


http://gerrit.cloudera.org:8080/#/c/19327/8/src/kudu/tools/tool_action_fs.cc@1040
PS8, Line 1040: unique_ptr<Dir>
nit: auto


http://gerrit.cloudera.org:8080/#/c/19327/8/src/kudu/tools/tool_action_fs.cc@1050
PS8, Line 1050: unique_ptr<Dir>
nit: auto


http://gerrit.cloudera.org:8080/#/c/19327/8/src/kudu/tools/tool_action_fs.cc@1056
PS8, Line 1056:     Status s = LogBlockManager::GetContainerNames(env, 
dd->dir(), &container_names);
The container_names has been obtained in line 1043, how about reuse it?


http://gerrit.cloudera.org:8080/#/c/19327/8/src/kudu/tools/tool_action_fs.cc@1061
PS8, Line 1061: // Check whether the number of metadata files is equal to the 
data files's.
              :     // If the two is not the same, tablet server can not be 
started. It is no need
              :     // to do compacting operations.
While the tserver is running, the container_names get in previous step may be 
outdate, the data file and metadata file may be deleted already here. How about 
just skip the container if finding this kind of "inconsistent"?


http://gerrit.cloudera.org:8080/#/c/19327/8/src/kudu/tools/tool_action_fs.cc@1073
PS8, Line 1073: container_name, &total_blocks, &total_live_blocks,
              :                                   &total_directories, 
finished_directories_count,
              :                                   finished_containers_count, 
container_names,
              :                                   &seen_fatal_error, 
cur_finished_containers_count,
              :                                   total_containers, &dd, &env
The parameter list is too long, how about encapsulating some counters into a 
structure?


http://gerrit.cloudera.org:8080/#/c/19327/8/src/kudu/tools/tool_action_fs.cc@1091
PS8, Line 1091: return
Why skip only when execute "merge" operation?


http://gerrit.cloudera.org:8080/#/c/19327/8/src/kudu/tools/tool_action_fs.cc@1102
PS8, Line 1102: Finish to clear
There is just an INFO log even if there is an error occurred ?

And the log seems too verbos if there are thousands of containers.


http://gerrit.cloudera.org:8080/#/c/19327/8/src/kudu/tools/tool_action_fs.cc@1113
PS8, Line 1113: result
For the container failed to process, is the result correct and meaningful to be 
print? How about just skip them?


http://gerrit.cloudera.org:8080/#/c/19327/8/src/kudu/tools/tool_action_fs.cc@1132
PS8, Line 1132: finished_directories_count++;
It is in expect to consider the directory "finished" if skipped from line 1069?


http://gerrit.cloudera.org:8080/#/c/19327/8/src/kudu/tools/tool_action_fs.cc@1297
PS8, Line 1297: It supports 4 operations:\n"
              :                         "--print: Statistic the total blocks 
and alive blocks "
              :                         "for every container.\n"
              :                         "--compact: Compact metadata files when 
Kudu is still running.\n"
              :                         "--merge: Stop Kudu and write 
incremental records into compacted "
              :                         "metadata files. Then backup old 
metadata files.\n"
              :                         "--clear: Delete metadata backup 
files.\n
It would be more helpful if moving these descriptions to metadata_op_type 
gflag, it is shown when execute "kudu fs compact_metadata --help".

The help info is also needed to simplify describe how to use it, in which 
order, what's the impact of each step. And a detail document is needed to be 
added to website if this patch merged.



--
To view, visit http://gerrit.cloudera.org:8080/19327
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id61cb6c88d7a093a8448e2497639f5c5c094efa4
Gerrit-Change-Number: 19327
Gerrit-PatchSet: 8
Gerrit-Owner: Wang Xixu <[email protected]>
Gerrit-Reviewer: Attila Bukor <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Wang Xixu <[email protected]>
Gerrit-Reviewer: Yifan Zhang <[email protected]>
Gerrit-Reviewer: Yingchun Lai <[email protected]>
Gerrit-Comment-Date: Wed, 31 Jan 2024 08:13:52 +0000
Gerrit-HasComments: Yes

Reply via email to