Yingchun Lai has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19327 )

Change subject: [Tool] Compact metadata file for old versions of Kudu
......................................................................


Patch Set 5:

(26 comments)

http://gerrit.cloudera.org:8080/#/c/19327/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19327/5//COMMIT_MSG@7
PS5, Line 7: old versions
Why mention ”old version“, it can be used for any version of Kudu, right?


http://gerrit.cloudera.org:8080/#/c/19327/5//COMMIT_MSG@7
PS5, Line 7: metadata
nit: container metadata


http://gerrit.cloudera.org:8080/#/c/19327/5//COMMIT_MSG@20
PS5, Line 20: #
nit: remove


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/fs/log_block_manager.h
File src/kudu/fs/log_block_manager.h:

http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/fs/log_block_manager.h@207
PS5, Line 207:
Add some comments for these public functions.


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/fs/log_block_manager.cc@563
PS5, Line 563:
Add some comments.


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/fs/log_block_manager.cc@571
PS5, Line 571:   static Status ProcessRecord(
Add some comments.


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/fs/log_block_manager.cc@860
PS5, Line 860: Status LogBlockContainer::ProcessRecordsAndGetOffset(
It is similar to ProcessRecords(), is it possible to refactor and reuse some 
code?


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/fs/log_block_manager.cc@2626
PS5, Line 2626: BlockRecordMap live_block_records;
              :   uint64_t offset = 0;
              :   uint64_t total_blocks = 0;
              :   string metadata_path = StrCat(JoinPathSegments(dir->dir(), 
container_name),
              :                                 kContainerMetadataFileSuffix);
              :   RETURN_NOT_OK(LogBlockContainer::ProcessRecordsAndGetOffset(
              :       env, metadata_path, &live_block_records, &total_blocks, 
&offset));
              :   *live_block_num = live_block_records.size();
Is it the same to PrintLowLiveRatioBlockContainer()?


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/fs/log_block_manager.cc@2634
PS5, Line 2634:   total_blocks = total_blocks == 0 ? 1 : total_blocks;
If total_blocks is 0, return directly?


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/fs/log_block_manager.cc@2637
PS5, Line 2637: low
nit: low live


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/fs/log_block_manager.cc@2651
PS5, Line 2651: Status LogBlockManager::PrintLowLiveRatioBlockContainer(Env* 
env, Dir* dir,
There is not any "print" operation, "Print" can be removed.


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/fs/log_block_manager.cc@2761
PS5, Line 2761: /
nit: use JoinPathSegments.


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/fs/log_block_manager.cc@2763
PS5, Line 2763: kMetadataBackupFileSuffix
If user don't want to do merge operation, make it possible to clear ".offset" 
and ".compact" files.


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/tools/tool_action_fs.cc
File src/kudu/tools/tool_action_fs.cc:

http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/tools/tool_action_fs.cc@123
PS5, Line 123: ,
nit: and


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/tools/tool_action_fs.cc@125
PS5, Line 125:   return (value == "print" || value == "compact" || value == 
"merge" || value == "clear");
How about add some error logs when validate failed?


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/tools/tool_action_fs.cc@125
PS5, Line 125: "print"
It would be greate if use the declared variables below (i.e. kMetadata*Op) and 
add the valid values to a set to validate flags.


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/tools/tool_action_fs.cc@867
PS5, Line 867: Use file lock
It would be better to describe why to add the file lock.


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/tools/tool_action_fs.cc@869
PS5, Line 869: filename
nit: lock_filename?


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/tools/tool_action_fs.cc@945
PS5, Line 945: LOG(INFO) << "Finish to print container: " << container_name
             :                     << ", live block count: " << live_block_count
             :                     << ", total block count: " << 
total_block_count
             :                     << ", live block ratio: " << 
(live_block_count / total_block_count)
             :                     << ", result: " << s.ToString();
             :           total_blocks.fetch_add(total_block_count);
             :           total_live_blocks.fetch_add(live_block_count);
There maybe hundreds of contianers, the output is not convenient to read. How 
about use HdrHistogram to statistics the distributon of all containers' size, 
live block ratio, live block count and etc, out of the loop?


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/tools/tool_action_fs.cc@954
PS5, Line 954:           WARN_NOT_OK(s, "Container handle failed");
Since it's a "fatal" error, the process will exit, so at least ERROR level log 
would be better.


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/tools/tool_action_fs.cc@970
PS5, Line 970:     return Status::RuntimeError("Some tasks failed.");
It would be helpful if pass the error detail (Status) out and log it here.


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/tools/tool_action_fs.cc@972
PS5, Line 972:   LOG(INFO) << "Operation finished, all tasks succeed.";
Is it necesary? the error will be logged above, we can check whether it's 
success or not.


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/tools/tool_action_fs.cc@1094
PS5, Line 1094:       .Description("Compact the metadata file of all log block 
containers")
Add some description of how to use the tool. For example, 'compact' should be 
run before 'merge'.


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/util/pb_util.h
File src/kudu/util/pb_util.h:

http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/util/pb_util.h@325
PS5, Line 325:   void SetOffset(uint64_t offset);
Add some necessary descriptions.


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/util/pb_util.h@448
PS5, Line 448:   void SetOffset(uint64_t offset);
Add some necessary descriptions.


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/util/pb_util.cc
File src/kudu/util/pb_util.cc:

http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/util/pb_util.cc@786
PS5, Line 786:   DCHECK_LE(offset_, offset);
Should be protected by offset_lock_?



--
To view, visit http://gerrit.cloudera.org:8080/19327
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id61cb6c88d7a093a8448e2497639f5c5c094efa4
Gerrit-Change-Number: 19327
Gerrit-PatchSet: 5
Gerrit-Owner: Wang Xixu <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Wang Xixu <[email protected]>
Gerrit-Reviewer: Yingchun Lai <[email protected]>
Gerrit-Comment-Date: Wed, 15 Mar 2023 12:41:36 +0000
Gerrit-HasComments: Yes

Reply via email to