Yingchun Lai has posted comments on this change. ( http://gerrit.cloudera.org:8080/19327 )
Change subject: [Tool] KUDU-3318 Compact container metadata file manually ...................................................................... Patch Set 14: (7 comments) http://gerrit.cloudera.org:8080/#/c/19327/8/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: http://gerrit.cloudera.org:8080/#/c/19327/8/src/kudu/fs/log_block_manager.cc@981 PS8, Line 981: if (!read_status.ok()) { > Yes, offset is needs to be update at any time. 'offset' is used in the next I meant is it correct to update it when read_status is non-OK such as Status::Corruption? http://gerrit.cloudera.org:8080/#/c/19327/8/src/kudu/fs/log_block_manager.cc@2987 PS8, Line 2987: const std::vector<BlockRecordPB>& records) { > These functions are in log block manager. If using 'skip_block_manager = tr Oh, yes, I missed it. Anyway, most code of GenerateCompactedFile() and RewriteMetadataFile() are silimar, I don't think it's a good idea to keep so many duplicate code. RewriteMetadataFile() only access a few members of LogBlockManager, it's possible to pass them as parameters, then it can be static. http://gerrit.cloudera.org:8080/#/c/19327/14/src/kudu/fs/log_block_manager.cc File src/kudu/fs/log_block_manager.cc: http://gerrit.cloudera.org:8080/#/c/19327/14/src/kudu/fs/log_block_manager.cc@2816 PS14, Line 2816: Status LogBlockManager::PrintLowLiveRatioBlockContainer(Env* env, Dir* dir, This function is similar to the upper part of function CompactLowLiveRatioBlockContainer(), how about refactoring the code and reuse it? http://gerrit.cloudera.org:8080/#/c/19327/14/src/kudu/fs/log_block_manager.cc@2938 PS14, Line 2938: kMetadataOffsetFileSuffix, : kMetadataCompactedFileSuffix, : StrCat(kContainerMetadataFileSuffix, : kMetadataBackupFileSuffix) : nit: format the code. http://gerrit.cloudera.org:8080/#/c/19327/14/src/kudu/fs/log_block_manager.cc@2943 PS14, Line 2943: for (const string& suffix : suffixes) { Now you are going to delete the files according to the container names, the container names are collected if the container is valid (a file with .data or .metadata postfix is found, by GetContainerNames()). But if the container is deleted (when exceed size limit and no live block left), the .offset, .backup files are left and have no chance to be deleted by this tool, right? http://gerrit.cloudera.org:8080/#/c/19327/14/src/kudu/fs/log_block_manager.cc@3352 PS14, Line 3352: s = GetContainerNames(env_, dir->dir(), &containers_seen); In GetContainerNames implementation, will call GetChildren again which has called in line 3340, it's a bit of waste. http://gerrit.cloudera.org:8080/#/c/19327/14/src/kudu/fs/log_block_manager.cc@3353 PS14, Line 3353: if (!s.ok()) { nit: Simplify it by using RETURN_NOT_OK_PREPEND -- To view, visit http://gerrit.cloudera.org:8080/19327 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id61cb6c88d7a093a8448e2497639f5c5c094efa4 Gerrit-Change-Number: 19327 Gerrit-PatchSet: 14 Gerrit-Owner: Wang Xixu <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Attila Bukor <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Wang Xixu <[email protected]> Gerrit-Reviewer: Yifan Zhang <[email protected]> Gerrit-Reviewer: Yingchun Lai <[email protected]> Gerrit-Comment-Date: Fri, 08 Mar 2024 09:43:00 +0000 Gerrit-HasComments: Yes
