Wang Xixu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19327 )

Change subject: [Tool] Compact container metadata file manually
......................................................................


Patch Set 6:

(44 comments)

http://gerrit.cloudera.org:8080/#/c/19327/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19327/2//COMMIT_MSG@9
PS2, Line 9:  log con
> nit: log container metadata
Done


http://gerrit.cloudera.org:8080/#/c/19327/2//COMMIT_MSG@10
PS2, Line 10: at
> nit: at
Done


http://gerrit.cloudera.org:8080/#/c/19327/2//COMMIT_MSG@12
PS2, Line 12: es to
> nit: which
Done


http://gerrit.cloudera.org:8080/#/c/19327/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19327/5//COMMIT_MSG@7
PS5, Line 7:  file manual
> Why mention ”old version“, it can be used for any version of Kudu, right?
Yes, although, the new versions of kudu maybe not need when it sets the flag 
log_container_metadata_runtime_compact true


http://gerrit.cloudera.org:8080/#/c/19327/5//COMMIT_MSG@7
PS5, Line 7: containe
> nit: container metadata
Done


http://gerrit.cloudera.org:8080/#/c/19327/5//COMMIT_MSG@20
PS5, Line 20: K
> nit: remove
Done


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/fs/log_block_manager.h
File src/kudu/fs/log_block_manager.h:

http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/fs/log_block_manager.h@207
PS5, Line 207:
> Add some comments for these public functions.
Done


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/fs/log_block_manager.cc@563
PS5, Line 563:
> Add some comments.
Done


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/fs/log_block_manager.cc@571
PS5, Line 571:
> Add some comments.
Done


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/fs/log_block_manager.cc@860
PS5, Line 860:   }
> It is similar to ProcessRecords(), is it possible to refactor and reuse som
No, the function ProcessRecords is a member of class LogBlockContainer and 
requires opens fs_manager, but this is not.


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/fs/log_block_manager.cc@1210
PS5, Line 1210:   return Status::OK();
> Is it possible to reuse exist version of LogBlockContainer::ProcessRecord?
No, the exited version of LogBlockContainer::ProcessRecord requires opening the 
fs_manager and is a member function of class LogBlockContainer.

But this function this static and no needs to open fs_mananger.


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/fs/log_block_manager.cc@2626
PS5, Line 2626: atus LogBlockManager::CompactLowLiveRatioBlockContainer(Env* 
env, Dir* dir,
              :                                                           const 
string& container_name,
              :                                                           
uint64_t* live_block_num) {
              :   BlockRecordMap live_block_records;
              :   uint64_t offset = 0;
              :   uint64_t total_blocks = 0;
              :   string metadata_path = StrCat(JoinPathSegments(dir->dir(), 
container_name),
              :                                 kContainerMeta
> Is it the same to PrintLowLiveRatioBlockContainer()?
Yes


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/fs/log_block_manager.cc@2634
PS5, Line 2634:   RETURN_NOT_OK(LogBlockContainer::ProcessRecordsAndGetOffset(
> If total_blocks is 0, return directly?
Done


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/fs/log_block_manager.cc@2637
PS5, Line 2637:
> nit: low live
Done


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/fs/log_block_manager.cc@2651
PS5, Line 2651:     RETURN_NOT_OK(GenerateOffsetFile(env, dir, container_name, 
offset));
> There is not any "print" operation, "Print" can be removed.
Sorry, I don't know what you mean? this function is used by print operation.


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/fs/log_block_manager.cc@2688
PS5, Line 2688:   unique_ptr<RandomAccessFile> offset_file_reader;
> nit: Add file path in log.
Done


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/fs/log_block_manager.cc@2695
PS5, Line 2695:   }
> nit: Maybe a verbos level is enough?
Done


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/fs/log_block_manager.cc@2697
PS5, Line 2697:   fs.resize(file_size);
> nit: Add file path in log. Same to other places.
Done


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/fs/log_block_manager.cc@2700
PS5, Line 2700:
> nit: use StrCat and JoinPathSegments
Done


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/fs/log_block_manager.cc@2706
PS5, Line 2706:
> nit: align to 'r' of "rw_opts".
Done


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/fs/log_block_manager.cc@2711
PS5, Line 2711: ::M
> Should be "new"?
Done


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/fs/log_block_manager.cc@2713
PS5, Line 2713: _pb_writer
> nit: use StrCat and JoinPathSegments
Done


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/fs/log_block_manager.cc@2723
PS5, Line 2723:
> Write?
Done


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/fs/log_block_manager.cc@2735
PS5, Line 2735: ad_status = old_metadata_pb_reader.ReadNextPB(&record);
              :     if (!read_status.ok()) {
> PREDICT_FALSE(!read_status.IsEndOfFile() && !read_status.IsIncomplete()) ?
Done


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/fs/log_block_manager.cc@2739
PS5, Line 2739:   RETURN_NOT_OK(compacted_pb_writer.Append(record));
              :   }
              :   if (PREDICT_TRUE(!read_status.IsEndOfFile())
              :       && PREDICT_TRUE(!read_status.IsIncomplete())) {
              :     return read_status;
              :   }
> nit: the width is less than 100 if in one line?
Done


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/fs/log_block_manager.cc@2747
PS5, Line 2747: e
> nit: alignment
Done


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/fs/log_block_manager.cc@2751
PS5, Line 2751: f
> nit: alignment
Done


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/fs/log_block_manager.cc@2761
PS5, Line 2761:
> nit: use JoinPathSegments.
Done


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/fs/log_block_manager.cc@2763
PS5, Line 2763: ring& container_name) {
> If user don't want to do merge operation, make it possible to clear ".offse
OK, I will remove .offset and .compact file too.


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/fs/log_block_manager.cc@2775
PS5, Line 2775: t_file)) {
> nit: use StrCat and JoinPathSegments
Done


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/tools/tool_action_fs.cc
File src/kudu/tools/tool_action_fs.cc:

http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/tools/tool_action_fs.cc@123
PS5, Line 123:  a
> nit: and
Done


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/tools/tool_action_fs.cc@125
PS5, Line 125: rint" |
> It would be greate if use the declared variables below (i.e. kMetadata*Op)
Here, it can not access the variables below in L167 ~ L170.


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/tools/tool_action_fs.cc@125
PS5, Line 125:   if (!(value == "print" || value == "compact" ||
> How about add some error logs when validate failed?
Done


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/tools/tool_action_fs.cc@867
PS5, Line 867:
> It would be better to describe why to add the file lock.
Done


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/tools/tool_action_fs.cc@869
PS5, Line 869:
> nit: lock_filename?
Done


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/tools/tool_action_fs.cc@884
PS5, Line 884:     env->UnlockFile(file_lock);
> Give a hint message to shutdown the server if lock instance file failed in
I can not predicate precisely the error from fs_manager.Open(), so it is not 
suitable to hint the the users that should shutdown the server. Maybe it 
returns other errors(i.e disk full).

But here can add some comments that should shutdown the server when executing 
merge operation.


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/tools/tool_action_fs.cc@945
PS5, Line 945: s = LogBlockManager::ClearIntermediateFile(env, dd.get(),
             :                                                        
container_name);
             :           LOG(INFO) << "Finish to clear container: " << 
container_name
             :                     << " metadata backup file, "
             :                     << " result: " << s.ToString();
             :         } else {
             :           DCHECK_EQ(kMetadataPrintOp, FLAGS_op_type);
> There maybe hundreds of contianers, the output is not convenient to read. H
I think the users don't care about the information of every container, they 
care about the total number of containers, blocks, alive blocks.

How about just printing the total information at the last stage.


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/tools/tool_action_fs.cc@954
PS5, Line 954:                                                                
container_name,
> Since it's a "fatal" error, the process will exit, so at least ERROR level
Done


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/tools/tool_action_fs.cc@970
PS5, Line 970:   if (FLAGS_op_type == kMetadataPrintOp) {
> It would be helpful if pass the error detail (Status) out and log it here.
If a container failed to be handled, L954 will print the detail, it no needs to 
print the error information again. Otherwise, if more than one container 
failed, print which one here?


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/tools/tool_action_fs.cc@972
PS5, Line 972:               << ", total blocks: " << total_blocks
> Is it necesary? the error will be logged above, we can check whether it's s
Done


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/tools/tool_action_fs.cc@1094
PS5, Line 1094:                                                   }, ", ")))
> Add some description of how to use the tool. For example, 'compact' should
Done


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/util/pb_util.h
File src/kudu/util/pb_util.h:

http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/util/pb_util.h@325
PS5, Line 325:   // Set the offset of underlying file.
> Add some necessary descriptions.
Done


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/util/pb_util.h@448
PS5, Line 448:
> Add some necessary descriptions.
Done


http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/util/pb_util.cc
File src/kudu/util/pb_util.cc:

http://gerrit.cloudera.org:8080/#/c/19327/5/src/kudu/util/pb_util.cc@786
PS5, Line 786:   DCHECK_LE(offset_, offset);
> Should be protected by offset_lock_?
Done



--
To view, visit http://gerrit.cloudera.org:8080/19327
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id61cb6c88d7a093a8448e2497639f5c5c094efa4
Gerrit-Change-Number: 19327
Gerrit-PatchSet: 6
Gerrit-Owner: Wang Xixu <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Wang Xixu <[email protected]>
Gerrit-Reviewer: Yingchun Lai <[email protected]>
Gerrit-Comment-Date: Thu, 20 Apr 2023 11:15:40 +0000
Gerrit-HasComments: Yes

Reply via email to