Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/17942 )

Change subject: [LBM] Speed up server bootstrap by using multi-thread to compact containers
......................................................................


Patch Set 2: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17942/2/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

http://gerrit.cloudera.org:8080/#/c/17942/2/src/kudu/fs/log_block_manager.cc@2964
PS2, Line 2964: Status s = RewriteMetadataFile(*(container.get()), e.second, &file_bytes_delta);
At first I thought the operation we're parallelizing here is just the appending to protobuf and the writing to disk. While appending to protobuf does take CPU, that realization made me wary about whether this change actually improves things, since we might be IO-bound anyway. However, it seems the Repair() call already happens in each data dir's thread pool, i.e. the CPU-intensive part of checking each container is already very parallelized.

Have you seen this patch make a measurable performance improvement? Consider adding a small benchmark like LogBlockManagerTest.StartupBenchmark, with some percentage of live blocks to be compacted at startup, to see how much this patch improves things in the best case and in the average case.


--
To view, visit http://gerrit.cloudera.org:8080/17942
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie48211d9e8c1d74e520fcb04df25c1d681261bb5
Gerrit-Change-Number: 17942
Gerrit-PatchSet: 2
Gerrit-Owner: Yingchun Lai <[email protected]>
Gerrit-Reviewer: Abhishek Chennaka <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Greg Solovyev <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <[email protected]>
Gerrit-Comment-Date: Mon, 25 Oct 2021 22:01:09 +0000
Gerrit-HasComments: Yes
