Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17942 )

Change subject: [LBM] Speed up server bootstrap by using multi-thread to 
compact containers
......................................................................


Patch Set 2: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17942/2/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

http://gerrit.cloudera.org:8080/#/c/17942/2/src/kudu/fs/log_block_manager.cc@2964
PS2, Line 2964:       Status s = RewriteMetadataFile(*(container.get()), 
e.second, &file_bytes_delta);
At first I thought the operation we're parallelizing here was just the 
appending to protobuf and writing to disk. While appending to protobuf does 
take CPU, that made me wary of whether this change actually improves things, 
since we might be IO-bound anyway. However, it seems like the Repair() call is 
already happening in each data dir's threadpool, i.e. the CPU-intensive part 
of checking each container is already well parallelized.

Have you seen this patch make a measurable performance improvement? Consider 
adding a small benchmark like LogBlockManagerTest.StartupBenchmark, with some 
percentage of live blocks to be compacted at startup, to see how much this 
patch improves startup time in the best case and in the average case.
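
As a rough illustration of the kind of comparison meant here, the following 
standalone sketch times the same per-container work with one thread and with 
several. It is not Kudu code: SimulateRewrite and TimeCompaction are 
hypothetical names, the work is CPU-only (no protobuf appends or disk writes), 
and a real benchmark would go through the LogBlockManager test fixtures.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical stand-in for the per-container rewrite; NOT Kudu's
// RewriteMetadataFile(). It burns CPU in proportion to the live block count.
uint64_t SimulateRewrite(int live_blocks) {
  uint64_t acc = 0;
  const int64_t iterations = static_cast<int64_t>(live_blocks) * 1000;
  for (int64_t i = 0; i < iterations; ++i) {
    acc = acc * 1315423911u + static_cast<uint64_t>(i);
  }
  return acc;
}

// Processes every container using 'num_threads' workers and returns the
// elapsed wall-clock time in milliseconds. Each worker handles a strided
// subset of the containers and writes only to its own result slots, so no
// locking is needed.
double TimeCompaction(const std::vector<int>& containers,
                      int num_threads,
                      std::vector<uint64_t>* results) {
  assert(num_threads > 0);
  results->assign(containers.size(), 0);
  const auto start = std::chrono::steady_clock::now();
  std::vector<std::thread> workers;
  for (int t = 0; t < num_threads; ++t) {
    workers.emplace_back([&, t] {
      for (size_t i = static_cast<size_t>(t); i < containers.size();
           i += static_cast<size_t>(num_threads)) {
        (*results)[i] = SimulateRewrite(containers[i]);
      }
    });
  }
  for (auto& w : workers) {
    w.join();
  }
  const auto end = std::chrono::steady_clock::now();
  return std::chrono::duration<double, std::milli>(end - start).count();
}
```

Calling TimeCompaction(containers, 1, &r1) and TimeCompaction(containers, 4, 
&r4) on the same input and comparing the returned times gives the serial vs. 
parallel numbers. Because this sketch does no IO, it only shows an upper bound 
on the speedup; if the real rewrite is IO-bound, the gain could be much smaller.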



--
To view, visit http://gerrit.cloudera.org:8080/17942
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie48211d9e8c1d74e520fcb04df25c1d681261bb5
Gerrit-Change-Number: 17942
Gerrit-PatchSet: 2
Gerrit-Owner: Yingchun Lai <[email protected]>
Gerrit-Reviewer: Abhishek Chennaka <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Greg Solovyev <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <[email protected]>
Gerrit-Comment-Date: Mon, 25 Oct 2021 22:01:09 +0000
Gerrit-HasComments: Yes
