Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17942 )

Change subject: [LBM] Speed up server bootstrap by using multi-thread to 
compact containers
......................................................................


Patch Set 5:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/17942/2/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

http://gerrit.cloudera.org:8080/#/c/17942/2/src/kudu/fs/log_block_manager.cc@2151
PS2, Line 2151:     container_result->need_repunching_blocks.begin(),
> 1. In LogBlockManager::Repair, we will wait all metadata compact workers co
Ah, sure.  My point is the following: dd->ExecClosure in the original code is 
already scheduling a task on a pool (i.e. running the task asynchronously) if 
the pool has enough capacity, otherwise it's executing the task in this thread 
synchronously.

Introducing a new thread pool helps to solve the problem of waiting on some 
particular task in this loop to be run synchronously while other tasks might be 
still scheduled on theirs thread pools when those pools still have enough 
capacity, right?  There is nothing else which is run in parallel since the 
actual repair tasks are run on those dir's pools anyways.  So, my point was: 
why not just to replace ExecClosure() with some other method which only 
schedules the task if the underlying dir's pool has capacity, and otherwise it 
just makes a mark that some more tasks need to be scheduled and loops over 
(i.e. continues looping through the cycle) until all the tasks are scheduled.  
If no other tasks can be scheduled for async execution, it simply sleeps a bit 
(ideally, it should await on a conditional variable on the completion of one of 
the tasks scheduled prior).

This way there is no need to create extra threads, but the blocking 
dd->ExecClosure() is removed and allows for scheduling and running many tasks 
simultaneously.


http://gerrit.cloudera.org:8080/#/c/17942/5/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

http://gerrit.cloudera.org:8080/#/c/17942/5/src/kudu/fs/log_block_manager.cc@131
PS5, Line 131: Amount of latency in ms to inject before registering "
             :              "an op with MVCC.
> Done
nit: this is marked as done, but I don't see the update.  Is it planned to 
appear in PS6?


http://gerrit.cloudera.org:8080/#/c/17942/5/src/kudu/fs/log_block_manager.cc@133
PS5, Line 133: TAG_FLAG(log_container_metadata_rewrite_inject_latency_ms, 
advanced);
nit: I guess this tag isn't necessary since the flag is already hidden



--
To view, visit http://gerrit.cloudera.org:8080/17942
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie48211d9e8c1d74e520fcb04df25c1d681261bb5
Gerrit-Change-Number: 17942
Gerrit-PatchSet: 5
Gerrit-Owner: Yingchun Lai <[email protected]>
Gerrit-Reviewer: Abhishek Chennaka <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Greg Solovyev <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <[email protected]>
Gerrit-Comment-Date: Wed, 03 Nov 2021 05:58:33 +0000
Gerrit-HasComments: Yes

Reply via email to