Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14743 )

Change subject: KUDU-3001 Multi-thread to load containers in a data directory
......................................................................


Patch Set 5:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/14743/5/src/kudu/fs/log_block_manager-test.cc
File src/kudu/fs/log_block_manager-test.cc:

http://gerrit.cloudera.org:8080/#/c/14743/5/src/kudu/fs/log_block_manager-test.cc@1028
PS5, Line 1028:   const int kNumBlocks = AllowSlowTests() ? 
FLAGS_startup_benchmark_block_count_for_testing : 1000;
Probably cleaner to use OverrideFlagForSlowTests(). Then FLAGS_... is still the 
right place to go for the value; no need to think about an intermediary 
kNumBlocks variable.


http://gerrit.cloudera.org:8080/#/c/14743/5/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

http://gerrit.cloudera.org:8080/#/c/14743/5/src/kudu/fs/log_block_manager.cc@20
PS5, Line 20: #include <errno.h>
Why the reordering? In any case, since you're changing this, could you use 
<cerrno> instead?


http://gerrit.cloudera.org:8080/#/c/14743/5/src/kudu/fs/log_block_manager.cc@37
PS5, Line 37: #include <boost/bind.hpp>
I think you can use std::bind instead (from <functional>).


http://gerrit.cloudera.org:8080/#/c/14743/5/src/kudu/fs/log_block_manager.cc@2411
PS5, Line 2411:   // Create a per-dir thread pool.
Each DataDir has a singleton thread pool, currently used to run OpenDataDir. I 
think your goal here is to queue more than one operation per disk, since disks 
(especially SSDs and NVMEs) can run concurrent I/Os, right? If so, why not 
reuse that thread pool and make it a non-singleton? There could be positive 
knock-on effects to doing that, like better concurrent hole punching.

You'd still need to do the refactoring you did here, but you should be able to 
reuse those thread pools.


http://gerrit.cloudera.org:8080/#/c/14743/5/src/kudu/fs/log_block_manager.cc@2414
PS5, Line 2414:       .set_max_threads(5)
You should also experiment with different max thread values. Do you expect "one 
data dir + many max threads" per disk to be equivalent to "many data dirs + 
fewer max threads" per disk?


http://gerrit.cloudera.org:8080/#/c/14743/5/src/kudu/fs/log_block_manager.cc@2417
PS5, Line 2417:   vector<shared_ptr<FsReport>> local_report_vec;
These are all sized to "number of containers", right? Could you define a struct 
to include all of this, and one vector for that struct?



--
To view, visit http://gerrit.cloudera.org:8080/14743
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0721ee4a5a6824db146ba0658e60eec25dd0c65c
Gerrit-Change-Number: 14743
Gerrit-PatchSet: 5
Gerrit-Owner: Yingchun Lai <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 20 Nov 2019 20:17:11 +0000
Gerrit-HasComments: Yes

Reply via email to