Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14743 )

Change subject: KUDU-3001 Multi-thread to load containers in a data directory
......................................................................


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14743/5/src/kudu/fs/log_block_manager.cc
File src/kudu/fs/log_block_manager.cc:

http://gerrit.cloudera.org:8080/#/c/14743/5/src/kudu/fs/log_block_manager.cc@2411
PS5, Line 2411:     record.set_op_type(DELETE);
> Thanks for  introducing ThreadPoolToken. But I found that we can't Wait a t
Ah, yeah I forgot about that restriction. And it'd be difficult to address.

We could use a "continuation" design pattern where OpenDataDir doesn't need to 
wait on the token by moving all of the post-wait work into the last LoadRecords 
call, but that'll require more state and synchronization.

So let's use your first approach instead, but let's firm up the abstraction 
somewhat. Let's define two first-class per-data-dir citizen pools:
- One for "control" operations (i.e. operations where throughput isn't a 
consideration). This is the existing data dir threadpool, with just one thread 
per pool. It'll be used for temp directory cleanup and for the OpenDataDir fan 
out.
- One for "data" operations. This will be a new threadpool with a configurable 
number of max threads (by gflag and perhaps, in the future, by hardware 
characteristics). Should it be in the DataDir abstraction or should it be in 
the LBM? I'm not sure, but either way let's use it for hole punching as well as 
for parallel container loading.

The split is arbitrary and it'll be difficult to decide when to use one pool 
vs. the other, but the abstraction will hopefully make the code more 
understandable.

Here's another idea: could we avoid the multi-level parallelism altogether? 
What if LBM::Open iterated over the data directories and fanned out calls to 
LoadRecords? Then when all records were loaded, it could fan out calls to 
Repair (one per data dir). If we could get that to work without too much 
additional state/synchronization, we could stick to the original idea of 
expanding the parallelism of the data dir's existing thread pool.



--
To view, visit http://gerrit.cloudera.org:8080/14743
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0721ee4a5a6824db146ba0658e60eec25dd0c65c
Gerrit-Change-Number: 14743
Gerrit-PatchSet: 9
Gerrit-Owner: Yingchun Lai <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <[email protected]>
Gerrit-Reviewer: wangning <[email protected]>
Gerrit-Comment-Date: Wed, 27 Nov 2019 23:06:42 +0000
Gerrit-HasComments: Yes

Reply via email to