[kudu-CR] WIP: threadpool: token-based sequencing

Adar Dembo (Code Review) Fri, 12 May 2017 17:51:06 -0700

Adar Dembo has posted comments on this change.

Change subject: WIP: threadpool: token-based sequencing
......................................................................



Patch Set 1:

> (1 comment)
 > - does this affect the performance of a normal ThreadPool that
 > doesn't use the token feature? From looking at the code it seems
 > like it would just be a few branches and no extra allocation, etc,
 > but would be good to be sure of that
 
Yes, that was my intent. However, we can also go in one of two other directions:
1. Recognize that there's no real use case for token and non-token submission 
in the same pool and further split them up. Some ideas here:
a. A common ThreadPool interface with two implementations, sharing code where 
possible.
b. Expose the difference in behavior via templatization (i.e. traits).
Neither of these approaches are particularly clean due to the difference in 
Submit() and Wait() (though I'll admit I didn't think about it very hard) which 
is why I haven't pursued them yet.
2. Reduce code and complexity by implementing non-token submission as 
token-based submission. Dan argued for this when he and were discussing the 
idea yesterday; it's not a bad one if we further optimize token-based 
submission and avoid allocations.

 > - not sure about using a string for a token vs having a more opaque
 > token, such as ThreadPool::ObtainSequencer() returning a
 > 'Sequencer' struct (which could well be just an int or somesuch)
 
Yes, Dan suggested something similar (though he went a step further and 
suggested that Submit()/Wait() should happen through this opaque "handle"). The 
advantage of using an int is that we could convert the map of token queues into 
a array or vector, and explicit token lifecycle means we can probably avoid 
shared ownership on token queues.

The disadvantage is that now ThreadPool users have to manage the lifecycle of 
the token, though you could argue they had to already in that they probably 
must call WaitFor(token) before destructing anyway.

 > - docs wise, would be good to document fairness and
 > starvation-freedom properties. Particularly, the fact that this
 > exhibits neither :-D Probably fine since I think we would typically
 > use this feature in threadpools with no max thread count specified.
 > (is it worth actually prohibiting use of tokens in a pool with a
 > max thread count? or is it just "YMMV" situation?)

Agreed. The current implementation prioritizes non-token over token submission. 
Submission order is preserved for all tasks belonging to a token. The use of a 
FIFO for token rotation means one busy token can't starve the rest.

Note that "no max thread count" means "max of <num_cpus> threads", so it's very 
likely that num_tokens > num_cpus.

-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-HasComments: No

[kudu-CR] WIP: threadpool: token-based sequencing

Reply via email to