[kudu-CR] threadpool: token-based task sequencing

2017-06-27 Thread Todd Lipcon (Code Review)
Todd Lipcon has submitted this change and it was merged.

Change subject: threadpool: token-based task sequencing
..


threadpool: token-based task sequencing

This patch adds task sequencing to util/threadpool. Task sequencing allows
M contexts to share a single pool with N threads while also ensuring that
the pool executes tasks belonging to each context in order. Previously this
was only achievable via "one singleton pool per context", which grossly
inflated the total number of threads and overall state.

The new logic is implemented in ThreadPoolToken. Tasks submitted via tokens
are logically grouped together. They may also be sequenced, depending on the
token's execution mode. Tokens expose Wait() variants which will block on
the task group. Tokens may also be shut down before being destroyed, which
has semantics equivalent to shutting down the pool itself (i.e. tasks may no
longer be submitted with these tokens).

Some notes:
- I evaluated two other implementations. In one, tokens had an implicit
  lifecycle that was automatically managed by the threadpool. While simpler
  for clients, the threadpool was more inefficient with more allocations and
  deallocations in each submission. In the other, token submission was built
  using regular task submission. This afforded a certain separation between
  the "core" of the threadpool and the token logic, but it complicated
  locking and tracking of queued tasks.
- I tried to keep submission (whether token-based or tokenless) fast. Just
  the change from std::list to std::deque for the main queue ought to
  improve performance of tokenless submissions. The next bottleneck is
  likely to be lock contention on the global threadpool lock.

Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Reviewed-on: http://gerrit.cloudera.org:8080/6874
Tested-by: Kudu Jenkins
Reviewed-by: Todd Lipcon 
---
M src/kudu/util/debug-util.h
M src/kudu/util/threadpool-test.cc
M src/kudu/util/threadpool.cc
M src/kudu/util/threadpool.h
4 files changed, 991 insertions(+), 91 deletions(-)

Approvals:
  Todd Lipcon: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 15
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] threadpool: token-based task sequencing

2017-06-27 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: threadpool: token-based task sequencing
..


Patch Set 14: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 14
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] threadpool: token-based task sequencing

2017-06-26 Thread Adar Dembo (Code Review)
Hello David Ribeiro Alves, Mike Percy, Alexey Serbin, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6874

to look at the new patch set (#13).

Change subject: threadpool: token-based task sequencing
..

threadpool: token-based task sequencing

This patch adds task sequencing to util/threadpool. Task sequencing allows
M contexts to share a single pool with N threads while also ensuring that
the pool executes tasks belonging to each context in order. Previously this
was only achievable via "one singleton pool per context", which grossly
inflated the total number of threads and overall state.

The new logic is implemented in ThreadPoolToken. Tasks submitted via tokens
are logically grouped together. They may also be sequenced, depending on the
token's execution mode. Tokens expose Wait() variants which will block on
the task group. Tokens may also be shut down before being destroyed, which
has semantics equivalent to shutting down the pool itself (i.e. tasks may no
longer be submitted with these tokens).

Some notes:
- I evaluated two other implementations. In one, tokens had an implicit
  lifecycle that was automatically managed by the threadpool. While simpler
  for clients, the threadpool was more inefficient with more allocations and
  deallocations in each submission. In the other, token submission was built
  using regular task submission. This afforded a certain separation between
  the "core" of the threadpool and the token logic, but it complicated
  locking and tracking of queued tasks.
- I tried to keep submission (whether token-based or tokenless) fast. Just
  the change from std::list to std::deque for the main queue ought to
  improve performance of tokenless submissions. The next bottleneck is
  likely to be lock contention on the global threadpool lock.

Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
---
M src/kudu/util/debug-util.h
M src/kudu/util/threadpool-test.cc
M src/kudu/util/threadpool.cc
M src/kudu/util/threadpool.h
4 files changed, 991 insertions(+), 91 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/74/6874/13
-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 13
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] threadpool: token-based task sequencing

2017-06-23 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: threadpool: token-based task sequencing
..


Patch Set 12:

got it. I'm working on figuring out a better way to manage my gerrit queue so I 
don't get confused by stuff like this

-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 12
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] threadpool: token-based task sequencing

2017-06-23 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: threadpool: token-based task sequencing
..


Patch Set 12:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6874/12/src/kudu/util/threadpool.h
File src/kudu/util/threadpool.h:

PS12, Line 410: m
> hm, did you forget to post the newest rev with this change?
I'm still working on feedback in the other patches in this series. My MO has 
been to respond to comments as I make the changes locally to the affected 
patch, then repush the whole thing when it's ready.


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 12
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] threadpool: token-based task sequencing

2017-06-23 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: threadpool: token-based task sequencing
..


Patch Set 12:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/6874/11/src/kudu/util/threadpool.h
File src/kudu/util/threadpool.h:

Line 138: // 2. CONCURRENT: submitted tasks may be run in parallel. This isn't 
unlike
> Suppose you have n contexts sharing a single threadpool. If tokens aren't u
got it.


PS11, Line 190: 
  :   // Waits for the pool to reach the idle state, or until 
'delta' time elapses.
  :   // Returns true if the pool reached the idle state, false 
otherwise
> Er, why? It is a "new paragraph" within the NewToken() documentation, so I 
I meant that the whole comment block belongs with the function rather than the 
definition, but it's ok


http://gerrit.cloudera.org:8080/#/c/6874/12/src/kudu/util/threadpool.h
File src/kudu/util/threadpool.h:

PS12, Line 410: m
> Moot now that pool->lock is referenced directly.
hm, did you forget to post the newest rev with this change?


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 12
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] threadpool: token-based task sequencing

2017-06-22 Thread David Ribeiro Alves (Code Review)
David Ribeiro Alves has posted comments on this change.

Change subject: threadpool: token-based task sequencing
..


Patch Set 12:

leaving it open in case todd wants to take another look

-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 12
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] threadpool: token-based task sequencing

2017-06-22 Thread David Ribeiro Alves (Code Review)
David Ribeiro Alves has posted comments on this change.

Change subject: threadpool: token-based task sequencing
..


Patch Set 12: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 12
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] threadpool: token-based task sequencing

2017-06-21 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: threadpool: token-based task sequencing
..


Patch Set 12:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/6874/12/src/kudu/util/threadpool.cc
File src/kudu/util/threadpool.cc:

PS12, Line 134: m
> Is this different than pool_->lock_ ?
It's not. I should just reference that directly to avoid confusion.


Line 196:   while (state() != State::QUIESCED) {
> Could use a comment indicating who is responsible for transitioning the tok
Done


http://gerrit.cloudera.org:8080/#/c/6874/12/src/kudu/util/threadpool.h
File src/kudu/util/threadpool.h:

PS12, Line 410: m
> Would be nice to mention what 'm' is for.
Moot now that pool->lock is referenced directly.


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 12
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] threadpool: token-based task sequencing

2017-06-20 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change.

Change subject: threadpool: token-based task sequencing
..


Patch Set 12: Code-Review+1

(3 comments)

Looks pretty good to me

http://gerrit.cloudera.org:8080/#/c/6874/12/src/kudu/util/threadpool.cc
File src/kudu/util/threadpool.cc:

PS12, Line 134: m
Is this different than pool_->lock_ ?


Line 196:   while (state() != State::QUIESCED) {
Could use a comment indicating who is responsible for transitioning the token 
to QUIESCED.


http://gerrit.cloudera.org:8080/#/c/6874/12/src/kudu/util/threadpool.h
File src/kudu/util/threadpool.h:

PS12, Line 410: m
Would be nice to mention what 'm' is for.


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 12
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] threadpool: token-based task sequencing

2017-06-20 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: threadpool: token-based task sequencing
..


Patch Set 12:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6874/12/src/kudu/util/threadpool.h
File src/kudu/util/threadpool.h:

Line 441:   std::deque entries;
> warning: invalid case style for private member 'entries' [readability-ident
Done


Line 445:   ConditionVariable not_running_cond;
> warning: invalid case style for private member 'not_running_cond' [readabil
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 12
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] threadpool: token-based task sequencing

2017-06-20 Thread Adar Dembo (Code Review)
Hello David Ribeiro Alves, Todd Lipcon, Alexey Serbin, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6874

to look at the new patch set (#12).

Change subject: threadpool: token-based task sequencing
..

threadpool: token-based task sequencing

This patch adds task sequencing to util/threadpool. Task sequencing allows
M contexts to share a single pool with N threads while also ensuring that
the pool executes tasks belonging to each context in order. Previously this
was only achievable via "one singleton pool per context", which grossly
inflated the total number of threads and overall state.

The new logic is implemented in ThreadPoolToken. Tasks submitted via tokens
are logically grouped together. They may also be sequenced, depending on the
token's execution mode. Tokens expose Wait() variants which will block on
the task group. Tokens may also be shut down before being destroyed, which
has semantics equivalent to shutting down the pool itself (i.e. tasks may no
longer be submitted with these tokens).

Some notes:
- I evaluated two other implementations. In one, tokens had an implicit
  lifecycle that was automatically managed by the threadpool. While simpler
  for clients, the threadpool was more inefficient with more allocations and
  deallocations in each submission. In the other, token submission was built
  using regular task submission. This afforded a certain separation between
  the "core" of the threadpool and the token logic, but it complicated
  locking and tracking of queued tasks.
- I tried to keep submission (whether token-based or tokenless) fast. Just
  the change from std::list to std::deque for the main queue ought to
  improve performance of tokenless submissions. The next bottleneck is
  likely to be lock contention on the global threadpool lock.

Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
---
M src/kudu/util/debug-util.h
M src/kudu/util/threadpool-test.cc
M src/kudu/util/threadpool.cc
M src/kudu/util/threadpool.h
4 files changed, 990 insertions(+), 91 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/74/6874/12
-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 12
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] threadpool: token-based task sequencing

2017-06-19 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: threadpool: token-based task sequencing
..


Patch Set 11:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/6874/11/src/kudu/util/threadpool.h
File src/kudu/util/threadpool.h:

PS11, Line 135: // Furthermore, tasks submitted to tokens constructed via 
ExecutionMode::SERIAL
  : // are guaranteed not to run in parallel.
> the first time I read through this (before reading the code) I thought the 
Will update.


Line 138: // Tasks submitted without a token or via ExecutionMode::CONCURRENT 
tokens are
> can you remind me what the advantage of CONCURRENT tokens is, vs just shari
Suppose you have n contexts sharing a single threadpool. If tokens aren't used 
for submission, there's no way for one of the contexts to atomically:
1. Abort any of its queued tasks.
2. If any are running, wait for them to finish.
3. Prohibit future submission.

ThreadPool::Shutdown() is inappropriate because it applies to the entire pool, 
not just to this one context. I don't know exactly what to call this property, 
so the description is a little vague ("...can be used to block on logical 
groups of tasks").

Per-context metrics is useful, but it's not necessary for correctness the way 
this is.


Line 190:   // Allocates a new token for use in token-based task submission.
> can you document the requirements around token lifetime relative to pool li
Done


PS11, Line 190: // Allocates a new token for use in token-based task submission.
  :   //
  :   // There is no limit on the number of tokens that may be 
allocated.
> nit: I think this comment belongs below the enum rather than above it
Er, why? It is a "new paragraph" within the NewToken() documentation, so I 
think it should be part of the "Allocates a new token..." block of text.


PS11, Line 298: whom
> nit: which
Done


Line 330:   ~ThreadPoolToken();
> I think worth noting whether the token-owner is responsible for doing a Shu
OK. The destructor will call Shutdown(), so I'll note that doing so explicitly 
is not a requirement.


PS11, Line 332:   // Submits a function using the kudu Closure system.
  :   Status SubmitClosure(Closure c) WARN_UNUSED_RESULT;
  : 
  :   // Submits a function bound using boost::bind(, 
args...).
  :   Status SubmitFunc(boost::function f) 
WARN_UNUSED_RESULT;
  : 
  :   // Submits a Runnable class.
  :   Status Submit(std::shared_ptr r) WARN_UNUSED_RESULT;
> do we still need all of these variants? I seem to recall that we have fewer
All three variants are in use in unit tests. The functor variant is also used 
by peers, and the closure variant is also used by the message queue and raft 
consensus.


PS11, Line 342: If a task is in flight
> nit: should be plural, since you can make a CONCURRENT token, right?
Done


Line 361:  private:
> What's the thread safety on the members here? Are they protected by the thr
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 11
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] threadpool: token-based task sequencing

2017-06-19 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: threadpool: token-based task sequencing
..


Patch Set 11:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/6874/11/src/kudu/util/threadpool.h
File src/kudu/util/threadpool.h:

PS11, Line 135: // Furthermore, tasks submitted to tokens constructed via 
ExecutionMode::SERIAL
  : // are guaranteed not to run in parallel.
the first time I read through this (before reading the code) I thought the 
SERIAL vs CONCURRENT was a property of a task rather than a token, and it 
confused me a bit. Maybe rephrase a little so it says something like:

A ThreadPoolToken operates in one of two ExecutionModes:
- SERIAL: tasks submitted via this token run one at a time
- CONCURRENT: tasks may be run concurrently blah blah

... so it's just a little more obvious that the modes are set per token and not 
per task


Line 138: // Tasks submitted without a token or via ExecutionMode::CONCURRENT 
tokens are
can you remind me what the advantage of CONCURRENT tokens is, vs just sharing 
one big threadpool across multiple usage contexts? Is it to take advantage of 
per-token metrics? Might be worth noting something about this in the doc.


Line 190:   // Allocates a new token for use in token-based task submission.
can you document the requirements around token lifetime relative to pool 
lifetime? namely I think you have to destroy all tokens before destroying the 
pool?


PS11, Line 190: // Allocates a new token for use in token-based task submission.
  :   //
  :   // There is no limit on the number of tokens that may be 
allocated.
nit: I think this comment belongs below the enum rather than above it


PS11, Line 298: whom
nit: which


Line 330:   ~ThreadPoolToken();
I think worth noting whether the token-owner is responsible for doing a 
Shutdown() and/or Wait() before allowing it to be destructed.


PS11, Line 332:   // Submits a function using the kudu Closure system.
  :   Status SubmitClosure(Closure c) WARN_UNUSED_RESULT;
  : 
  :   // Submits a function bound using boost::bind(, 
args...).
  :   Status SubmitFunc(boost::function f) 
WARN_UNUSED_RESULT;
  : 
  :   // Submits a Runnable class.
  :   Status Submit(std::shared_ptr r) WARN_UNUSED_RESULT;
do we still need all of these variants? I seem to recall that we have fewer in 
use than actually defined, so if we aren't using all three, maybe we can use 
this new API as an opportunity to consolidate?


PS11, Line 342: If a task is in flight
nit: should be plural, since you can make a CONCURRENT token, right?


Line 361:  private:
What's the thread safety on the members here? Are they protected by the 
threadpool's lock? think it's worth a block comment explaining.


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 11
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] threadpool: token-based task sequencing

2017-06-19 Thread David Ribeiro Alves (Code Review)
David Ribeiro Alves has posted comments on this change.

Change subject: threadpool: token-based task sequencing
..


Patch Set 11:

I took a look at the other patch. Todd is a better arbitrer as to correctness 
but in general the solution seems reasonable.
I honestly would still prefer not to have to mandate that all tokens are 
returned to the pool (i.e. enforce that all tasks are finished and no future 
tasks can run, but don't force all threads hanging on to tokens to return 
them), from a library design perspective. This is specially problematic if we 
ever have more than one thread pool in which the running tasks have cyclic 
dependencies (i.e. tasks from a tp A submit to a tp B and vice versa). But I 
guess that the end goal is to have a single tp, making the point less relevant 
in this case.

-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 11
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] threadpool: token-based task sequencing

2017-06-19 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: threadpool: token-based task sequencing
..


Patch Set 11:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6874/11/src/kudu/util/threadpool.h
File src/kudu/util/threadpool.h:

PS11, Line 194: processed
> nit: maybe "executed" is a better term?
Done


PS11, Line 197: will
> s/will/may
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 11
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] threadpool: token-based task sequencing

2017-06-19 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: threadpool: token-based task sequencing
..


Patch Set 11:

> @adar: did you end up with another solution to the problem you were
 > seeing with tokens still being alive when the tp was destroyed?

Yes, I posted a WIP here: https://gerrit.cloudera.org/7183. Still waiting for 
Todd or someone else to take a look.

-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 11
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] threadpool: token-based task sequencing

2017-06-19 Thread David Ribeiro Alves (Code Review)
David Ribeiro Alves has posted comments on this change.

Change subject: threadpool: token-based task sequencing
..


Patch Set 11:

(2 comments)

@adar: did you end up with another solution to the problem you were seeing with 
tokens still being alive when the tp was destroyed?

http://gerrit.cloudera.org:8080/#/c/6874/11/src/kudu/util/threadpool.h
File src/kudu/util/threadpool.h:

PS11, Line 194: processed
nit: maybe "executed" is a better term?


PS11, Line 197: will
s/will/may


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 11
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] threadpool: token-based task sequencing

2017-06-12 Thread David Ribeiro Alves (Code Review)
David Ribeiro Alves has posted comments on this change.

Change subject: threadpool: token-based task sequencing
..


Patch Set 10:

@adar i guess that the first explicit call to Shutdown() is racing with the the 
rest of the stuff. i.e. the token’s shutdown completes and the shutdown thread 
thinks it’s released but there are actually other tasks hanging on to tokens 
even though they haven’t yet submitted. Of course if you did let them submit 
they might cause SIGSEGV’s cause the pool was destroyed or somesuch.

My suggestion would be:
- Make ThreadPool ref counted, make tokens hold a ref. (to make sure the tp 
survives and to avoid adding locks inside of ThreadPoolToken).
- On Token shutdown actually release the token from the pool after you’ve 
quiesced all the tasks(but the tokens ref for the pool will only be reduced on 
the dctor).
- Make the pool accept but return a bad status for unknown/quiesced tokens 
(which I already think it does, just spelling it out).

This way the shutdown thread will clear all pending tasks and make sure that 
the running tasks complete but if there are places still hanging on to tokens 
that should still be ok.

-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 10
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] threadpool: token-based task sequencing

2017-06-11 Thread Adar Dembo (Code Review)
Hello Alexey Serbin, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6874

to look at the new patch set (#10).

Change subject: threadpool: token-based task sequencing
..

threadpool: token-based task sequencing

This patch adds task sequencing to util/threadpool. Task sequencing allows
M contexts to share a single pool with N threads while also ensuring that
the pool executes tasks belonging to each context in order. Previously this
was only achievable via "one singleton pool per context", which grossly
inflated the total number of threads and overall state.

The new logic is implemented in ThreadPoolToken. Tasks submitted via tokens
are logically grouped together. They may also be sequenced, depending on the
token's execution mode. Tokens expose Wait() variants which will block on
the task group. Tokens may also be shut down before being destroyed, which
has semantics equivalent to shutting down the pool itself (i.e. tasks may no
longer be submitted with these tokens).

Some notes:
- I evaluated two other implementations. In one, tokens had an implicit
  lifecycle that was automatically managed by the threadpool. While simpler
  for clients, the threadpool was more inefficient with more allocations and
  deallocations in each submission. In the other, token submission was built
  using regular task submission. This afforded a certain separation between
  the "core" of the threadpool and the token logic, but it complicated
  locking and tracking of queued tasks.
- I tried to keep submission (whether token-based or tokenless) fast. Just
  the change from std::list to std::deque for the main queue ought to
  improve performance of tokenless submissions. The next bottleneck is
  likely to be lock contention on the global threadpool lock.

Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
---
M src/kudu/util/debug-util.h
M src/kudu/util/threadpool-test.cc
M src/kudu/util/threadpool.cc
M src/kudu/util/threadpool.h
4 files changed, 979 insertions(+), 91 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/74/6874/10
-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 10
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] threadpool: token-based task sequencing

2017-06-11 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: threadpool: token-based task sequencing
..


Patch Set 9:

The implementation has changed significantly. Here's a summary:
- Tokens can now run tasks either serially (the old SequenceToken 
implementation) or concurrently. Execution mode notwithstanding, tokens serve 
as a way to logically group tasks, so that Wait()/Shutdown() block on them and 
not on all tasks in the pool.
- Tokenless submission now goes through a single concurrent token.
- Slots and tokens have been unified. This has reduced the amount of code as 
well as the number of states in the (now token) state machine.

-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 9
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] threadpool: token-based task sequencing

2017-06-11 Thread Adar Dembo (Code Review)
Hello Alexey Serbin, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6874

to look at the new patch set (#9).

Change subject: threadpool: token-based task sequencing
..

threadpool: token-based task sequencing

This patch adds task sequencing to util/threadpool. Task sequencing allows
M contexts to share a single pool with N threads while also ensuring that
the pool executes tasks belonging to each context in order. Previously this
was only achievable via "one singleton pool per context", which grossly
inflated the total number of threads and overall state.

The new logic is implemented in ThreadPoolToken. Tasks submitted via tokens
are logically grouped together. They may also be sequenced, depending on the
token's execution mode. Tokens expose Wait() variants which will block on
the task group. Tokens may also be shut down before being destroyed, which
has semantics equivalent to shutting down the pool itself (i.e. tasks may no
longer be submitted with these tokens).

Some notes:
- I evaluated two other implementations. In one, tokens had an implicit
  lifecycle that was automatically managed by the threadpool. While simpler
  for clients, the threadpool was more inefficient with more allocations and
  deallocations in each submission. In the other, token submission was built
  using regular task submission. This afforded a certain separation between
  the "core" of the threadpool and the token logic, but it complicated
  locking and tracking of queued tasks.
- I tried to keep submission (whether token-based or tokenless) fast. Just
  the change from std::list to std::deque for the main queue ought to
  improve performance of tokenless submissions. The next bottleneck is
  likely to be lock contention on the global threadpool lock.

Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
---
M src/kudu/util/debug-util.h
M src/kudu/util/threadpool-test.cc
M src/kudu/util/threadpool.cc
M src/kudu/util/threadpool.h
4 files changed, 981 insertions(+), 91 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/74/6874/9
-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 9
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] threadpool: token-based task sequencing

2017-06-06 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: threadpool: token-based task sequencing
..


Patch Set 8:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6874/8/src/kudu/util/threadpool.cc
File src/kudu/util/threadpool.cc:

Line 278: if (!s->not_running_cond.TimedWait(delta)) {
> on spurious wakeup, won't this wait longer than expected? I think implement
Good point, but this is how ThreadPool::WaitFor() works today and I don't think 
SequenceToken::WaitFor() is any worse. Can I make the change in a subsequent 
patch? We don't yet have a deadline-based cv wait either.


http://gerrit.cloudera.org:8080/#/c/6874/8/src/kudu/util/threadpool.h
File src/kudu/util/threadpool.h:

PS8, Line 188: Upon completion, 't' has been released
> not following this bit. I'm reading this to mean "Release()" below, but tha
Yeah, this is confusing, sorry. When I wrote "'t' has been released" what I 
meant is that 't' is no longer usable as a token, not that its slot had been 
released. I'll reword.


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 8
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] threadpool: token-based task sequencing

2017-06-06 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: threadpool: token-based task sequencing
..


Patch Set 8:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6874/8/src/kudu/util/threadpool.cc
File src/kudu/util/threadpool.cc:

Line 278: if (!s->not_running_cond.TimedWait(delta)) {
on spurious wakeup, won't this wait longer than expected? I think implementing 
WaitUntil in terms of a deadline cond var wait would be better, and convert 
this one to use the deadline?


http://gerrit.cloudera.org:8080/#/c/6874/8/src/kudu/util/threadpool.h
File src/kudu/util/threadpool.h:

PS8, Line 188: Upon completion, 't' has been released
not following this bit. I'm reading this to mean "Release()" below, but that 
doesn't make sense. I'd expect 'this' to be released prior to the move.


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 8
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] threadpool: token-based task sequencing

2017-06-05 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: threadpool: token-based task sequencing
..


Patch Set 8: -Code-Review

(14 comments)

http://gerrit.cloudera.org:8080/#/c/6874/8/src/kudu/util/debug-util.h
File src/kudu/util/debug-util.h:

PS8, Line 105: bool HasCollected() const {
> docs
Done


http://gerrit.cloudera.org:8080/#/c/6874/8/src/kudu/util/threadpool-test.cc
File src/kudu/util/threadpool-test.cc:

Line 40: #include "kudu/util/threadpool.h"
> warning: #includes are not sorted properly [llvm-include-order]
Done


PS8, Line 418: t3 = std::move(t1);
> agree with tidy bot that these uses after move are weird
But how else am I to test that the move constructor and move assignment 
operator properly "zero out" the moved instance?

To squelch the clang-tidy warnings I added NOLINT to these lines.


http://gerrit.cloudera.org:8080/#/c/6874/8/src/kudu/util/threadpool.cc
File src/kudu/util/threadpool.cc:

Line 130: const SequenceToken::SlotIndex SequenceToken::kInvalidSlotIndex;
> warning: redundant 'kInvalidSlotIndex' declaration [readability-redundant-d
Not redundant; if I remove this, threadpool-test fails to link.


PS8, Line 173: (std::shared_ptr
> nit: make_shared?
Done


PS8, Line 188: pool_->CheckNotPoolThreadUnlocked();
> what is the point of this check? can you leave a comment?
It's called in every user-facing function that may block, to ensure that the 
caller isn't a ThreadPool worker thread.

I don't think it makes sense for there to be a comment explaining that in every 
call-site; best to look at the (short) implementation to learn what's being 
checked.


PS8, Line 258: s->state() == ThreadPool::SequenceSlot::State::RUNNABLE ||
 :  s->state() == ThreadPool::SequenceSlot::State::RUNNING 
||
 :  s->state() == ThreadPool::SequenceSlot::State::QUIESCING
> isn't this equivalent to IsActive()
Yeah, will replace.


PS8, Line 275: s->state() == ThreadPool::SequenceSlot::State::RUNNABLE ||
 :  s->state() == ThreadPool::SequenceSlot::State::RUNNING 
||
 :  s->state() == ThreadPool::SequenceSlot::State::QUIESCING
> same.
Done


http://gerrit.cloudera.org:8080/#/c/6874/8/src/kudu/util/threadpool.h
File src/kudu/util/threadpool.h:

PS8, Line 140: Wait variants
> shutdown is also thread safe, right?
Yes. I went back and forth on whether to call it out; I reasoned folks would 
interpret "lifecycle functions" as "ctor/dtor and assignment operators".

But apparently not, so I'll make it more explicit.


PS8, Line 156: Waits until all the tasks submitted via this token are completed 
and also
 :   // marks the token as unusable for future submissions.
> mind reverting this sentence? I reads like we wait first and then close the
You mean reversing? Done.


PS8, Line 212: Tasks submitted via one token are fairly prioritized with 
respect to tasks
 : // submitted via another token.
> not sure I understand this... how are tasks on a token prioritized over ano
Tokens are processed in a round-robin style. When a worker thread processes a 
token, it runs just one of its queued tasks. In this way, we can ensure that 
one token cannot starve other tokens.

I'll add more detail.


PS8, Line 360: QUIESCED  -> AVAILABLE:
> why do we need to transition the state back to available?
Only AVAILABLE tokens may be returned to clients via AllocateTokenSlot(). If a 
token remained in QUIESCED forever, it couldn't be reused by another client.


PS8, Line 424: StackTrace running_submit_stack;
> what's the perf impact of this? I realize this is a debug only thing but we
I ran "perf report" on the new fuzz unit test I added and Collect() consumed 
1.42% CPU, which I think is reasonable.


Line 516: std::ostream& operator<<(std::ostream& o, 
ThreadPool::SequenceSlot::State s);
> warning: redundant 'operator<<' declaration [readability-redundant-declarat
This isn't redundant; if I remove it I get compilation errors in threadpool.cc.


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 8
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] threadpool: token-based task sequencing

2017-06-05 Thread David Ribeiro Alves (Code Review)
David Ribeiro Alves has posted comments on this change.

Change subject: threadpool: token-based task sequencing
..


Patch Set 8:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/6874/8/src/kudu/util/debug-util.h
File src/kudu/util/debug-util.h:

PS8, Line 105: bool HasCollected() const {
docs


http://gerrit.cloudera.org:8080/#/c/6874/8/src/kudu/util/threadpool-test.cc
File src/kudu/util/threadpool-test.cc:

PS8, Line 418: t3 = std::move(t1);
agree with tidy bot that these uses after move are weird


http://gerrit.cloudera.org:8080/#/c/6874/8/src/kudu/util/threadpool.cc
File src/kudu/util/threadpool.cc:

PS8, Line 173: (std::shared_ptr
nit: make_shared?


PS8, Line 188: pool_->CheckNotPoolThreadUnlocked();
what is the point of this check? can you leave a comment?


PS8, Line 258: s->state() == ThreadPool::SequenceSlot::State::RUNNABLE ||
 :  s->state() == ThreadPool::SequenceSlot::State::RUNNING 
||
 :  s->state() == ThreadPool::SequenceSlot::State::QUIESCING
isn't this equivalent to IsActive()


PS8, Line 275: s->state() == ThreadPool::SequenceSlot::State::RUNNABLE ||
 :  s->state() == ThreadPool::SequenceSlot::State::RUNNING 
||
 :  s->state() == ThreadPool::SequenceSlot::State::QUIESCING
same.


http://gerrit.cloudera.org:8080/#/c/6874/8/src/kudu/util/threadpool.h
File src/kudu/util/threadpool.h:

PS8, Line 140: Wait variants
shutdown is also thread safe, right?


PS8, Line 156: Waits until all the tasks submitted via this token are completed 
and also
 :   // marks the token as unusable for future submissions.
mind reverting this sentence? I reads like we wait first and then close the 
close the token


PS8, Line 212: Tasks submitted via one token are fairly prioritized with 
respect to tasks
 : // submitted via another token.
not sure I understand this... how are tasks on a token prioritized over another 
token? how do you chose which token get priority?


PS8, Line 360: QUIESCED  -> AVAILABLE:
why do we need to transition the state back to available?


PS8, Line 424: StackTrace running_submit_stack;
what's the perf impact of this? I realize this is a debug only thing but we 
don't want debug modes to be abnormally slow. Consider gating this behind a 
flag if the perf impact is considerable.


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 8
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] threadpool: token-based task sequencing

2017-06-03 Thread Adar Dembo (Code Review)
Hello Alexey Serbin, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6874

to look at the new patch set (#8).

Change subject: threadpool: token-based task sequencing
..

threadpool: token-based task sequencing

This patch adds task sequencing to util/threadpool. Task sequencing allows
M contexts to share a single pool with N threads while also ensuring that
the pool executes tasks belonging to each context in order. Previously this
was only achievable via "one singleton pool per context", which grossly
inflated the total number of threads and overall state.

A group of tasks are sequenced by submission to a dedicated "slot". A client
obtains exclusive access to a slot via AllocateTokenSlot(), which returns a
"token" that can be used for task submission/waiting. When the token goes
out of scope, the slot is returned to the pool. The token may also be shut
down before being destroyed, which has semantics equivalent to shutting down
the pool itself (i.e. tasks may no longer be submitted with the token).

Some notes:
- Slots and tokens are mapped 1-1 so they could theoretically be combined,
  but I prefer this separation of concerns.
- I evaluated two other implementations. In one, tokens had an implicit
  lifecycle that was automatically managed by the threadpool. While simpler
  for clients, the threadpool was more inefficient with more allocations and
  deallocations in each submission. In the other, token submission was built
  using regular task submission. This afforded a certain separation between
  the "core" of the threadpool and the token logic, but it complicated
  locking and tracking of queued tasks.
- I tried to keep submission (whether token-based or tokenless) fast. Just
  the change from std::list to std::deque for the main queue ought to
  improve performance of tokenless submissions. The next bottleneck is
  likely to be lock contention on the global threadpool lock.

Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
---
M src/kudu/util/debug-util.h
M src/kudu/util/threadpool-test.cc
M src/kudu/util/threadpool.cc
M src/kudu/util/threadpool.h
4 files changed, 1,168 insertions(+), 75 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/74/6874/8
-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 8
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] threadpool: token-based task sequencing

2017-05-26 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: threadpool: token-based task sequencing
..


Patch Set 7: Code-Review-2

See the note I left in https://gerrit.cloudera.org/#/c/6946/6 about 
implementing SequenceToken::Shutdown().

Until I do that, please don't merge this.

-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 7
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] threadpool: token-based task sequencing

2017-05-26 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: threadpool: token-based task sequencing
..


Patch Set 7: Code-Review+1

LGTM.  Maybe other reviewers have some other things to point at.

-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 7
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] threadpool: token-based task sequencing

2017-05-26 Thread Adar Dembo (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6874

to look at the new patch set (#7).

Change subject: threadpool: token-based task sequencing
..

threadpool: token-based task sequencing

This patch adds task sequencing to util/threadpool. Task sequencing allows
M contexts to share a single pool with N threads while also ensuring that
the pool executes tasks belonging to each context in order. Previously this
was only achievable via "one singleton pool per context", which grossly
inflated the total number of threads and overall state.

A group of tasks are sequenced by submission to a dedicated "slot". A client
obtains exclusive access to a slot via AllocateTokenSlot(), which returns a
"token" that can be used for task submission/waiting. When the token is
destroyed (or Release() is called), the slot is returned to the pool. For
implementation simplicity, clients must wait for a token's outstanding tasks
to complete before destroying their token.

Some notes:
- Slots and tokens are mapped 1-1 so they could theoretically be combined,
  but I prefer this separation of concerns.
- The current implementation requires tokens to have no outstanding tasks
  when they are released. This was a conscious choice made so that token
  destruction (especially when done as part of a larger object graph
  destruct) won't take an unusually long amount of time.
- I evaluated two other implementations. In one, tokens had an implicit
  lifecycle that was automatically managed by the threadpool. While simpler
  for clients, the threadpool was more inefficient with more allocations and
  deallocations in each submission. In the other, token submission was built
  using regular task submission. This afforded a certain separation between
  the "core" of the threadpool and the token logic, but it complicated
  locking and tracking of queued tasks.
- I tried to keep submission (whether token-based or tokenless) fast. Just
  the change from std::list to std::deque for the main queue ought to
  improve performance of tokenless submissions. The next bottleneck is
  likely to be lock contention on the global threadpool lock.

Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
---
M src/kudu/util/debug-util.h
M src/kudu/util/threadpool-test.cc
M src/kudu/util/threadpool.cc
M src/kudu/util/threadpool.h
4 files changed, 955 insertions(+), 75 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/74/6874/7
-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 7
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] threadpool: token-based task sequencing

2017-05-26 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: threadpool: token-based task sequencing
..


Patch Set 6:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/6874/6/src/kudu/util/debug-util.h
File src/kudu/util/debug-util.h:

PS6, Line 105: {
> add 'const' specifier?
Done


http://gerrit.cloudera.org:8080/#/c/6874/6/src/kudu/util/threadpool.cc
File src/kudu/util/threadpool.cc:

PS6, Line 333: DEALLOCATED
> AVAILABLE
Done


PS6, Line 344: DEALLOCATED
> AVAILABLE
Done


Line 435:   SequenceSlot* slot = slot_index ? slots_[*slot_index].get() : 
nullptr;
> paranoid nit: does it make sense to add something like
Done


Line 473: DCHECK(state == SequenceSlot::ASSIGNED ||
> Why not (state != SequenceSlot::AVAILABLE)?
I find explicitly enumerating the allowed states to be more clear. It's also 
easier to notice mistakes if/when a new state is introduced.

Yes, the guarantee is that if DoSubmit() was called with a slot index, it must 
have originated with a token, which means the slot cannot be AVAILABLE.

I'm not a fan of EMPTY because I can see myself confusing it with "the slot's 
queue is empty of tasks" (i.e. ASSIGNED).


PS6, Line 708: }
> Does it make sense to add CHECK(!task) for the case new_state == ASSIGNED?
Yes, I'll add that.


http://gerrit.cloudera.org:8080/#/c/6874/6/src/kudu/util/threadpool.h
File src/kudu/util/threadpool.h:

PS6, Line 177: ;
> add 'const' specifier?
Done


Line 304:   boost::optional 
slot_index);
> Could you add a comment explaining the expected usage with slot_index (slot
Sure.


Line 398: void CheckStateTransition(State new_state, const 
boost::optional& task);
> add 'const' specifier?
Done


PS6, Line 453: Queues
> nit: Queue
No, this is a container of slots, and each slot is logically a queue.


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] threadpool: token-based task sequencing

2017-05-26 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change.

Change subject: threadpool: token-based task sequencing
..


Patch Set 6:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/6874/6/src/kudu/util/threadpool.cc
File src/kudu/util/threadpool.cc:

Line 473: DCHECK(state == SequenceSlot::ASSIGNED ||
Why not (state != SequenceSlot::AVAILABLE)?

Just checking that I understand this correctly, this implies that the caller 
knows the SequenceSlot at slot_index is not in the AVAILABLE state, right? And 
this is ensured by the fact that only a SequenceToken will DoSubmit with a 
slot_index.

I know it's gone through name changes already, but how do you feel about EMPTY 
instead of AVAILABLE? My thinking, at least for this bit of logic, is that it's 
clearer that a task shouldn't be assigned to an empty slot


http://gerrit.cloudera.org:8080/#/c/6874/6/src/kudu/util/threadpool.h
File src/kudu/util/threadpool.h:

Line 304:   boost::optional 
slot_index);
Could you add a comment explaining the expected usage with slot_index 
(slot_index must refer to a SequenceSlot that is occupied by a token)?


PS6, Line 453: Queues
nit: Queue


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] threadpool: token-based task sequencing

2017-05-25 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: threadpool: token-based task sequencing
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6874/4/src/kudu/util/threadpool.cc
File src/kudu/util/threadpool.cc:

PS4, Line 345: unique_ptr new_slot(new SequenceSlot(_));
 : DCHECK_EQ(SequenceSlot::DEALLOCATED, new_slot->state());
 : slots_.emplace_back(std::move(new_slot));
 : idx = slots_.size() - 1;
 :   }
 : 
 :   slots_[idx]->Transition(SequenceSlot::INACTIVE);
> So why do we actually need both DEALLOCATED and INACTIVE? Because we need t
ok, I see: the broadcasting on conditional variable is no-op in that case.  
Thank you for the clarification.  Yes, the new names for the states in PS6 look 
better than the former ones.


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] threadpool: token-based task sequencing

2017-05-25 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: threadpool: token-based task sequencing
..


Patch Set 6:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/6874/6/src/kudu/util/debug-util.h
File src/kudu/util/debug-util.h:

PS6, Line 105: {
add 'const' specifier?


http://gerrit.cloudera.org:8080/#/c/6874/6/src/kudu/util/threadpool.cc
File src/kudu/util/threadpool.cc:

PS6, Line 333: DEALLOCATED
AVAILABLE


PS6, Line 344: DEALLOCATED
AVAILABLE


Line 435:   SequenceSlot* slot = slot_index ? slots_[*slot_index].get() : 
nullptr;
paranoid nit: does it make sense to add something like

DASSERT(!slot_index || *slot_index < slots_.size());

?


PS6, Line 708: }
Does it make sense to add CHECK(!task) for the case new_state == ASSIGNED?


http://gerrit.cloudera.org:8080/#/c/6874/6/src/kudu/util/threadpool.h
File src/kudu/util/threadpool.h:

PS6, Line 177: ;
add 'const' specifier?


Line 398: void CheckStateTransition(State new_state, const 
boost::optional& task);
add 'const' specifier?


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] threadpool: token-based task sequencing

2017-05-25 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: threadpool: token-based task sequencing
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6874/3/src/kudu/util/threadpool-test.cc
File src/kudu/util/threadpool-test.cc:

Line 419:   ASSERT_EQ(SequenceToken::kInvalidSlotIndex, t2.slot_index_);
> I'm not seeing why; can you elaborate?
woops, I misunderstood the test


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] threadpool: token-based task sequencing

2017-05-25 Thread Adar Dembo (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6874

to look at the new patch set (#6).

Change subject: threadpool: token-based task sequencing
..

threadpool: token-based task sequencing

This patch adds task sequencing to util/threadpool. Task sequencing allows
M contexts to share a single pool with N threads while also ensuring that
the pool executes tasks belonging to each context in order. Previously this
was only achievable via "one singleton pool per context", which grossly
inflated the total number of threads and overall state.

A group of tasks are sequenced by submission to a dedicated "slot". A client
obtains exclusive access to a slot via AllocateTokenSlot(), which returns a
"token" that can be used for task submission/waiting. When the token is
destroyed (or Release() is called), the slot is returned to the pool. For
implementation simplicity, clients must wait for a token's outstanding tasks
to complete before destroying their token.

Some notes:
- Slots and tokens are mapped 1-1 so they could theoretically be combined,
  but I prefer this separation of concerns.
- The current implementation requires tokens to have no outstanding tasks
  when they are released. This was a conscious choice made so that token
  destruction (especially when done as part of a larger object graph
  destruct) won't take an unusually long amount of time.
- I evaluated two other implementations. In one, tokens had an implicit
  lifecycle that was automatically managed by the threadpool. While simpler
  for clients, the threadpool was more inefficient with more allocations and
  deallocations in each submission. In the other, token submission was built
  using regular task submission. This afforded a certain separation between
  the "core" of the threadpool and the token logic, but it complicated
  locking and tracking of queued tasks.
- I tried to keep submission (whether token-based or tokenless) fast. Just
  the change from std::list to std::deque for the main queue ought to
  improve performance of tokenless submissions. The next bottleneck is
  likely to be lock contention on the global threadpool lock.

Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
---
M src/kudu/util/debug-util.h
M src/kudu/util/threadpool-test.cc
M src/kudu/util/threadpool.cc
M src/kudu/util/threadpool.h
4 files changed, 947 insertions(+), 74 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/74/6874/6
-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] threadpool: token-based task sequencing

2017-05-25 Thread Adar Dembo (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6874

to look at the new patch set (#5).

Change subject: threadpool: token-based task sequencing
..

threadpool: token-based task sequencing

This patch adds task sequencing to util/threadpool. Task sequencing allows
M contexts to share a single pool with N threads while also ensuring that
the pool executes tasks belonging to each context in order. Previously this
was only achievable via "one singleton pool per context", which grossly
inflated the total number of threads and overall state.

A group of tasks are sequenced by submission to a dedicated "slot". A client
obtains exclusive access to a slot via AllocateTokenSlot(), which returns a
"token" that can be used for task submission/waiting. When the token is
destroyed (or Release() is called), the slot is returned to the pool. For
implementation simplicity, clients must wait for a token's outstanding tasks
to complete before destroying their token.

Some notes:
- Slots and tokens are mapped 1-1 so they could theoretically be combined,
  but I prefer this separation of concerns.
- The current implementation requires tokens to have no outstanding tasks
  when they are released. This was a conscious choice made so that token
  destruction (especially when done as part of a larger object graph
  destruct) won't take an unusually long amount of time.
- I evaluated two other implementations. In one, tokens had an implicit
  lifecycle that was automatically managed by the threadpool. While simpler
  for clients, the threadpool was more inefficient with more allocations and
  deallocations in each submission. In the other, token submission was built
  using regular task submission. This afforded a certain separation between
  the "core" of the threadpool and the token logic, but it complicated
  locking and tracking of queued tasks.
- I tried to keep submission (whether token-based or tokenless) fast. Just
  the change from std::list to std::deque for the main queue ought to
  improve performance of tokenless submissions. The next bottleneck is
  likely to be lock contention on the global threadpool lock.

Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
---
M src/kudu/util/debug-util.h
M src/kudu/util/threadpool-test.cc
M src/kudu/util/threadpool.cc
M src/kudu/util/threadpool.h
4 files changed, 947 insertions(+), 74 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/74/6874/5
-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] threadpool: token-based task sequencing

2017-05-24 Thread Adar Dembo (Code Review)
Hello Dan Burkert, David Ribeiro Alves, Todd Lipcon,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6874

to look at the new patch set (#4).

Change subject: threadpool: token-based task sequencing
..

threadpool: token-based task sequencing

This patch adds task sequencing to util/threadpool. Task sequencing allows
m contexts to share a single pool with n threads while also ensuring that
the pool executes tasks belonging to each context in order. Previously this
was only achievable via "one singleton pool per context", which grossly
inflated the total number of threads and overall state.

A group of tasks are sequenced by submission to a dedicated "slot". A client
obtains exclusive access to a slot via AllocateTokenSlot(), which returns a
"token" that can be used for task submission/waiting. When the token is
destroyed (or Release() is called), the slot is returned to the pool. For
implementation simplicity, clients must wait for a token's outstanding tasks
to complete before destroying their token.

Some notes:
- Slots and tokens are mapped 1-1 so they could theoretically be combined,
  but I prefer this separation of concerns.
- The current implementation requires tokens to have no outstanding tasks
  when they are released. This was a conscious choice made so that token
  destruction (especially when done as part of a larger object graph
  destruct) won't take an unusually long amount of time.
- I evaluated two other implementations. In one, tokens had an implicit
  lifecycle that was automatically managed by the threadpool. While simpler
  for clients, the threadpool was more inefficient with more allocations and
  deallocations in each submission. In the other, token submission was built
  using regular task submission. This afforded a certain separation between
  the "core" of the threadpool and the token logic, but it complicated
  locking and tracking of queued tasks.
- I tried to keep submission (whether token-based or tokenless) fast. Just
  the change from std::list to std::deque for the main queue ought to
  improve performance of tokenless submissions. The next bottleneck is
  likely to be lock contention on the global threadpool lock.

Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
---
M src/kudu/util/debug-util.h
M src/kudu/util/threadpool-test.cc
M src/kudu/util/threadpool.cc
M src/kudu/util/threadpool.h
4 files changed, 943 insertions(+), 73 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/74/6874/4
-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] threadpool: token-based task sequencing

2017-05-24 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: threadpool: token-based task sequencing
..


Patch Set 3:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/6874/3/src/kudu/util/threadpool-test.cc
File src/kudu/util/threadpool-test.cc:

Line 419: TEST_F(ThreadPoolTest, TestTokenSubmitsProcessedSerially) {
> I think this may fail on a single CPU system.
I'm not seeing why; can you elaborate?


http://gerrit.cloudera.org:8080/#/c/6874/3/src/kudu/util/threadpool.cc
File src/kudu/util/threadpool.cc:

PS3, Line 146: std::move
> I think you want std::forward here.
I can't say I understand the difference between move() and forward() (even 
after reading an article or two about it), but I've changed both of these moves 
to be forwards.


http://gerrit.cloudera.org:8080/#/c/6874/3/src/kudu/util/threadpool.h
File src/kudu/util/threadpool.h:

Line 198:   DISALLOW_COPY_AND_ASSIGN(SequenceToken);
> I was curious about this, so I looked it up in Effective Modern C++ (page 1
I think I will since it adds information for people who aren't familiar with 
that aspect of move operations.


PS3, Line 203: FIFO
> nit: missing "queue"
Done


PS3, Line 220: The execution order will be A1, T1, T2, A2, A3.
> this is a _possible_ execution order, right?
Dan: A2 isn't allowed to run until A1 has finished. You're right though that if 
A1 finishes before A2 is submitted, A2 will run second. I'll clarify that.

I should have clarified that this execution order presupposes that the tasks 
are long-running (i.e. that A3 will be submitted before any task finishes).


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] threadpool: token-based task sequencing

2017-05-23 Thread David Ribeiro Alves (Code Review)
David Ribeiro Alves has posted comments on this change.

Change subject: threadpool: token-based task sequencing
..


Patch Set 3:

(2 comments)

oops had two lingering comments, not sure if relevant, but might as well push

http://gerrit.cloudera.org:8080/#/c/6874/3/src/kudu/util/threadpool.h
File src/kudu/util/threadpool.h:

PS3, Line 203: FIFO
nit: missing "queue"


PS3, Line 220: The execution order will be A1, T1, T2, A2, A3.
this is a _possible_ execution order, right?


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] threadpool: token-based task sequencing

2017-05-23 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: threadpool: token-based task sequencing
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6874/3/src/kudu/util/threadpool-test.cc
File src/kudu/util/threadpool-test.cc:

Line 419: TEST_F(ThreadPoolTest, TestTokenSubmitsProcessedSerially) {
I think this may fail on a single CPU system.


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] threadpool: token-based task sequencing

2017-05-23 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: threadpool: token-based task sequencing
..


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/6874/3/src/kudu/util/threadpool.cc
File src/kudu/util/threadpool.cc:

PS3, Line 146: std::move
I think you want std::forward here.


http://gerrit.cloudera.org:8080/#/c/6874/3/src/kudu/util/threadpool.h
File src/kudu/util/threadpool.h:

Line 198:   DISALLOW_COPY_AND_ASSIGN(SequenceToken);
I was curious about this, so I looked it up in Effective Modern C++ (page 111):

> Declaring a move operation (construction or assignment) in a class causes 
> compilers to disable the copy operations. (The copy operations are disabled 
> by deleting them—see Item 11).

So this should be safe to leave off.  That being said, It's perfectly fine with 
me if you leave it.


Line 220: // The execution order will be A1, T1, T2, A2, A3.
What happens if A1 begins running before A2 is submitted?  Won't A2 run second 
in that case?


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] threadpool: token-based task sequencing

2017-05-22 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: threadpool: token-based task sequencing
..


Patch Set 3: Verified+1

Failure was in kinit in a Java test.

-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] threadpool: token-based task sequencing

2017-05-22 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: threadpool: token-based task sequencing
..


Patch Set 2:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/6874/2/src/kudu/util/threadpool-test.cc
File src/kudu/util/threadpool-test.cc:

Line 45: using strings::Substitute;
> warning: using decl 'Substitute' is unused [misc-unused-using-decls]
Done


http://gerrit.cloudera.org:8080/#/c/6874/2/src/kudu/util/threadpool.cc
File src/kudu/util/threadpool.cc:

Line 168:   return Submit(std::shared_ptr(new 
ClosureRunnable(std::move(c;
> warning: std::move of the const variable has no effect; remove std::move() 
Done


Line 380:   return Submit(std::shared_ptr(new 
ClosureRunnable(std::move(c;
> warning: std::move of the const variable has no effect; remove std::move() 
Done


http://gerrit.cloudera.org:8080/#/c/6874/2/src/kudu/util/threadpool.h
File src/kudu/util/threadpool.h:

Line 285:   Status DoSubmit(std::shared_ptr task,
> warning: function 'kudu::ThreadPool::DoSubmit' has a definition with differ
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] threadpool: token-based task sequencing

2017-05-22 Thread Adar Dembo (Code Review)
Hello Dan Burkert, David Ribeiro Alves, Todd Lipcon, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6874

to look at the new patch set (#2).

Change subject: threadpool: token-based task sequencing
..

threadpool: token-based task sequencing

This patch adds task sequencing to util/threadpool. Task sequencing allows
m contexts to share a single pool with n threads while also ensuring that
the pool executes tasks belonging to each context in order. Previously this
was only achievable via "one singleton pool per context", which grossly
inflated the total number of threads and overall state.

A group of tasks are sequenced by submission to a dedicated "slot". A client
obtains exclusive access to a slot via AllocateTokenSlot(), which returns a
"token" that can be used for task submission/waiting. When the token is
destroyed (or Release() is called), the slot is returned to the pool. For
implementation simplicity, clients must wait for a token's outstanding tasks
to complete before destroying their token.

Some notes:
- Slots and tokens are mapped 1-1 so they could theoretically be combined,
  but I prefer this separation of concerns.
- The current implementation requires tokens to have no outstanding tasks
  when they are released. This was a conscious choice made so that token
  destruction (especially when done as part of a larger object graph
  destruct) won't take an unusually long amount of time.
- I evaluated two other implementations. In one, tokens had an implicit
  lifecycle that was automatically managed by the threadpool. While simpler
  for clients, the threadpool was more inefficient with more allocations and
  deallocations in each submission. In the other, token submission was built
  using regular task submission. This afforded a certain separation between
  the "core" of the threadpool and the token logic, but it complicated
  locking and tracking of queued tasks.
- I tried to keep submission (whether token-based or tokenless) fast. Just
  the change from std::list to std::deque for the main queue ought to
  improve performance of tokenless submissions. The next bottleneck is
  likely to be lock contention on the global threadpool lock.

Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
---
M src/kudu/util/debug-util.h
M src/kudu/util/threadpool-test.cc
M src/kudu/util/threadpool.cc
M src/kudu/util/threadpool.h
4 files changed, 708 insertions(+), 70 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/74/6874/2
-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon