David Ribeiro Alves has posted comments on this change.

Change subject: consensus: consolidate Raft thread pools
......................................................................


Patch Set 6:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/6946/6//COMMIT_MSG
Commit Message:

PS6, Line 8: 
           : This patch consolidates the two thread pools used in Raft 
consensus: the
           : observer and "Raft" (for lack of a better name) pools. The former 
runs tasks
           : for infrequent events such as term and config changes while the 
latter is
           : used for periodic events such as processing heartbeat responses.
           : 
           : In this patch, each per-replica pool is consolidated into a single
           : per-server pool. Sequence tokens are used to ensure that tasks 
belonging to
           : a single replica are executed serially and in order. For the 
"Raft" pool, a
           : single sequence token is shared by the PeerManager and 
RaftConsensus; this
           : mirrors the existing sharing behavior and is safe because token 
operations
           : are thread-safe.
           : 
           : The non-default max_threads may come as a surprise, but David 
showed me how
           : tasks submitted to either pool may sometimes lead to an fsync 
(mostly via
           : cmeta flushes). As such, it isn't safe to use the default num_cpus 
upper
           : bound, as that may cause such IO-based tasks to block other 
CPU-only tasks
           : (or worse, periodic tasks like heartbeat response processing).
           : 
           : What per-replica threads are not addressed by this patch?
           : - Failure detection thread: a threadpool isn't the right model
           :   for these. Something like a hash wheel timer would be ideal.
           : - Prepare thread pool (max_threads=1): these could be consolidated 
too, but
           :   their metrics are per-replica, and sequence tokens don't (yet) 
support
           :   that. Meaning, if they were consolidated now, the metrics would 
also
           :   consolidate and would be less useful as a result.
would be good to have a few runs of raft_consensus-itest on dist-test with this 
change. iirc you did that at some point, mind re-running those and posting the 
results here? particularly the *crashynodes* and *churnyelections* tests


http://gerrit.cloudera.org:8080/#/c/6946/6/src/kudu/consensus/consensus_queue.cc
File src/kudu/consensus/consensus_queue.cc:

PS6, Line 114: ThreadPool* raft_observers_pool
noticed that, for consensus peers, you're passing the token whereas here you're 
passing the threadpool? any particular reason for that? if not ming making that 
consistent?


PS6, Line 917: raft_observers_pool_token_.Wait()
shutdown on a tp prevents more ops from being submitted while this doesn't, is 
that a problem?


http://gerrit.cloudera.org:8080/#/c/6946/2/src/kudu/consensus/consensus_queue.h
File src/kudu/consensus/consensus_queue.h:

PS2, Line 434:   // The pool token which executes observe
remove? also maybe add some info about the experiments you made where you found 
out it wasn't safe to run these in parallel (you did those iirc, right?)


http://gerrit.cloudera.org:8080/#/c/6946/6/src/kudu/consensus/raft_consensus.cc
File src/kudu/consensus/raft_consensus.cc:

PS6, Line 192: raft_pool
on the peer manager side you call this request_* something. mind using the same 
name as the pool (like raft_pool_token) so that it's easier to track where it 
runs?


PS6, Line 1743: raft_pool_token_->Wait();
again, theres a change in semantics from "close the tp and wait for the ongoing 
stuff to happen" to "wait for the ongoing stuff to happen even if more is 
added" seems like we should have the same semantics in the token slots?


http://gerrit.cloudera.org:8080/#/c/6946/6/src/kudu/server/server_base.cc
File src/kudu/server/server_base.cc:

PS6, Line 255: queue-observers-pool
use "raft_observers_pool" or something


-- 
To view, visit http://gerrit.cloudera.org:8080/6946
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8958c607d11ac2e94908670c985059bef0ff5f54
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Dan Burkert <[email protected]>
Gerrit-Reviewer: David Ribeiro Alves <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-HasComments: Yes

Reply via email to