David Ribeiro Alves has posted comments on this change. Change subject: consensus: consolidate Raft thread pools ......................................................................
Patch Set 6: (7 comments) http://gerrit.cloudera.org:8080/#/c/6946/6//COMMIT_MSG Commit Message: PS6, Line 8: : This patch consolidates the two thread pools used in Raft consensus: the : observer and "Raft" (for lack of a better name) pools. The former runs tasks : for infrequent events such as term and config changes while the latter is : used for periodic events such as processing heartbeat responses. : : In this patch, each per-replica pool is consolidated into a single : per-server pool. Sequence tokens are used to ensure that tasks belonging to : a single replica are executed serially and in order. For the "Raft" pool, a : single sequence token is shared by the PeerManager and RaftConsensus; this : mirrors the existing sharing behavior and is safe because token operations : are thread-safe. : : The non-default max_threads may come as a surprise, but David showed me how : tasks submitted to either pool may sometimes lead to an fsync (mostly via : cmeta flushes). As such, it isn't safe to use the default num_cpus upper : bound, as that may cause such IO-based tasks to block other CPU-only tasks : (or worse, periodic tasks like heartbeat response processing). : : What per-replica threads are not addressed by this patch? : - Failure detection thread: a threadpool isn't the right model : for these. Something like a hash wheel timer would be ideal. : - Prepare thread pool (max_threads=1): these could be consolidated too, but : their metrics are per-replica, and sequence tokens don't (yet) support : that. Meaning, if they were consolidated now, the metrics would also : consolidate and would be less useful as a result. would be good to have a few runs of raft_consensus-itest on dist-test with this change. iirc you did that at some point, mind re-running those and posting the results here? particularly the *crashynodes* and *churnyelections* tests http://gerrit.cloudera.org:8080/#/c/6946/6/src/kudu/consensus/consensus_queue.cc File src/kudu/consensus/consensus_queue.cc: PS6, Line 114: ThreadPool* raft_observers_pool noticed that, for consensus peers, you're passing the token whereas here you're passing the threadpool? any particular reason for that? if not ming making that consistent? PS6, Line 917: raft_observers_pool_token_.Wait() shutdown on a tp prevents more ops from being submitted while this doesn't, is that a problem? http://gerrit.cloudera.org:8080/#/c/6946/2/src/kudu/consensus/consensus_queue.h File src/kudu/consensus/consensus_queue.h: PS2, Line 434: // The pool token which executes observe remove? also maybe add some info about the experiments you made where you found out it wasn't safe to run these in parallel (you did those iirc, right?) http://gerrit.cloudera.org:8080/#/c/6946/6/src/kudu/consensus/raft_consensus.cc File src/kudu/consensus/raft_consensus.cc: PS6, Line 192: raft_pool on the peer manager side you call this request_* something. mind using the same name as the pool (like raft_pool_token) so that it's easier to track where it runs? PS6, Line 1743: raft_pool_token_->Wait(); again, theres a change in semantics from "close the tp and wait for the ongoing stuff to happen" to "wait for the ongoing stuff to happen even if more is added" seems like we should have the same semantics in the token slots? http://gerrit.cloudera.org:8080/#/c/6946/6/src/kudu/server/server_base.cc File src/kudu/server/server_base.cc: PS6, Line 255: queue-observers-pool use "raft_observers_pool" or something -- To view, visit http://gerrit.cloudera.org:8080/6946 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8958c607d11ac2e94908670c985059bef0ff5f54 Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Adar Dembo <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Dan Burkert <[email protected]> Gerrit-Reviewer: David Ribeiro Alves <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-HasComments: Yes
