Adar Dembo has submitted this change and it was merged. (
Change subject: KUDU-1913: cap number of threads on server-wide pools
KUDU-1913: cap number of threads on server-wide pools
The last piece of work is to establish an upper bound on the number of
threads that may be started in the Raft and Prepare server-wide threadpools.
Such caps will make it easier for admins to reason about appropriate values
for the configuration of the Kudu processes' RLIMIT_NPROC resource.
KUDU-1913 proposed a cap of "number of cores + number of disks", but a
lively Slack discussion yielded a better solution: set the cap at some
percentage of the process' RLIMIT_NPROC value. Given that the rest of Kudu
generally uses a constant number of threads, this should prevent spikes from
ever exceeding the RLIMIT_NPROC and crashing the server due to an election
storm. This patch implements a cap of 10% per pool and also provides a new
gflag as an "escape hatch" (in case we were horribly wrong).
Note: it's still possible for a massive number of "hot" replicas to exceed
RLIMIT_NPROC by virtue of each replica's log append thread, but the server
is more likely to run out of memory for MemRowSets before that happens.
Tested-by: Kudu Jenkins
Reviewed-by: David Ribeiro Alves <davidral...@gmail.com>
Reviewed-by: Todd Lipcon <t...@apache.org>
1 file changed, 57 insertions(+), 11 deletions(-)
Kudu Jenkins: Verified
David Ribeiro Alves: Looks good to me, approved
Todd Lipcon: Looks good to me, but someone else must approve
To view, visit http://gerrit.cloudera.org:8080/9522
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Owner: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <davidral...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>