[ 
https://issues.apache.org/jira/browse/IGNITE-20165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17753233#comment-17753233
 ] 

Mirza Aliev edited comment on IGNITE-20165 at 8/14/23 9:04 AM:
---------------------------------------------------------------

By default, all executors are shared among the instance of Loza, meaning that 
all raft groups share executors.

Below I've represented all JRaft executors with short description and the 
number of threads  
||Pool name||Description||Number of Threads||
|JRaft-Common-Executor|A pool for processing short-lived asynchronous tasks. 
Should never be blocked.|Utils.cpus() (core == max)|
|JRaft-Node-Scheduler|A scheduled executor for running delayed or repeating 
tasks.|Math.min(Utils.cpus() * 3, 20) (core, max == Integer.MAX_VALUE, 
DelayedWorkQueue)|
|JRaft-Request-Processor|A default pool for handling RAFT requests. Should 
never be blocked.|Utils.cpus() * 6 (core == max)|
|JRaft-Response-Processor|A default pool for handling RAFT responses. Should 
never be blocked.|80 (core == max/3, workQueue == 10000)|
|JRaft-AppendEntries-Processor|A pool of single thread executors. Used only if 
a replication pipelining is enabled. Handles append entries requests and 
responses (used by the replication flow). Threads are started on demand. Each 
replication pair (leader-follower) uses dedicated single thread executor from 
the pool, so all messages between replication peer pairs are processed 
sequentially.|SystemPropertyUtil.getInt(
"jraft.append.entries.threads.send", Math.max(16, 
Ints.findNextPositivePowerOfTwo(cpus() * 2)));|
|NodeImpl-Disruptor|A striped disruptor for batching FSM (finite state machine) 
user tasks.|DEFAULT_STRIPES = Utils.cpus() * 2|
|ReadOnlyService-Disruptor|A striped disruptor for batching read requests 
before doing read index request.|DEFAULT_STRIPES = Utils.cpus() * 2|
|LogManager-Disruptor|A striped disruptor for delivering log entries to a 
storage.|DEFAULT_STRIPES = Utils.cpus() * 2|
|FSMCaller-Disruptor|A striped disruptor for FSM callbacks.|DEFAULT_STRIPES = 
Utils.cpus() * 2|
|SnapshotTimer|A timer for periodic snapshot creation.|Math.min(Utils.cpus() * 
3, 20) (core, max == Integer.MAX_VALUE)|
|ElectionTimer|A timer to handle election timeout on 
followers.|Math.min(Utils.cpus() * 3, 20) (core, max == Integer.MAX_VALUE)|
|VoteTimer|A timer to handle vote timeout when a leader was not confirmed by 
majority.|Math.min(Utils.cpus() * 3, 20) (core, max == Integer.MAX_VALUE)|
|StepDownTimer|A timer to process leader step down 
condition.|Math.min(Utils.cpus() * 3, 20) (core, max == Integer.MAX_VALUE)|


was (Author: maliev):
By default, all executors are shared among the instance of Loza, meaning that 
all raft groups share executors.

Below I've represented all JRaft executors with short description and the 
number of threads  
||Pool name||Description||Number of Threads||
|JRaft-Common-Executor|A pool for processing short-lived asynchronous tasks. 
Should never be blocked.|Utils.cpus() (core == max)|
|JRaft-Node-Scheduler|A scheduled executor for running delayed or repeating 
tasks.|Math.min(Utils.cpus() * 3, 20) (core, max == Integer.MAX_VALUE)|
|JRaft-Request-Processor|A default pool for handling RAFT requests. Should 
never be blocked.|Utils.cpus() * 6 (core == max)|
|JRaft-Response-Processor|A default pool for handling RAFT responses. Should 
never be blocked.|80 (core == max/3, workQueue == 10000)|
|JRaft-AppendEntries-Processor|A pool of single thread executors. Used only if 
a replication pipelining is enabled. Handles append entries requests and 
responses (used by the replication flow). Threads are started on demand. Each 
replication pair (leader-follower) uses dedicated single thread executor from 
the pool, so all messages between replication peer pairs are processed 
sequentially.|SystemPropertyUtil.getInt(
"jraft.append.entries.threads.send", Math.max(16, 
Ints.findNextPositivePowerOfTwo(cpus() * 2)));|
|NodeImpl-Disruptor|A striped disruptor for batching FSM (finite state machine) 
user tasks.|DEFAULT_STRIPES = Utils.cpus() * 2|
|ReadOnlyService-Disruptor|A striped disruptor for batching read requests 
before doing read index request.|DEFAULT_STRIPES = Utils.cpus() * 2|
|LogManager-Disruptor|A striped disruptor for delivering log entries to a 
storage.|DEFAULT_STRIPES = Utils.cpus() * 2|
|FSMCaller-Disruptor|A striped disruptor for FSM callbacks.|DEFAULT_STRIPES = 
Utils.cpus() * 2|
|SnapshotTimer|A timer for periodic snapshot creation.|Math.min(Utils.cpus() * 
3, 20) (core, max == Integer.MAX_VALUE)|
|ElectionTimer|A timer to handle election timeout on 
followers.|Math.min(Utils.cpus() * 3, 20) (core, max == Integer.MAX_VALUE)|
|VoteTimer|A timer to handle vote timeout when a leader was not confirmed by 
majority.|Math.min(Utils.cpus() * 3, 20) (core, max == Integer.MAX_VALUE)|
|StepDownTimer|A timer to process leader step down 
condition.|Math.min(Utils.cpus() * 3, 20) (core, max == Integer.MAX_VALUE)|

> Revisit the configuration of thread pools used by JRaft
> -------------------------------------------------------
>
>                 Key: IGNITE-20165
>                 URL: https://issues.apache.org/jira/browse/IGNITE-20165
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Aleksandr Polovtcev
>            Assignee: Vyacheslav Koptilin
>            Priority: Major
>              Labels: ignite-3
>
> JRaft uses a bunch of thread pools to execute its operations. Most of these 
> thread pools use the number of CPUs to determine the amount of threads they 
> can use. For example, as described in IGNITE-20080, having 64 cores led to 
> JRaft allocating around 600 threads. Even though these thread pools are 
> shared between all Raft nodes, this approach is clearly sub-optimal, because 
> it should take into account both the amount of nodes as well as the number of 
> processors. It may also be beneficial to revise the amount of thread pools 
> used and why they are needed and reduce their number, if possible.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to