[
https://issues.apache.org/jira/browse/IGNITE-3195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910374#comment-16910374
]
Anton Vinogradov edited comment on IGNITE-3195 at 8/21/19 11:58 AM:
--------------------------------------------------------------------
[~Mmuzaf], [~xtern]
I've updated the PR.
Now we have 2 thread pools for rebalance (any objections?).
1) A plain thread pool that handles supply messages when they are not
historical.
2) A striped pool that handles historical supply messages and all demand
messages.
The striped pool is hashed by node id.
It looks like we will be able to get rid of the striped pool in the future,
but for now it looks like a good and simple solution.
Historical rebalance can be reordered if we introduce tombstones for
removes.
The supplier can theoretically be rewritten as well.
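To make the routing rule above concrete, here is a minimal sketch of how the two pools could be wired. It is not the code from the PR: the class `RebalancePoolsSketch`, the `Kind` enum, and the methods `usesStripedPool`, `stripeIndex`, and `dispatch` are hypothetical names, and each stripe is assumed to be a single-threaded executor so that messages from one node stay ordered.

```java
import java.util.UUID;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Illustrative only: class, enum, and method names are hypothetical, not Ignite APIs. */
class RebalancePoolsSketch {
    /** Message kinds named in the comment above. */
    enum Kind { DEMAND, SUPPLY, HISTORICAL_SUPPLY }

    private final ExecutorService plainPool;
    private final ExecutorService[] stripes;

    RebalancePoolsSketch(int plainThreads, int stripeCnt) {
        plainPool = Executors.newFixedThreadPool(plainThreads);
        stripes = new ExecutorService[stripeCnt];

        // Assumption: one thread per stripe keeps per-node message order.
        for (int i = 0; i < stripeCnt; i++)
            stripes[i] = Executors.newSingleThreadExecutor();
    }

    /** Only non-historical supply messages go to the plain pool. */
    static boolean usesStripedPool(Kind kind) {
        return kind != Kind.SUPPLY;
    }

    /** Stripe is chosen by hashing the node id; floorMod avoids negative indexes. */
    static int stripeIndex(UUID nodeId, int stripeCnt) {
        return Math.floorMod(nodeId.hashCode(), stripeCnt);
    }

    void dispatch(UUID nodeId, Kind kind, Runnable handler) {
        ExecutorService target = usesStripedPool(kind)
            ? stripes[stripeIndex(nodeId, stripes.length)]
            : plainPool;

        target.execute(handler);
    }

    void shutdown() throws InterruptedException {
        plainPool.shutdown();
        for (ExecutorService s : stripes)
            s.shutdown();

        plainPool.awaitTermination(5, TimeUnit.SECONDS);
        for (ExecutorService s : stripes)
            s.awaitTermination(5, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        RebalancePoolsSketch pools = new RebalancePoolsSketch(4, 2);
        UUID node = UUID.randomUUID();

        pools.dispatch(node, Kind.SUPPLY, () -> System.out.println("plain pool: supply"));
        pools.dispatch(node, Kind.DEMAND, () -> System.out.println("striped pool: demand"));
        pools.dispatch(node, Kind.HISTORICAL_SUPPLY,
            () -> System.out.println("striped pool: historical supply"));

        pools.shutdown();
    }
}
```

In this sketch, all messages from one node that go to the striped pool land on the same stripe, so they are handled one at a time in submission order, while non-historical supply messages can be processed in parallel.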
was (Author: avinogradov):
[~Mmuzaf], [~xtern]
I've updated the PR.
Now we have 2 thread pools for rebalance (any objections?).
1) A plain thread pool that handles supply messages when they are not
historical.
2) A striped pool that handles historical supply messages and all demand
messages.
The striped pool is hashed by node id.
It looks like we will be able to get rid of the striped pool in the future,
but for now it looks like a good and simple solution.
Historical rebalance can be reordered if we introduce tombstones for
removes.
The supplier can theoretically be rewritten as well.
BTW, according to my checks, single-partition rebalance speed almost doubled
because the unstriped pool is used to handle supply messages.
So the addition to the previous message is one more pool (striped).
> Rebalancing: IgniteConfiguration.rebalanceThreadPoolSize is wrongly treated
> ---------------------------------------------------------------------------
>
> Key: IGNITE-3195
> URL: https://issues.apache.org/jira/browse/IGNITE-3195
> Project: Ignite
> Issue Type: Bug
> Components: cache
> Reporter: Denis Magda
> Assignee: Anton Vinogradov
> Priority: Major
> Labels: iep-16
> Fix For: 2.8
>
> Time Spent: 3h 50m
> Remaining Estimate: 0h
>
> Currently it is assumed that the number of threads processing all demand
> and supply messages coming from all the nodes never exceeds
> {{IgniteConfiguration.rebalanceThreadPoolSize}}.
> The current implementation relies on the ordered messages functionality and
> creates a number of topics equal to
> {{IgniteConfiguration.rebalanceThreadPoolSize}}.
> However, the implementation doesn't take into account that ordered messages
> belonging to a particular topic are processed in parallel for different
> nodes. Refer to the implementation of
> {{GridIoManager.processOrderedMessage}} to see that for every topic there
> will be a unique {{GridCommunicationMessageSet}} for every node.
> To confirm this, see the following execution stack:
> {noformat}
> java.lang.RuntimeException: HAPPENED DEMAND
> at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:378)
> at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:364)
> at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:622)
> at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:320)
> at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$300(GridCacheIoManager.java:81)
> at org.apache.ignite.internal.processors.cache.GridCacheIoManager$OrderedMessageListener.onMessage(GridCacheIoManager.java:1125)
> at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1219)
> at org.apache.ignite.internal.managers.communication.GridIoManager.access$1600(GridIoManager.java:105)
> at org.apache.ignite.internal.managers.communication.GridIoManager$GridCommunicationMessageSet.unwind(GridIoManager.java:2456)
> at org.apache.ignite.internal.managers.communication.GridIoManager.unwindMessageSet(GridIoManager.java:1179)
> at org.apache.ignite.internal.managers.communication.GridIoManager.access$1900(GridIoManager.java:105)
> at org.apache.ignite.internal.managers.communication.GridIoManager$6.run(GridIoManager.java:1148)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> All this means that in fact the number of threads that will be busy with
> replication activity will be equal to
> {{IgniteConfiguration.rebalanceThreadPoolSize}} x
> number_of_nodes_participated_in_rebalancing
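
As a rough illustration of the count stated in the description, the sketch below enumerates one independent processing unit per (topic, node) pair and counts them. The class `RebalanceThreadEstimate` and the method `maxBusyThreads` are hypothetical names, and the figures in `main` (2 topics, 5 nodes) are made-up inputs, not measurements.

```java
import java.util.HashSet;
import java.util.Set;

/** Illustrative only: names and inputs are hypothetical. */
class RebalanceThreadEstimate {
    /**
     * Counts distinct (topic, node) pairs, assuming each pair has its own
     * message set that can be unwound in parallel with the others.
     */
    static int maxBusyThreads(int rebalanceThreadPoolSize, int nodes) {
        Set<String> messageSets = new HashSet<>();

        for (int topic = 0; topic < rebalanceThreadPoolSize; topic++)
            for (int node = 0; node < nodes; node++)
                messageSets.add(topic + ":" + node);

        return messageSets.size();
    }

    public static void main(String[] args) {
        // 2 topics x 5 nodes
        System.out.println(maxBusyThreads(2, 5)); // prints 10
    }
}
```

With `rebalanceThreadPoolSize` set to 2 and 5 nodes taking part in rebalancing, up to 10 threads may be busy under this assumption, rather than the 2 that the setting suggests.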