[jira] [Commented] (IGNITE-3195) Rebalancing: IgniteConfiguration.rebalanceThreadPoolSize is wrongly treated
[ https://issues.apache.org/jira/browse/IGNITE-3195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16918335#comment-16918335 ]

Anton Vinogradov commented on IGNITE-3195:
------------------------------------------

Forgot to record this: Solution merged to the master branch.

> Rebalancing: IgniteConfiguration.rebalanceThreadPoolSize is wrongly treated
> ---------------------------------------------------------------------------
>
>                 Key: IGNITE-3195
>                 URL: https://issues.apache.org/jira/browse/IGNITE-3195
>             Project: Ignite
>          Issue Type: Bug
>          Components: cache
>            Reporter: Denis Magda
>            Assignee: Anton Vinogradov
>            Priority: Major
>              Labels: iep-16
>             Fix For: 2.8
>
>          Time Spent: 4h
>  Remaining Estimate: 0h
>
> Presently it's considered that the maximum number of threads that has to
> process all demand and supply messages coming from all the nodes must not be
> bigger than {{IgniteConfiguration.rebalanceThreadPoolSize}}.
> Current implementation relies on ordered messages functionality creating a
> number of topics equal to {{IgniteConfiguration.rebalanceThreadPoolSize}}.
> However, the implementation doesn't take into account that ordered messages,
> that correspond to a particular topic, are processed in parallel for
> different nodes. Refer to the implementation of
> {{GridIoManager.processOrderedMessage}} to see that for every topic there
> will be a unique {{GridCommunicationMessageSet}} for every node.
> Also to prove that this is true you can refer to this execution stack
> {noformat}
> java.lang.RuntimeException: HAPPENED DEMAND
>     at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:378)
>     at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:364)
>     at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:622)
>     at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:320)
>     at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$300(GridCacheIoManager.java:81)
>     at org.apache.ignite.internal.processors.cache.GridCacheIoManager$OrderedMessageListener.onMessage(GridCacheIoManager.java:1125)
>     at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1219)
>     at org.apache.ignite.internal.managers.communication.GridIoManager.access$1600(GridIoManager.java:105)
>     at org.apache.ignite.internal.managers.communication.GridIoManager$GridCommunicationMessageSet.unwind(GridIoManager.java:2456)
>     at org.apache.ignite.internal.managers.communication.GridIoManager.unwindMessageSet(GridIoManager.java:1179)
>     at org.apache.ignite.internal.managers.communication.GridIoManager.access$1900(GridIoManager.java:105)
>     at org.apache.ignite.internal.managers.communication.GridIoManager$6.run(GridIoManager.java:1148)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> {noformat}
> All this means that in fact the number of threads that will be busy with
> replication activity will be equal to
> {{IgniteConfiguration.rebalanceThreadPoolSize}} x
> number_of_nodes_participated_in_rebalancing

--
This message was sent by Atlassian Jira
(v8.3.2#803003)
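
The arithmetic at the end of the issue description can be illustrated with a small sketch. This is not Ignite code: the class `RebalanceThreadEstimate` and its method are hypothetical, and the only thing taken from the description is the claim that every (topic, node) pair gets its own message set, each of which may be unwound by a separate thread.

```java
import java.util.HashSet;
import java.util.Set;

public class RebalanceThreadEstimate {
    /**
     * Counts distinct (topic, node) pairs. Per the issue description, each pair
     * has its own message set, so this is the possible number of busy threads.
     * Hypothetical helper, not part of the Ignite API.
     */
    public static int messageSets(int rebalanceThreadPoolSize, int nodes) {
        Set<String> sets = new HashSet<>();

        for (int topic = 0; topic < rebalanceThreadPoolSize; topic++)
            for (int node = 0; node < nodes; node++)
                sets.add(topic + ":" + node); // one message set per topic per node

        return sets.size();
    }

    public static void main(String[] args) {
        int poolSize = 4; // value of rebalanceThreadPoolSize (example value)
        int nodes = 5;    // nodes participating in rebalancing (example value)

        System.out.println("intended limit: " + poolSize);
        System.out.println("possible busy threads: " + messageSets(poolSize, nodes));
    }
}
```

With the example values the intended limit is 4 threads, while the per-node message sets allow 4 x 5 = 20.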
[jira] [Commented] (IGNITE-3195) Rebalancing: IgniteConfiguration.rebalanceThreadPoolSize is wrongly treated
[ https://issues.apache.org/jira/browse/IGNITE-3195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16918302#comment-16918302 ]

Anton Vinogradov commented on IGNITE-3195:
------------------------------------------

[~Jokser]
Using only the striped pool would give poor performance.
See the benchmark results above.
[jira] [Commented] (IGNITE-3195) Rebalancing: IgniteConfiguration.rebalanceThreadPoolSize is wrongly treated
[ https://issues.apache.org/jira/browse/IGNITE-3195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16917846#comment-16917846 ]

Pavel Kovalenko commented on IGNITE-3195:
-----------------------------------------

[~avinogradov] I have a question regarding the completed change. Why are 2 thread pools used for rebalancing? Why is it worse if we leave only a striped pool for all messages?
[jira] [Commented] (IGNITE-3195) Rebalancing: IgniteConfiguration.rebalanceThreadPoolSize is wrongly treated
[ https://issues.apache.org/jira/browse/IGNITE-3195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16917802#comment-16917802 ]

Alexei Scherbakov commented on IGNITE-3195:
-------------------------------------------

[~avinogradov]

1. This theoretically should work. I'm going to contribute a bunch of follow-up fixes by IGNITE-12038 very soon, let's check TC again after.
2. OK, looks like ordered messages should provide necessary ordering.

No more objections from me, thanks.
[jira] [Commented] (IGNITE-3195) Rebalancing: IgniteConfiguration.rebalanceThreadPoolSize is wrongly treated
[ https://issues.apache.org/jira/browse/IGNITE-3195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16917646#comment-16917646 ]

Maxim Muzafarov commented on IGNITE-3195:
-----------------------------------------

[~avinogradov],

All the minor issues have been discussed with you privately and resolved.
Changes look good to me, please go ahead.
[jira] [Commented] (IGNITE-3195) Rebalancing: IgniteConfiguration.rebalanceThreadPoolSize is wrongly treated
[ https://issues.apache.org/jira/browse/IGNITE-3195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16917619#comment-16917619 ]

Ignite TC Bot commented on IGNITE-3195:
---------------------------------------

{panel:title=Branch: [pull/6688/head] Base: [master] : No blockers found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel}
[TeamCity *--> Run :: All* Results|https://ci.ignite.apache.org/viewLog.html?buildId=4537951&buildTypeId=IgniteTests24Java8_RunAll]
[jira] [Commented] (IGNITE-3195) Rebalancing: IgniteConfiguration.rebalanceThreadPoolSize is wrongly treated
[ https://issues.apache.org/jira/browse/IGNITE-3195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16917614#comment-16917614 ]

Anton Vinogradov commented on IGNITE-3195:
------------------------------------------

Discussed privately with [~ascherbakov].

1) Historical rebalance semantics were not changed because of data loss found in the TeamCity regression. Issue IGNITE-12117 was created to un-stripe it.
2) Reordering is not possible during message registration since an ordered topic is used, so there is no problem here.
3) It is possible to have X2 load in the following cases:
3.1 a node supplies and demands simultaneously;
3.2 a node demands in a historical and a regular way simultaneously, which does not seem to be a production case.

3.1 can be solved by baseline and good affinity.
3.2 is almost impossible in a real environment: some partitions would have to be rebalanced using historical rebalance while others are rebalanced using a regular one.
[jira] [Commented] (IGNITE-3195) Rebalancing: IgniteConfiguration.rebalanceThreadPoolSize is wrongly treated
[ https://issues.apache.org/jira/browse/IGNITE-3195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16916855#comment-16916855 ]

Alexei Scherbakov commented on IGNITE-3195:
-------------------------------------------

[~avinogradov]

Overall fix looks good, but I think we could improve it.

1. Looks like it's safe to remove ordering for historical rebalance, because after IGNITE-10078 the rmvQueue for a partition is no longer cleared during rebalance and removals cannot be lost. Given that, we could use a single thread pool for historical and full rebalance and parallelize historical rebalance on the supplier side in the same way as full. This is the right thing to do because from the user's point of view there is no difference between full and historical rebalance, and they can happen simultaneously. Note, a proper fix for writing tombstones is on the way [1].

2. Looks like the current implementation for detecting partition completion on concurrent processing using *queued* and *processed* is flawed. Consider the scenario:
Demander sends the initial demand request for a single partition.
Supplier replies with 2 supply messages in total, which start to be processed in parallel.
The 2nd message is the last one.
The 2nd message starts processing first and increments *queued* to N (the number of entries in the message).
The 2nd message finishes processing, incrementing *processed* to N.
Because this is the last message, the partition will be owned before the other messages are applied.

[1] https://issues.apache.org/jira/browse/IGNITE-11704
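
The scenario in point 2 can be replayed deterministically with a minimal sketch. It is not the Ignite implementation: the class `PartitionCompletionSketch` is hypothetical, and it assumes the completion check is "the last message has been handled and *queued* equals *processed*", which is one reading of the comment above.

```java
import java.util.concurrent.atomic.AtomicLong;

public class PartitionCompletionSketch {
    private final AtomicLong queued = new AtomicLong();
    private final AtomicLong processed = new AtomicLong();

    /** Entries applied at the moment the partition was marked owned; -1 while not owned. */
    private long appliedAtOwn = -1;

    /** Hypothetical handler for one supply message carrying the given number of entries. */
    public void handle(int entries, boolean last) {
        queued.addAndGet(entries);                // message starts processing
        long done = processed.addAndGet(entries); // message finishes processing

        // Assumed completion check: last message seen and the counters match.
        if (last && appliedAtOwn < 0 && done == queued.get())
            appliedAtOwn = done;
    }

    public long appliedAtOwn() {
        return appliedAtOwn;
    }

    public long processedTotal() {
        return processed.get();
    }

    public static void main(String[] args) {
        PartitionCompletionSketch part = new PartitionCompletionSketch();

        part.handle(50, true);   // the 2nd (last) message happens to run first
        System.out.println("entries applied when owned: " + part.appliedAtOwn());

        part.handle(100, false); // the 1st message is applied only afterwards
        System.out.println("entries applied in total: " + part.processedTotal());
    }
}
```

In this replay the partition is marked owned after 50 of 150 entries, because the counters only see messages that have already started processing.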
[jira] [Commented] (IGNITE-3195) Rebalancing: IgniteConfiguration.rebalanceThreadPoolSize is wrongly treated
[ https://issues.apache.org/jira/browse/IGNITE-3195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16916567#comment-16916567 ] Alexei Scherbakov commented on IGNITE-3195: --- [~avinogradov] I'll take a look. > Rebalancing: IgniteConfiguration.rebalanceThreadPoolSize is wrongly treated > --- > > Key: IGNITE-3195 > URL: https://issues.apache.org/jira/browse/IGNITE-3195 > Project: Ignite > Issue Type: Bug > Components: cache >Reporter: Denis Magda >Assignee: Anton Vinogradov >Priority: Major > Labels: iep-16 > Fix For: 2.8 > > Time Spent: 3h 50m > Remaining Estimate: 0h > > Presently it's considered that the maximum number of threads that has to > process all demand and supply messages coming from all the nodes must not be > bigger than {{IgniteConfiguration.rebalanceThreadPoolSize}}. > Current implementation relies on ordered messages functionality creating a > number of topics equal to {{IgniteConfiguration.rebalanceThreadPoolSize}}. > However, the implementation doesn't take into account that ordered messages, > that correspond to a particular topic, are processed in parallel for > different nodes. Refer to the implementation of > {{GridIoManager.processOrderedMessage}} to see that for every topic there > will be a unique {{GridCommunicationMessageSet}} for every node. 
> Also to prove that this is true you can refer to this execution stack > {noformat} > java.lang.RuntimeException: HAPPENED DEMAND > at > org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:378) > at > org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:364) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:622) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:320) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$300(GridCacheIoManager.java:81) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager$OrderedMessageListener.onMessage(GridCacheIoManager.java:1125) > at > org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1219) > at > org.apache.ignite.internal.managers.communication.GridIoManager.access$1600(GridIoManager.java:105) > at > org.apache.ignite.internal.managers.communication.GridIoManager$GridCommunicationMessageSet.unwind(GridIoManager.java:2456) > at > org.apache.ignite.internal.managers.communication.GridIoManager.unwindMessageSet(GridIoManager.java:1179) > at > org.apache.ignite.internal.managers.communication.GridIoManager.access$1900(GridIoManager.java:105) > at > org.apache.ignite.internal.managers.communication.GridIoManager$6.run(GridIoManager.java:1148) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} > All this means that in fact the number of threads that will be busy with > replication activity will be equal to > {{IgniteConfiguration.rebalanceThreadPoolSize}} x > number_of_nodes_participated_in_rebalancing -- This 
message was sent by Atlassian Jira (v8.3.2#803003)
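The arithmetic in the quoted description can be sketched as follows. This is only an illustration of the formula stated there ({{IgniteConfiguration.rebalanceThreadPoolSize}} x number_of_nodes_participated_in_rebalancing); the class and method names are hypothetical and are not part of Ignite, and the pool size and node count are example values.

```java
// Hypothetical helper, NOT part of Ignite: illustrates the arithmetic from the issue description.
final class RebalanceThreadEstimate {
    /** Upper bound the setting is assumed to give: just the configured pool size. */
    static int expectedThreads(int rebalanceThreadPoolSize) {
        return rebalanceThreadPoolSize;
    }

    /**
     * Upper bound implied by the description: one message set per topic per node,
     * so the configured size is multiplied by the number of participating nodes.
     */
    static int effectiveThreads(int rebalanceThreadPoolSize, int nodesInRebalancing) {
        return Math.multiplyExact(rebalanceThreadPoolSize, nodesInRebalancing);
    }

    public static void main(String[] args) {
        int poolSize = 4; // example value for rebalanceThreadPoolSize
        int nodes = 10;   // example number of nodes participating in rebalancing

        System.out.println("expected:  " + expectedThreads(poolSize));         // prints "expected:  4"
        System.out.println("effective: " + effectiveThreads(poolSize, nodes)); // prints "effective: 40"
    }
}
```

With these example values the setting suggests 4 threads, while up to 40 may be busy according to the description.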
[jira] [Commented] (IGNITE-3195) Rebalancing: IgniteConfiguration.rebalanceThreadPoolSize is wrongly treated
[ https://issues.apache.org/jira/browse/IGNITE-3195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16912226#comment-16912226 ] Anton Vinogradov commented on IGNITE-3195: -- Folks, I've benchmarked the solution. One huge partition was rebalanced from one node to another using 4 threads; both nodes were started in the same JVM on my laptop. It took *170* seconds on master and *95* on my branch. The speedup is mostly caused by *unstriped* pool usage on the demand node. This does not mean that a regular rebalance will become twice as fast, but it does mean that the case of a long rebalance of the final partitions is now solved. Also, I've improved the striped pool so that it is eliminated on inactivity; the unstriped pool already had this feature. So you'll never see rebalance-*-poll in thread dumps if your cluster has performed no rebalance recently. [~Jokser], could you please join the review? > Rebalancing: IgniteConfiguration.rebalanceThreadPoolSize is wrongly treated > --- > > Key: IGNITE-3195 > URL: https://issues.apache.org/jira/browse/IGNITE-3195 > Project: Ignite > Issue Type: Bug > Components: cache >Reporter: Denis Magda >Assignee: Anton Vinogradov >Priority: Major > Labels: iep-16 > Fix For: 2.8 > > Time Spent: 3h 50m > Remaining Estimate: 0h > > Presently it's considered that the maximum number of threads that has to > process all demand and supply messages coming from all the nodes must not be > bigger than {{IgniteConfiguration.rebalanceThreadPoolSize}}. > Current implementation relies on ordered messages functionality creating a > number of topics equal to {{IgniteConfiguration.rebalanceThreadPoolSize}}. > However, the implementation doesn't take into account that ordered messages, > that correspond to a particular topic, are processed in parallel for > different nodes. Refer to the implementation of > {{GridIoManager.processOrderedMessage}} to see that for every topic there > will be a unique {{GridCommunicationMessageSet}} for every node. 
> Also to prove that this is true you can refer to this execution stack > {noformat} > java.lang.RuntimeException: HAPPENED DEMAND > at > org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:378) > at > org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:364) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:622) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:320) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$300(GridCacheIoManager.java:81) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager$OrderedMessageListener.onMessage(GridCacheIoManager.java:1125) > at > org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1219) > at > org.apache.ignite.internal.managers.communication.GridIoManager.access$1600(GridIoManager.java:105) > at > org.apache.ignite.internal.managers.communication.GridIoManager$GridCommunicationMessageSet.unwind(GridIoManager.java:2456) > at > org.apache.ignite.internal.managers.communication.GridIoManager.unwindMessageSet(GridIoManager.java:1179) > at > org.apache.ignite.internal.managers.communication.GridIoManager.access$1900(GridIoManager.java:105) > at > org.apache.ignite.internal.managers.communication.GridIoManager$6.run(GridIoManager.java:1148) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} > All this means that in fact the number of threads that will be busy with > replication activity will be equal to > {{IgniteConfiguration.rebalanceThreadPoolSize}} x > number_of_nodes_participated_in_rebalancing -- This 
message was sent by Atlassian Jira (v8.3.2#803003)
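Why an unstriped pool can help when one huge partition is rebalanced can be sketched with a toy model. The model below is an assumption made for illustration only, not the code from the pull request: it supposes that a striped pool binds every batch of a partition to one stripe (chosen here as partition id modulo thread count), that an unstriped pool lets any idle thread take the next batch, and that every batch costs one unit of time. All names are hypothetical.

```java
import java.util.Arrays;

// Toy model, NOT Ignite code: compares how long the busiest thread works
// under a striped and an unstriped pool, assuming unit cost per batch.
final class StripedVsUnstripedSketch {
    /** Striped model (assumption): a batch always goes to stripe (partId mod threads). */
    static int stripedMakespan(int[] batchPartitionIds, int threads) {
        int[] load = new int[threads];
        for (int partId : batchPartitionIds)
            load[Math.floorMod(partId, threads)]++;
        return Arrays.stream(load).max().orElse(0);
    }

    /** Unstriped model (assumption): each batch goes to the currently least-loaded thread. */
    static int unstripedMakespan(int[] batchPartitionIds, int threads) {
        int[] load = new int[threads];
        for (int ignored : batchPartitionIds) {
            int min = 0;
            for (int i = 1; i < threads; i++)
                if (load[i] < load[min])
                    min = i;
            load[min]++;
        }
        return Arrays.stream(load).max().orElse(0);
    }

    public static void main(String[] args) {
        // Eight batches that all belong to partition 0, processed by 4 threads.
        int[] batches = new int[8];

        System.out.println("striped:   " + stripedMakespan(batches, 4));   // prints "striped:   8"
        System.out.println("unstriped: " + unstripedMakespan(batches, 4)); // prints "unstriped: 2"
    }
}
```

In this model three of the four stripes stay idle while one processes everything, which matches the "some threads do nothing but some have a long queue" symptom reported in this thread. Real timings depend on much more than this count, as the benchmark numbers above show.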
[jira] [Commented] (IGNITE-3195) Rebalancing: IgniteConfiguration.rebalanceThreadPoolSize is wrongly treated
[ https://issues.apache.org/jira/browse/IGNITE-3195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16912133#comment-16912133 ] Ignite TC Bot commented on IGNITE-3195: --- {panel:title=Branch: [pull/6688/head] Base: [master] : No blockers found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel} [TeamCity *--> Run :: All* Results|https://ci.ignite.apache.org/viewLog.html?buildId=4522863&buildTypeId=IgniteTests24Java8_RunAll] > Rebalancing: IgniteConfiguration.rebalanceThreadPoolSize is wrongly treated > --- > > Key: IGNITE-3195 > URL: https://issues.apache.org/jira/browse/IGNITE-3195 > Project: Ignite > Issue Type: Bug > Components: cache >Reporter: Denis Magda >Assignee: Anton Vinogradov >Priority: Major > Labels: iep-16 > Fix For: 2.8 > > Time Spent: 3h 50m > Remaining Estimate: 0h > > Presently it's considered that the maximum number of threads that has to > process all demand and supply messages coming from all the nodes must not be > bigger than {{IgniteConfiguration.rebalanceThreadPoolSize}}. > Current implementation relies on ordered messages functionality creating a > number of topics equal to {{IgniteConfiguration.rebalanceThreadPoolSize}}. > However, the implementation doesn't take into account that ordered messages, > that correspond to a particular topic, are processed in parallel for > different nodes. Refer to the implementation of > {{GridIoManager.processOrderedMessage}} to see that for every topic there > will be a unique {{GridCommunicationMessageSet}} for every node. 
> Also to prove that this is true you can refer to this execution stack > {noformat} > java.lang.RuntimeException: HAPPENED DEMAND > at > org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:378) > at > org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:364) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:622) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:320) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$300(GridCacheIoManager.java:81) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager$OrderedMessageListener.onMessage(GridCacheIoManager.java:1125) > at > org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1219) > at > org.apache.ignite.internal.managers.communication.GridIoManager.access$1600(GridIoManager.java:105) > at > org.apache.ignite.internal.managers.communication.GridIoManager$GridCommunicationMessageSet.unwind(GridIoManager.java:2456) > at > org.apache.ignite.internal.managers.communication.GridIoManager.unwindMessageSet(GridIoManager.java:1179) > at > org.apache.ignite.internal.managers.communication.GridIoManager.access$1900(GridIoManager.java:105) > at > org.apache.ignite.internal.managers.communication.GridIoManager$6.run(GridIoManager.java:1148) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} > All this means that in fact the number of threads that will be busy with > replication activity will be equal to > {{IgniteConfiguration.rebalanceThreadPoolSize}} x > number_of_nodes_participated_in_rebalancing -- This 
message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (IGNITE-3195) Rebalancing: IgniteConfiguration.rebalanceThreadPoolSize is wrongly treated
[ https://issues.apache.org/jira/browse/IGNITE-3195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16893344#comment-16893344 ] Ignite TC Bot commented on IGNITE-3195: --- {panel:title=Branch: [pull/6688/head] Base: [master] : No blockers found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel} [TeamCity *--> Run :: All* Results|https://ci.ignite.apache.org/viewLog.html?buildId=4394852&buildTypeId=IgniteTests24Java8_RunAll] > Rebalancing: IgniteConfiguration.rebalanceThreadPoolSize is wrongly treated > --- > > Key: IGNITE-3195 > URL: https://issues.apache.org/jira/browse/IGNITE-3195 > Project: Ignite > Issue Type: Bug > Components: cache >Reporter: Denis Magda >Assignee: Anton Vinogradov >Priority: Major > Labels: iep-16 > Fix For: 2.8 > > Time Spent: 3h > Remaining Estimate: 0h > > Presently it's considered that the maximum number of threads that has to > process all demand and supply messages coming from all the nodes must not be > bigger than {{IgniteConfiguration.rebalanceThreadPoolSize}}. > Current implementation relies on ordered messages functionality creating a > number of topics equal to {{IgniteConfiguration.rebalanceThreadPoolSize}}. > However, the implementation doesn't take into account that ordered messages, > that correspond to a particular topic, are processed in parallel for > different nodes. Refer to the implementation of > {{GridIoManager.processOrderedMessage}} to see that for every topic there > will be a unique {{GridCommunicationMessageSet}} for every node. 
> Also to prove that this is true you can refer to this execution stack > {noformat} > java.lang.RuntimeException: HAPPENED DEMAND > at > org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:378) > at > org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:364) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:622) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:320) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$300(GridCacheIoManager.java:81) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager$OrderedMessageListener.onMessage(GridCacheIoManager.java:1125) > at > org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1219) > at > org.apache.ignite.internal.managers.communication.GridIoManager.access$1600(GridIoManager.java:105) > at > org.apache.ignite.internal.managers.communication.GridIoManager$GridCommunicationMessageSet.unwind(GridIoManager.java:2456) > at > org.apache.ignite.internal.managers.communication.GridIoManager.unwindMessageSet(GridIoManager.java:1179) > at > org.apache.ignite.internal.managers.communication.GridIoManager.access$1900(GridIoManager.java:105) > at > org.apache.ignite.internal.managers.communication.GridIoManager$6.run(GridIoManager.java:1148) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} > All this means that in fact the number of threads that will be busy with > replication activity will be equal to > {{IgniteConfiguration.rebalanceThreadPoolSize}} x > number_of_nodes_participated_in_rebalancing -- This 
message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (IGNITE-3195) Rebalancing: IgniteConfiguration.rebalanceThreadPoolSize is wrongly treated
[ https://issues.apache.org/jira/browse/IGNITE-3195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16890034#comment-16890034 ] Maxim Muzafarov commented on IGNITE-3195: - [~avinogradov] Sure, I will take a look, shortly! > Rebalancing: IgniteConfiguration.rebalanceThreadPoolSize is wrongly treated > --- > > Key: IGNITE-3195 > URL: https://issues.apache.org/jira/browse/IGNITE-3195 > Project: Ignite > Issue Type: Bug > Components: cache >Reporter: Denis Magda >Assignee: Anton Vinogradov >Priority: Major > Labels: iep-16 > Fix For: 2.8 > > Time Spent: 50m > Remaining Estimate: 0h > > Presently it's considered that the maximum number of threads that has to > process all demand and supply messages coming from all the nodes must not be > bigger than {{IgniteConfiguration.rebalanceThreadPoolSize}}. > Current implementation relies on ordered messages functionality creating a > number of topics equal to {{IgniteConfiguration.rebalanceThreadPoolSize}}. > However, the implementation doesn't take into account that ordered messages, > that correspond to a particular topic, are processed in parallel for > different nodes. Refer to the implementation of > {{GridIoManager.processOrderedMessage}} to see that for every topic there > will be a unique {{GridCommunicationMessageSet}} for every node. 
> Also to prove that this is true you can refer to this execution stack > {noformat} > java.lang.RuntimeException: HAPPENED DEMAND > at > org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:378) > at > org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:364) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:622) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:320) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$300(GridCacheIoManager.java:81) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager$OrderedMessageListener.onMessage(GridCacheIoManager.java:1125) > at > org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1219) > at > org.apache.ignite.internal.managers.communication.GridIoManager.access$1600(GridIoManager.java:105) > at > org.apache.ignite.internal.managers.communication.GridIoManager$GridCommunicationMessageSet.unwind(GridIoManager.java:2456) > at > org.apache.ignite.internal.managers.communication.GridIoManager.unwindMessageSet(GridIoManager.java:1179) > at > org.apache.ignite.internal.managers.communication.GridIoManager.access$1900(GridIoManager.java:105) > at > org.apache.ignite.internal.managers.communication.GridIoManager$6.run(GridIoManager.java:1148) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} > All this means that in fact the number of threads that will be busy with > replication activity will be equal to > {{IgniteConfiguration.rebalanceThreadPoolSize}} x > number_of_nodes_participated_in_rebalancing -- This 
message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (IGNITE-3195) Rebalancing: IgniteConfiguration.rebalanceThreadPoolSize is wrongly treated
[ https://issues.apache.org/jira/browse/IGNITE-3195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16887745#comment-16887745 ] Anton Vinogradov commented on IGNITE-3195: -- [~Mmuzaf], [~xtern] I've prepared the PoC. Please pre-review the code. I'm going to run benchmarks in a real environment after your check. > Rebalancing: IgniteConfiguration.rebalanceThreadPoolSize is wrongly treated > --- > > Key: IGNITE-3195 > URL: https://issues.apache.org/jira/browse/IGNITE-3195 > Project: Ignite > Issue Type: Bug > Components: cache >Reporter: Denis Magda >Assignee: Anton Vinogradov >Priority: Major > Labels: iep-16 > Fix For: 2.8 > > Time Spent: 40m > Remaining Estimate: 0h > > Presently it's considered that the maximum number of threads that has to > process all demand and supply messages coming from all the nodes must not be > bigger than {{IgniteConfiguration.rebalanceThreadPoolSize}}. > Current implementation relies on ordered messages functionality creating a > number of topics equal to {{IgniteConfiguration.rebalanceThreadPoolSize}}. > However, the implementation doesn't take into account that ordered messages, > that correspond to a particular topic, are processed in parallel for > different nodes. Refer to the implementation of > {{GridIoManager.processOrderedMessage}} to see that for every topic there > will be a unique {{GridCommunicationMessageSet}} for every node. 
> Also to prove that this is true you can refer to this execution stack > {noformat} > java.lang.RuntimeException: HAPPENED DEMAND > at > org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:378) > at > org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:364) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:622) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:320) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$300(GridCacheIoManager.java:81) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager$OrderedMessageListener.onMessage(GridCacheIoManager.java:1125) > at > org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1219) > at > org.apache.ignite.internal.managers.communication.GridIoManager.access$1600(GridIoManager.java:105) > at > org.apache.ignite.internal.managers.communication.GridIoManager$GridCommunicationMessageSet.unwind(GridIoManager.java:2456) > at > org.apache.ignite.internal.managers.communication.GridIoManager.unwindMessageSet(GridIoManager.java:1179) > at > org.apache.ignite.internal.managers.communication.GridIoManager.access$1900(GridIoManager.java:105) > at > org.apache.ignite.internal.managers.communication.GridIoManager$6.run(GridIoManager.java:1148) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} > All this means that in fact the number of threads that will be busy with > replication activity will be equal to > {{IgniteConfiguration.rebalanceThreadPoolSize}} x > number_of_nodes_participated_in_rebalancing -- This 
message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (IGNITE-3195) Rebalancing: IgniteConfiguration.rebalanceThreadPoolSize is wrongly treated
[ https://issues.apache.org/jira/browse/IGNITE-3195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16887742#comment-16887742 ] Ignite TC Bot commented on IGNITE-3195: --- {panel:title=--> Run :: All: No blockers found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel} [TeamCity *--> Run :: All* Results|https://ci.ignite.apache.org/viewLog.html?buildId=4347484&buildTypeId=IgniteTests24Java8_RunAll] > Rebalancing: IgniteConfiguration.rebalanceThreadPoolSize is wrongly treated > --- > > Key: IGNITE-3195 > URL: https://issues.apache.org/jira/browse/IGNITE-3195 > Project: Ignite > Issue Type: Bug > Components: cache >Reporter: Denis Magda >Assignee: Anton Vinogradov >Priority: Major > Labels: iep-16 > Fix For: 2.8 > > Time Spent: 40m > Remaining Estimate: 0h > > Presently it's considered that the maximum number of threads that has to > process all demand and supply messages coming from all the nodes must not be > bigger than {{IgniteConfiguration.rebalanceThreadPoolSize}}. > Current implementation relies on ordered messages functionality creating a > number of topics equal to {{IgniteConfiguration.rebalanceThreadPoolSize}}. > However, the implementation doesn't take into account that ordered messages, > that correspond to a particular topic, are processed in parallel for > different nodes. Refer to the implementation of > {{GridIoManager.processOrderedMessage}} to see that for every topic there > will be a unique {{GridCommunicationMessageSet}} for every node. 
> Also to prove that this is true you can refer to this execution stack > {noformat} > java.lang.RuntimeException: HAPPENED DEMAND > at > org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:378) > at > org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:364) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:622) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:320) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$300(GridCacheIoManager.java:81) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager$OrderedMessageListener.onMessage(GridCacheIoManager.java:1125) > at > org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1219) > at > org.apache.ignite.internal.managers.communication.GridIoManager.access$1600(GridIoManager.java:105) > at > org.apache.ignite.internal.managers.communication.GridIoManager$GridCommunicationMessageSet.unwind(GridIoManager.java:2456) > at > org.apache.ignite.internal.managers.communication.GridIoManager.unwindMessageSet(GridIoManager.java:1179) > at > org.apache.ignite.internal.managers.communication.GridIoManager.access$1900(GridIoManager.java:105) > at > org.apache.ignite.internal.managers.communication.GridIoManager$6.run(GridIoManager.java:1148) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} > All this means that in fact the number of threads that will be busy with > replication activity will be equal to > {{IgniteConfiguration.rebalanceThreadPoolSize}} x > number_of_nodes_participated_in_rebalancing -- This 
message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (IGNITE-3195) Rebalancing: IgniteConfiguration.rebalanceThreadPoolSize is wrongly treated
[ https://issues.apache.org/jira/browse/IGNITE-3195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16879060#comment-16879060 ] Anton Vinogradov commented on IGNITE-3195: -- Folks, we have a lot of problems with rebalance: 1) the rebalance thread pool size is uncontrolled; 2) during a long rebalance it is possible that some threads do nothing while others have a long queue; 3) the code looks bad and should be refactored (simplified!): I see a lot of things checked 2+ times in different places, and some checks and actions are useless. Currently I'm at the refactoring stage. I've just removed half of the code, and now it is possible to understand how it works :) > Rebalancing: IgniteConfiguration.rebalanceThreadPoolSize is wrongly treated > --- > > Key: IGNITE-3195 > URL: https://issues.apache.org/jira/browse/IGNITE-3195 > Project: Ignite > Issue Type: Bug > Components: cache >Reporter: Denis Magda >Assignee: Anton Vinogradov >Priority: Major > Labels: iep-16 > Fix For: 2.8 > > Time Spent: 10m > Remaining Estimate: 0h > > Presently it's considered that the maximum number of threads that has to > process all demand and supply messages coming from all the nodes must not be > bigger than {{IgniteConfiguration.rebalanceThreadPoolSize}}. > Current implementation relies on ordered messages functionality creating a > number of topics equal to {{IgniteConfiguration.rebalanceThreadPoolSize}}. > However, the implementation doesn't take into account that ordered messages, > that correspond to a particular topic, are processed in parallel for > different nodes. Refer to the implementation of > {{GridIoManager.processOrderedMessage}} to see that for every topic there > will be a unique {{GridCommunicationMessageSet}} for every node. 
> Also to prove that this is true you can refer to this execution stack > {noformat} > java.lang.RuntimeException: HAPPENED DEMAND > at > org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:378) > at > org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:364) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:622) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:320) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$300(GridCacheIoManager.java:81) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager$OrderedMessageListener.onMessage(GridCacheIoManager.java:1125) > at > org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1219) > at > org.apache.ignite.internal.managers.communication.GridIoManager.access$1600(GridIoManager.java:105) > at > org.apache.ignite.internal.managers.communication.GridIoManager$GridCommunicationMessageSet.unwind(GridIoManager.java:2456) > at > org.apache.ignite.internal.managers.communication.GridIoManager.unwindMessageSet(GridIoManager.java:1179) > at > org.apache.ignite.internal.managers.communication.GridIoManager.access$1900(GridIoManager.java:105) > at > org.apache.ignite.internal.managers.communication.GridIoManager$6.run(GridIoManager.java:1148) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} > All this means that in fact the number of threads that will be busy with > replication activity will be equal to > {{IgniteConfiguration.rebalanceThreadPoolSize}} x > number_of_nodes_participated_in_rebalancing -- This 
message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-3195) Rebalancing: IgniteConfiguration.rebalanceThreadPoolSize is wrongly treated
[ https://issues.apache.org/jira/browse/IGNITE-3195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16879031#comment-16879031 ] Stanilovsky Evgeny commented on IGNITE-3195: Looks like we hit the same problem on BLT change: {code:java}
2019-07-04 06:29:03.649[WARN ][sys-#328%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.p.pagemem.PageMemoryImpl] Parking thread=sys-#328%DPL_GRID%DplGridNodeName% for timeout(ms)=16335
2019-07-04 06:29:03.649[WARN ][sys-#326%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.p.pagemem.PageMemoryImpl] Parking thread=sys-#326%DPL_GRID%DplGridNodeName% for timeout(ms)=13438
2019-07-04 06:29:03.649[WARN ][sys-#277%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.p.pagemem.PageMemoryImpl] Parking thread=sys-#277%DPL_GRID%DplGridNodeName% for timeout(ms)=11609
2019-07-04 06:29:03.649[WARN ][sys-#331%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.p.pagemem.PageMemoryImpl] Parking thread=sys-#331%DPL_GRID%DplGridNodeName% for timeout(ms)=18009
2019-07-04 06:29:03.649[WARN ][sys-#321%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.p.pagemem.PageMemoryImpl] Parking thread=sys-#321%DPL_GRID%DplGridNodeName% for timeout(ms)=15557
2019-07-04 06:29:03.650[WARN ][sys-#307%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.p.pagemem.PageMemoryImpl] Parking thread=sys-#307%DPL_GRID%DplGridNodeName% for timeout(ms)=27938
2019-07-04 06:29:03.649[WARN ][sys-#316%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.p.pagemem.PageMemoryImpl] Parking thread=sys-#316%DPL_GRID%DplGridNodeName% for timeout(ms)=12189
2019-07-04 06:29:03.649[WARN ][sys-#311%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.p.pagemem.PageMemoryImpl] Parking thread=sys-#311%DPL_GRID%DplGridNodeName% for timeout(ms)=11056
2019-07-04 06:29:03.650[WARN ][sys-#295%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.p.pagemem.PageMemoryImpl] Parking thread=sys-#295%DPL_GRID%DplGridNodeName% for timeout(ms)=20848
2019-07-04 06:29:03.649[WARN ][sys-#290%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.p.pagemem.PageMemoryImpl] Parking thread=sys-#290%DPL_GRID%DplGridNodeName% for timeout(ms)=14816
2019-07-04 06:29:03.649[WARN ][sys-#332%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.p.pagemem.PageMemoryImpl] Parking thread=sys-#332%DPL_GRID%DplGridNodeName% for timeout(ms)=14110
2019-07-04 06:29:03.649[WARN ][sys-#298%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.p.pagemem.PageMemoryImpl] Parking thread=sys-#298%DPL_GRID%DplGridNodeName% for timeout(ms)=10028
2019-07-04 06:29:03.650[WARN ][sys-#304%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.p.pagemem.PageMemoryImpl] Parking thread=sys-#304%DPL_GRID%DplGridNodeName% for timeout(ms)=19855
2019-07-04 06:29:03.650[WARN ][sys-#331%DPL_GRID%DplGridNodeName%][o.a.i.i.p.c.p.pagemem.PageMemoryImpl] Parking thread=sys-#331%DPL_GRID%DplGridNodeName% for timeout(ms)=41277
... and so on
{code}
> Rebalancing: IgniteConfiguration.rebalanceThreadPoolSize is wrongly treated > --- > > Key: IGNITE-3195 > URL: https://issues.apache.org/jira/browse/IGNITE-3195 > Project: Ignite > Issue Type: Bug > Components: cache >Reporter: Denis Magda >Assignee: Anton Vinogradov >Priority: Major > Labels: iep-16 > Fix For: 2.8 > > Time Spent: 10m > Remaining Estimate: 0h > > Presently it's considered that the maximum number of threads that has to > process all demand and supply messages coming from all the nodes must not be > bigger than {{IgniteConfiguration.rebalanceThreadPoolSize}}. > Current implementation relies on ordered messages functionality creating a > number of topics equal to {{IgniteConfiguration.rebalanceThreadPoolSize}}. > However, the implementation doesn't take into account that ordered messages, > that correspond to a particular topic, are processed in parallel for > different nodes. Refer to the implementation of > {{GridIoManager.processOrderedMessage}} to see that for every topic there > will be a unique {{GridCommunicationMessageSet}} for every node. 
> Also to prove that this is true you can refer to this execution stack > {noformat} > java.lang.RuntimeException: HAPPENED DEMAND > at > org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:378) > at > org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:364) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:622) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:320) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$300(GridCacheIoManager.java:81) > at > org.apache.ignite.internal.processors.cache.GridCacheIoMa
[jira] [Commented] (IGNITE-3195) Rebalancing: IgniteConfiguration.rebalanceThreadPoolSize is wrongly treated
    [ https://issues.apache.org/jira/browse/IGNITE-3195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16184135#comment-16184135 ]

Vladimir Ozerov commented on IGNITE-3195:
-----------------------------------------

Moved to 2.4 due to inactivity.

> Rebalancing: IgniteConfiguration.rebalanceThreadPoolSize is wrongly treated
> ---------------------------------------------------------------------------
>
>                 Key: IGNITE-3195
>                 URL: https://issues.apache.org/jira/browse/IGNITE-3195
>             Project: Ignite
>          Issue Type: Bug
>          Components: cache
>            Reporter: Denis Magda
>            Assignee: Anton Vinogradov
>             Fix For: 2.4
>
>
> Presently it's considered that the maximum number of threads that has to
> process all demand and supply messages coming from all the nodes must not be
> bigger than {{IgniteConfiguration.rebalanceThreadPoolSize}}.
> Current implementation relies on ordered messages functionality creating a
> number of topics equal to {{IgniteConfiguration.rebalanceThreadPoolSize}}.
> However, the implementation doesn't take into account that ordered messages,
> that correspond to a particular topic, are processed in parallel for
> different nodes. Refer to the implementation of
> {{GridIoManager.processOrderedMessage}} to see that for every topic there
> will be a unique {{GridCommunicationMessageSet}} for every node.
> Also to prove that this is true you can refer to this execution stack
> {noformat}
> java.lang.RuntimeException: HAPPENED DEMAND
>     at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:378)
>     at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:364)
>     at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:622)
>     at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:320)
>     at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$300(GridCacheIoManager.java:81)
>     at org.apache.ignite.internal.processors.cache.GridCacheIoManager$OrderedMessageListener.onMessage(GridCacheIoManager.java:1125)
>     at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1219)
>     at org.apache.ignite.internal.managers.communication.GridIoManager.access$1600(GridIoManager.java:105)
>     at org.apache.ignite.internal.managers.communication.GridIoManager$GridCommunicationMessageSet.unwind(GridIoManager.java:2456)
>     at org.apache.ignite.internal.managers.communication.GridIoManager.unwindMessageSet(GridIoManager.java:1179)
>     at org.apache.ignite.internal.managers.communication.GridIoManager.access$1900(GridIoManager.java:105)
>     at org.apache.ignite.internal.managers.communication.GridIoManager$6.run(GridIoManager.java:1148)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> {noformat}
> All this means that in fact the number of threads that will be busy with
> replication activity will be equal to
> {{IgniteConfiguration.rebalanceThreadPoolSize}} x
> number_of_nodes_participated_in_rebalancing

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
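
The arithmetic in the issue description above can be sketched in plain Java. This is only an illustration, not Ignite code: the class and method names are hypothetical, and it assumes, as the description states, one message set per (topic, node) pair, with all of them being unwound at the same moment. The result is therefore a worst-case bound rather than a measurement.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of the thread count described in IGNITE-3195.
// It does not call any Ignite API.
public class RebalanceThreadEstimate {
    /** The bound the configuration property is meant to enforce. */
    public static int intendedThreads(int rebalanceThreadPoolSize) {
        return rebalanceThreadPoolSize;
    }

    /**
     * Counts distinct (topic, node) pairs. Per the description, the number of
     * topics equals the pool size and every topic has a separate message set
     * for every node, so each pair can occupy its own thread.
     */
    public static int worstCaseThreads(int rebalanceThreadPoolSize, int nodes) {
        Set<String> messageSets = new HashSet<>();
        for (int topic = 0; topic < rebalanceThreadPoolSize; topic++) {
            for (int node = 0; node < nodes; node++) {
                messageSets.add(topic + ":" + node);
            }
        }
        return messageSets.size();
    }

    public static void main(String[] args) {
        int poolSize = 2; // assumed value of rebalanceThreadPoolSize
        int nodes = 4;    // assumed number of nodes taking part in rebalancing
        System.out.println("intended bound:   " + intendedThreads(poolSize));
        System.out.println("worst-case bound: " + worstCaseThreads(poolSize, nodes));
    }
}
```

With the assumed values of 2 topics and 4 nodes, the model gives 8 busy threads against an intended limit of 2, which is the pool size multiplied by the node count as stated in the description.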