[jira] [Commented] (ARTEMIS-2174) Broker reconnect to another with scale down policy cause OOM

2018-11-14 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/ARTEMIS-2174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16686852#comment-16686852
 ] 

ASF subversion and git services commented on ARTEMIS-2174:
--

Commit 173b21e6e23828f595ff9101e043071e69d9aa8a in activemq-artemis's branch 
refs/heads/2.6.x from [~gaohoward]
[ https://git-wip-us.apache.org/repos/asf?p=activemq-artemis.git;h=173b21e ]

ARTEMIS-2174 Broker reconnect cause OOM with scale down

When a node tries to reconnect to another node in a scale down cluster,
the reconnect request gets denied by the other node and keeps retrying,
which causes tasks in the ordered executor to accumulate and eventually
leads to an OOM.

The fix is to change ActiveMQPacketHandler#handleCheckForFailover
to allow the reconnect if the scale down node is the node itself.

(cherry picked from commit 6e89b22eaae8cd82852ae3d0a643bb3502cf994c)
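
The decision described in the commit message above can be sketched roughly
as follows. This is a minimal illustration, not the Artemis source: the class
name, the method name, and all three parameters are hypothetical, and the
existing scale down check is reduced to a single boolean because its details
are not given in this thread.

```java
// Hypothetical sketch of the reconnect decision; not the real
// ActiveMQPacketHandler#handleCheckForFailover implementation.
class FailoverCheckSketch {

    /**
     * @param requestedNodeId       node ID carried by the reconnect request (assumed name)
     * @param localNodeId           node ID of the broker handling the request (assumed name)
     * @param scaleDownPolicyDenies outcome of the pre-existing scale down check,
     *                              modelled here as a plain boolean (assumption)
     */
    static boolean allowReconnect(String requestedNodeId,
                                  String localNodeId,
                                  boolean scaleDownPolicyDenies) {
        // Case added by the fix: the scale down node is the node itself.
        if (requestedNodeId != null && requestedNodeId.equals(localNodeId)) {
            return true;
        }
        // Otherwise defer to whatever the scale down policy decided.
        return !scaleDownPolicyDenies;
    }

    public static void main(String[] args) {
        System.out.println(allowReconnect("node2", "node2", true));
        System.out.println(allowReconnect("node1", "node2", true));
    }
}
```

With this shape, a request that names the handling node is accepted even when
the scale down check alone would deny it, which is what stops the endless
retry loop described in the issue.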


> Broker reconnect to another with scale down policy cause OOM
> 
>
>                 Key: ARTEMIS-2174
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-2174
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 2.6.3
>            Reporter: Howard Gao
>            Assignee: Howard Gao
>            Priority: Major
>             Fix For: 2.6.4
>
>
> When a node tries to reconnect to another node in a scale down cluster, the 
> reconnect request gets denied by the other node and keeps retrying, which 
> causes tasks in the ordered executor to accumulate and eventually leads to 
> an OOM.
> To reproduce:
>  # Start a 2-node cluster (node1 and node2) configured in scale down mode.
>  # Stop node2 and restart it.
>  # node1 will try to reconnect to node2 repeatedly and never succeed.
>  # Inspect the connecting ClientSessionFactory (for example, by adding 
> logging): its thread pool (closeExecutor, an OrderedExecutor) keeps adding 
> tasks to its queue.
> Over time the queue keeps growing and will exhaust the heap memory.
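
The queue growth described above can be illustrated with a plain JDK
executor. This is only a sketch of the general mechanism, assuming a single
worker backed by an unbounded queue; it does not use Artemis's
OrderedExecutor, and the class and method names are made up for the example.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo: tasks submitted faster than they complete pile up
// in an unbounded queue, which is how heap usage can grow without limit.
class RetryBacklogDemo {

    /** Submits {@code retries} blocked tasks and returns how many are left queued. */
    static int backlogAfter(int retries) {
        CountDownLatch gate = new CountDownLatch(1);
        LinkedBlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
        // One worker thread and an unbounded queue: submissions are never rejected.
        ThreadPoolExecutor executor =
            new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS, queue);
        try {
            for (int i = 0; i < retries; i++) {
                // Each task stands in for work triggered by one failed retry.
                executor.execute(() -> {
                    try {
                        gate.await(); // never finishes while the gate is closed
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            // The first task occupies the worker; the rest wait in the queue.
            return queue.size();
        } finally {
            gate.countDown();    // let the pending tasks drain
            executor.shutdown(); // allow the JVM to exit
        }
    }

    public static void main(String[] args) {
        System.out.println("queued tasks: " + backlogAfter(1000));
    }
}
```

Because nothing bounds the queue, every additional retry adds another
retained task object, so an endless retry loop translates directly into
unbounded heap usage.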



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARTEMIS-2174) Broker reconnect to another with scale down policy cause OOM

2018-11-14 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/ARTEMIS-2174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16686682#comment-16686682
 ] 

ASF subversion and git services commented on ARTEMIS-2174:
--

Commit 6e89b22eaae8cd82852ae3d0a643bb3502cf994c in activemq-artemis's branch 
refs/heads/master from [~gaohoward]
[ https://git-wip-us.apache.org/repos/asf?p=activemq-artemis.git;h=6e89b22 ]

ARTEMIS-2174 Broker reconnect cause OOM with scale down

When a node tries to reconnect to another node in a scale down cluster,
the reconnect request gets denied by the other node and keeps retrying,
which causes tasks in the ordered executor to accumulate and eventually
leads to an OOM.

The fix is to change ActiveMQPacketHandler#handleCheckForFailover
to allow the reconnect if the scale down node is the node itself.







[jira] [Commented] (ARTEMIS-2174) Broker reconnect to another with scale down policy cause OOM

2018-11-14 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/ARTEMIS-2174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16686686#comment-16686686
 ] 

ASF GitHub Bot commented on ARTEMIS-2174:
-

GitHub user asfgit closed the pull request at:

https://github.com/apache/activemq-artemis/pull/2430







[jira] [Commented] (ARTEMIS-2174) Broker reconnect to another with scale down policy cause OOM

2018-11-14 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/ARTEMIS-2174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16686389#comment-16686389
 ] 

ASF GitHub Bot commented on ARTEMIS-2174:
-

GitHub user gaohoward opened a pull request:

https://github.com/apache/activemq-artemis/pull/2430

ARTEMIS-2174 Broker reconnect cause OOM with scale down

When a node tries to reconnect to another node in a scale down cluster,
the reconnect request gets denied by the other node and keeps retrying,
which causes tasks in the ordered executor to accumulate and eventually
leads to an OOM.

The fix is to change ActiveMQPacketHandler#handleCheckForFailover
to allow the reconnect if the scale down node is the node itself.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gaohoward/activemq-artemis e_2174

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/activemq-artemis/pull/2430.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2430


commit 2a108ac8817cb6c2ca3b29092108a3e35e7f5690
Author: Howard Gao 
Date:   2018-11-14T11:21:48Z

ARTEMIS-2174 Broker reconnect cause OOM with scale down

When a node tries to reconnect to another node in a scale down cluster,
the reconnect request gets denied by the other node and keeps retrying,
which causes tasks in the ordered executor to accumulate and eventually
leads to an OOM.

The fix is to change ActiveMQPacketHandler#handleCheckForFailover
to allow the reconnect if the scale down node is the node itself.






