[ https://issues.apache.org/jira/browse/ARTEMIS-2854?focusedWorklogId=485165&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-485165 ]
ASF GitHub Bot logged work on ARTEMIS-2854:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 16/Sep/20 14:08
Start Date: 16/Sep/20 14:08
Worklog Time Spent: 10m
Work Description: asfgit closed pull request #3228:
URL: https://github.com/apache/activemq-artemis/pull/3228
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 485165)
Time Spent: 1.5h (was: 1h 20m)
> Non-durable subscribers may stop receiving after failover
> ---------------------------------------------------------
>
> Key: ARTEMIS-2854
> URL: https://issues.apache.org/jira/browse/ARTEMIS-2854
> Project: ActiveMQ Artemis
> Issue Type: Bug
> Components: Broker
> Affects Versions: 2.14.0
> Reporter: Howard Gao
> Assignee: Howard Gao
> Priority: Major
> Fix For: 2.16.0
>
> Time Spent: 1.5h
> Remaining Estimate: 0h
>
> In a cluster scenario where non-durable subscribers fail over to a backup
> while another live node is forwarding messages to them, there is a chance
> that the live node keeps the old remote binding for the subscribers, and
> messages routed to those old remote bindings will result in "binding not
> found".
> For example, suppose there are 2 live-backup pairs in the cluster: Live1 and
> backup1, Live2 and backup2. A non-durable subscriber connects to Live1, and
> messages are sent to Live2 and then redistributed to the subscriber on Live1.
> Now Live1 crashes and backup1 becomes live. The subscriber fails over to
> backup1.
> In the meantime, Live2 reconnects to backup1 too. During the process Live2
> does not successfully remove the old remote binding for the subscriber, and
> that binding still points to the old temp queue's id (which is gone with
> Live1, as it's a temp queue).
> So the messages (after failover) are still routed to the old queue, which is
> no longer there. The subscriber will be idle without receiving new messages
> from it.
> The code concerned is this:
> https://github.com/apache/activemq-artemis/blob/master/artemis-server/src/main/java/org/apache/activemq/artemis/core/server/cluster/impl/ClusterConnectionImpl.java#L1239
> The code doesn't handle the case where the old remote binding is still in
> the map and its key (clusterName) is the same as that of the new remote
> binding (which references a new temp queue) recreated on failover.
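
The key collision described above can be illustrated with a small, self-contained
Java sketch. This is not the ClusterConnectionImpl code: the class
RemoteBindingSketch, its methods, and the "replace when the queue id differs"
remedy are all hypothetical and only model a map keyed by clusterName. It shows
how keeping an existing entry leaves routing pointed at the old temp queue id,
and how replacing a stale entry would avoid that under these assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a binding table keyed by cluster name.
// None of these names come from the Artemis code base.
public class RemoteBindingSketch {

    // Stand-in for a remote binding: its key plus the id of the remote temp queue.
    public static final class RemoteBinding {
        public final String clusterName;
        public final long remoteQueueId;

        public RemoteBinding(String clusterName, long remoteQueueId) {
            this.clusterName = clusterName;
            this.remoteQueueId = remoteQueueId;
        }
    }

    private final Map<String, RemoteBinding> bindings = new HashMap<>();

    // Models the reported behaviour: an entry already stored under the
    // same clusterName is kept, so the new binding is silently ignored.
    public void addKeepingExisting(RemoteBinding incoming) {
        bindings.putIfAbsent(incoming.clusterName, incoming);
    }

    // One possible remedy (an assumption, not the actual fix): if the stored
    // entry points at a different remote queue id, treat it as stale and replace it.
    public void addReplacingStale(RemoteBinding incoming) {
        RemoteBinding existing = bindings.get(incoming.clusterName);
        if (existing == null || existing.remoteQueueId != incoming.remoteQueueId) {
            bindings.put(incoming.clusterName, incoming);
        }
    }

    // Returns the queue id that messages for this key would be routed to.
    public long routeTarget(String clusterName) {
        return bindings.get(clusterName).remoteQueueId;
    }

    public static void main(String[] args) {
        // Queue id 100 lived on Live1; id 200 is the temp queue recreated on backup1.
        RemoteBindingSketch stale = new RemoteBindingSketch();
        stale.addKeepingExisting(new RemoteBinding("sub-1", 100L));
        stale.addKeepingExisting(new RemoteBinding("sub-1", 200L));
        System.out.println("keep existing -> routes to " + stale.routeTarget("sub-1"));

        RemoteBindingSketch fresh = new RemoteBindingSketch();
        fresh.addReplacingStale(new RemoteBinding("sub-1", 100L));
        fresh.addReplacingStale(new RemoteBinding("sub-1", 200L));
        System.out.println("replace stale -> routes to " + fresh.routeTarget("sub-1"));
    }
}
```

In the first table the second add is a no-op because the key already exists, which
mirrors the situation where Live2 keeps routing to a queue that went away with
Live1. The second table compares queue ids before deciding, so the recreated
binding wins.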
--
This message was sent by Atlassian Jira
(v8.3.4#803005)