[jira] [Commented] (AMQ-4720) Messages lost after fail-back of a network connector using priorityBackup=true - reason is that remote broker isn't checking producerID & is rejecting because of duplicate producerSequence

Gary Tully (JIRA) Tue, 17 Sep 2013 05:25:59 -0700

    [ https://issues.apache.org/jira/browse/AMQ-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769471#comment-13769471 ]


Gary Tully commented on AMQ-4720:
---------------------------------

This is expected: network connectors cannot deal with a failover reconnect, and 
a connection to a priority backup is a reconnect.
masterslave: is designed to use failover: only to choose a URL from the list, so 
if the list is priority ordered you should get what you need. 
The issue will be falling back when the priority URL comes alive again. That 
may need some tweaking of the failover URL directly and may not be possible at 
the moment.

                
> Messages lost after fail-back of a network connector using 
> priorityBackup=true - reason is that remote broker isn't checking producerID 
> & is rejecting because of duplicate producerSequence
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: AMQ-4720
>                 URL: https://issues.apache.org/jira/browse/AMQ-4720
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker, Connector
>    Affects Versions: 5.8.0
>         Environment: Only tested on Windows 7 Enterprise x64.
>            Reporter: Andrew May
>              Labels: failback, failover, networkBridge, networkConnector
>         Attachments: amq1.log, amq1.xml, amq2.log, amq2.xml, amq3.log, 
> amq3.xml
>
>
> Summary of problem:
> -------------------
> If a static failover network connector is set up to connect to 2 other brokers 
> & to fail back to a priority broker, messages can be lost after fail-back 
> because the remote broker deletes them due to duplicate producer-sequence 
> numbers, even though the producer-id has changed.
> My suspicion is that the remote broker doesn't recognise that the 
> re-established connection is a network connection & so doesn't check 
> producer-id.
> Test-harness setup:
> -------------------
> Using ActiveMQ 5.8.0 binary download.
> Only changes are to logging settings & to the configuration file.
> 3 brokers ("amq1", "amq2", "amq3"), all brokers running on localhost.
> Each uses its own config file (amq1.xml, amq2.xml, amq3.xml).
> Broker amq1 has a failover duplex connection to amq2.
> Broker amq3 has a duplex failover connection to both amq1 + amq2; it is 
> configured to always try to connect to amq1 first ("randomize=false") and to 
> fail back to amq1 if it comes back online ("priorityBackup=true").
> Consumer connects to broker amq2
> Producer connects to broker amq3
> Test-harness sender application creates a new session each time it is run & 
> sends a set of messages.
> The sending session is not transacted & is set to auto-acknowledge.
> Messages are sent with persistent delivery mode.
> Messages are on queue "MyQueue"
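
For readers without the attachments, the amq3 connector described above would use a network connector URI roughly like the one built below. This is only a sketch: the helper name failover_network_uri and the localhost ports are assumptions, and the attached amq3.xml remains the authoritative configuration. The randomize and priorityBackup options are the ones quoted in the report.

```python
def failover_network_uri(broker_urls, **options):
    """Build a static:(failover:(...)) network connector URI.

    broker_urls: transport URLs in priority order (first one is preferred).
    options: failover transport options, e.g. randomize, priorityBackup.
    """
    inner = "failover:(" + ",".join(broker_urls) + ")"
    if options:
        # Render Python booleans as the lowercase true/false used in the URI.
        pairs = []
        for name, value in options.items():
            text = str(value).lower() if isinstance(value, bool) else str(value)
            pairs.append(f"{name}={text}")
        inner += "?" + "&".join(pairs)
    return f"static:({inner})"


# Hypothetical ports: amq1 on 61616, amq2 on 61617, both on localhost.
uri = failover_network_uri(
    ["tcp://localhost:61616", "tcp://localhost:61617"],
    randomize=False,
    priorityBackup=True,
)
print(uri)
```

When a URI like this is placed in an XML attribute, the "&" between options has to be written as "&amp;".
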
> Test script:
> ------------
> Start all 3 brokers.
> Broker amq3 establishes a connection to amq1.
> Broker amq1 establishes a connection to amq2.
> Consumer connects to amq2 & starts consuming queue "MyQueue".
> Producer connects to amq3 & sends 10 messages on queue "MyQueue" - these are 
> all passed on to broker amq1 which forwards them to amq2 where they are 
> delivered to the consumer.
> Producer connects to amq3 & sends 10 messages on queue "MyQueue" - these are 
> all delivered as before - N.B. producerID is different as this is a new 
> connection.
> Broker amq1 is shut down.
> Broker amq3 fails over to connect to amq2.
> Producer connects to amq3 & sends 10 messages on queue "MyQueue" - these are 
> all passed directly to amq2 where they are delivered to the consumer - (as 
> before, the producer-id has changed).
> Producer connects to amq3 & sends 10 messages on queue "MyQueue" - these are 
> delivered as before.
> Broker amq1 is restarted 
> Broker amq1 re-establishes its connection to amq2.
> Broker amq3 notices that amq1 is available & fails back to it.
>     - Broker amq3 closes its connection to amq2
>     - Broker amq3 starts a new connection to amq1
> Producer connects to amq3 & sends 10 messages on queue "MyQueue" - these are 
> all passed on to broker amq1 which forwards them to amq2 where they are 
> delivered to the consumer - (as before, the producer-id has changed).
>     - N.B. Immediately before the first message is received & forwarded by 
> amq1, amq1's log shows:
>         2013-09-11 12:05:56,639 | DEBUG | last stored sequence id set: -1 | org.apache.activemq.broker.ProducerBrokerExchange | ActiveMQ Transport: tcp:///172.16.7.85:56880@61616
>       ---> This message only appears after fail-back; it doesn't appear 
> earlier, which suggests that the network connection is being treated 
> differently after fail-back.
> **********************
> ** Error occurs now **
> **********************
> ** Producer connects to amq3 & sends 20 messages on queue "MyQueue" (with a 
> different producer-ID)
>     - The first 10 are deleted by broker amq1 because it thinks that they 
> have a duplicate sequence ID.
>     - amq1 log shows:
>         2013-09-11 12:06:29,201 | DEBUG | suppressing duplicate message send [ID:bd7ewandymay-56895-1378897588954-0:1:1:1:1] with producerSequenceId [1] less than last stored: 10 | org.apache.activemq.broker.ProducerBrokerExchange | ActiveMQ Transport: tcp:///172.16.7.85:56880@61616
>         2013-09-11 12:06:29,223 | DEBUG | suppressing duplicate message send [ID:bd7ewandymay-56895-1378897588954-0:1:1:1:2] with producerSequenceId [2] less than last stored: 10 | org.apache.activemq.broker.ProducerBrokerExchange | ActiveMQ Transport: tcp:///172.16.7.85:56880@61616
>         ... snip ...
>         2013-09-11 12:06:29,396 | DEBUG | suppressing duplicate message send [ID:bd7ewandymay-56895-1378897588954-0:1:1:1:10] with producerSequenceId [10] less than last stored: 10 | org.apache.activemq.broker.ProducerBrokerExchange | ActiveMQ Transport: tcp:///172.16.7.85:56880@61616
>     - The last 10 are successfully forwarded to amq2, where they are consumed.
> ** Producer connects to amq3 & sends 30 messages on queue "MyQueue" (with a 
> different producer-ID)
>     - The first 20 are deleted by broker amq1 because it thinks that they 
> have a duplicate sequence ID.
>     - amq1 log shows:
>         2013-09-11 12:06:45,668 | DEBUG | suppressing duplicate message send [ID:bd7ewandymay-56899-1378897605440-0:1:1:1:1] with producerSequenceId [1] less than last stored: 20 | org.apache.activemq.broker.ProducerBrokerExchange | ActiveMQ Transport: tcp:///172.16.7.85:56880@61616
>         2013-09-11 12:06:45,682 | DEBUG | suppressing duplicate message send [ID:bd7ewandymay-56899-1378897605440-0:1:1:1:2] with producerSequenceId [2] less than last stored: 20 | org.apache.activemq.broker.ProducerBrokerExchange | ActiveMQ Transport: tcp:///172.16.7.85:56880@61616
>         ... snip ...
>         2013-09-11 12:06:45,959 | DEBUG | suppressing duplicate message send [ID:bd7ewandymay-56899-1378897605440-0:1:1:1:20] with producerSequenceId [20] less than last stored: 20 | org.apache.activemq.broker.ProducerBrokerExchange | ActiveMQ Transport: tcp:///172.16.7.85:56880@61616
>     - The last 10 are successfully forwarded to amq2, where they are consumed.
> It looks to me as if amq1 doesn't realise that the fail-back network 
> connection established by amq3 is a network connection & so isn't checking 
> producer IDs.
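
The suppression counts reported above (0, then 10, then 20) are what one would expect if the receiving broker kept a single "last stored" sequence id for the whole bridge connection instead of one per producer. The toy model below replays the three post-fail-back batches under both assumptions. It is not ActiveMQ's ProducerBrokerExchange code; the function replay and its parameters are hypothetical, and the "sequence id at or below last stored is suppressed" rule is inferred from the log lines quoted in the report.

```python
def replay(batch_sizes, per_producer):
    """Replay batches of messages, each sent by a brand-new producer.

    Every producer numbers its messages 1..size. A message is suppressed
    when its sequence id is at or below the last stored id for its key.
    The key is the producer when per_producer is True, otherwise a single
    key shared by everything arriving on the connection.
    Returns a list of (forwarded, suppressed) counts, one per batch.
    """
    last_stored = {}
    results = []
    for producer_index, size in enumerate(batch_sizes):
        key = producer_index if per_producer else "connection"
        forwarded = suppressed = 0
        for seq in range(1, size + 1):
            # -1 mirrors the "last stored sequence id set: -1" log line.
            if seq <= last_stored.get(key, -1):
                suppressed += 1
            else:
                last_stored[key] = seq
                forwarded += 1
        results.append((forwarded, suppressed))
    return results


# Batches sent after fail-back in the report: 10, 20 and 30 messages.
print(replay([10, 20, 30], per_producer=False))  # [(10, 0), (10, 10), (10, 20)]
print(replay([10, 20, 30], per_producer=True))   # [(10, 0), (20, 0), (30, 0)]
```

The shared-key run reproduces the observed loss of 10 and then 20 messages, while the per-producer run loses none.
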
> Details of why I'm trying this configuration:
> ---------------------------------------------
> Use case:
> ---------
> 1 central site.
> Multiple branches, each with a single branch server and multiple user PCs.
> Each branch only has 1 internet connection that is shared by branch server & 
> PCs.
> Branch server is typically unreliable hardware & may go offline without 
> notice.
> Resilience to network loss is important & so each PC & server has its own 
> broker.
> Both branch server & PCs need to be able to communicate with the centre
> To reduce the number of connections into the centre, we would like a tree 
> topology with the branch server concentrating all branch PC messages & 
> forwarding them to the centre.
> But, PCs generate a data feed that we want to be able to access at centre, 
> even when the branch server is offline.
> Proposed configuration:
> -----------------------
> Use a failover network connection on branch PCs & configure the connection to 
> prioritise a connection to the branch server, but open a direct connection to 
> the centre if the branch server is unavailable.
