[ https://issues.apache.org/jira/browse/AMQ-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Grzegorz Kluczek updated AMQ-5531:
----------------------------------
    Description:
Hello,
It seems that a duplex network connector does not reconnect properly after a network failure, which causes messages to be lost. The issue is similar to [https://issues.jboss.org/browse/MB-385] and [https://issues.apache.org/jira/browse/AMQ-1973].
Below are the steps to reproduce the issue. I could probably create a JUnit test, but I'm unsure how to do this.

h3. Setup
Let's assume that we have two hosts: hostA (192.168.0.1) and hostB (192.168.0.2).

h4. hostA
Run brokerA with the default configuration that comes with the installation package.

h4. hostB
Run brokerB either embedded or standalone. I'm using an embedded broker with the following code:
{code:title=EmbeddedBroker.java}
import org.apache.activemq.broker.BrokerService;
import org.apache.activemq.network.NetworkConnector;

BrokerService broker = new BrokerService();
// Duplex bridge from brokerB to brokerA over a failover transport
NetworkConnector connector = broker.addNetworkConnector("static:(failover:tcp://192.168.0.1:61616?wireFormat.maxInactivityDuration=18000000)");
connector.setDuplex(true);
// Give each remote consumer its own network subscription instead of a single conduit
connector.setConduitSubscriptions(false);
{code}

h4. Startup
In the brokerA log you should see the following:
{{INFO | Connector vm://brokerA started
INFO | Started responder end of duplex bridge NC@ID:hostA-34744-1421847312144-0:1
INFO | Network connection between vm://brokerA#0 and tcp:///192.168.0.2:65463@61616 (brokerB) has been established.}}
In the brokerB log you should see the following:
{{2015-01-21 14:35:12,824 INFO [triggerStartAsyncNetworkBridgeCreation: remoteBroker=unconnected, localBroker= vm://brokerB#0] org.apache.activemq.network.DemandForwardingBridgeSupport - Network connection between vm://brokerB#0 and tcp://192.168.0.1:61616?wireFormat.maxInactivityDuration=18000000 (brokerA) has been established.}}

h3. Test
On hostB, run this command to simulate a network failure:
{{iptables -I OUTPUT -d 192.168.0.1 -p tcp --dport 61616 -j REJECT --reject-with=icmp-host-unreachable}}
Wait until brokerB logs:
{{2015-01-21 14:51:52,735 WARN [ActiveMQ InactivityMonitor Worker] org.apache.activemq.transport.failover.FailoverTransport - Transport (tcp://192.168.0.1:61616) failed, reason: org.apache.activemq.transport.InactivityIOException: Channel was inactive for too (>30000) long: tcp://192.168.0.1:61616, attempting to automatically reconnect}}
and brokerA logs:
{{WARN | Network connection between vm://brokerA#0 and tcp:///192.168.0.2:65463@61616 shutdown due to a remote error: org.apache.activemq.transport.InactivityIOException: Channel was inactive for too (>30000) long: tcp://192.168.0.2:65463
INFO | Connector vm://brokerA stopped
INFO | brokerA bridge to brokerB stopped}}
On hostB, remove the rule to restore the network:
{{iptables -D OUTPUT -d 192.168.0.1 -p tcp --dport 61616 -j REJECT --reject-with=icmp-host-unreachable}}

h3. Outcome and summary
In the brokerB log you should see the following:
{{2015-01-21 14:52:03,588 INFO [ActiveMQ Task-3] org.apache.activemq.transport.failover.FailoverTransport - Successfully reconnected to tcp://192.168.0.1:61616?wireFormat.maxInactivityDuration=18000000}}
On brokerA there is no sign of the responder end of the duplex bridge being restarted. Consumers connected to brokerA can't receive messages sent to brokerB.

> Duplex NetworkConnector Reconnect
> ---------------------------------
>
>                 Key: AMQ-5531
>                 URL: https://issues.apache.org/jira/browse/AMQ-5531
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker, Connector, Transport
>    Affects Versions: 5.9.0, 5.10.0
>         Environment: Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
>                      Linux - RHEL 6.x 64-bit
>            Reporter: Grzegorz Kluczek

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)