[
https://issues.apache.org/activemq/browse/AMQ-2774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Eric updated AMQ-2774:
----------------------
Attachment: SocketTstFactory.java
Hi Gary
Attached is a new version of my own
org.apache.activemq.transport.tcp.SocketTstFactory.java, which is far more
deterministic (no randomness).
In this version, my "bagot thread" forcibly closes the duplex network
connection after 0 ms, then 1 ms, then 2 ms, then ...., then 11 ms, then
31 ms, then 51 ms, then 71 ms, and so on.
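
For readers without the attachment, the close schedule described above could be sketched roughly as follows. This is not the code of the attached SocketTstFactory.java: the class name CloseDelaySchedule and both method names are hypothetical, and the step sizes (1 ms up to 11 ms, then 20 ms) are inferred from the delays listed in the comment.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper illustrating the deterministic close schedule. */
final class CloseDelaySchedule {

    /** Returns the first {@code count} delays in ms: 0, 1, ..., 11, 31, 51, 71, ... */
    static List<Long> delays(int count) {
        List<Long> result = new ArrayList<>();
        long delay = 0;
        for (int i = 0; i < count; i++) {
            result.add(delay);
            // Assumption: 1 ms steps up to 11 ms, then 20 ms steps (11 -> 31 -> 51 ...).
            delay += (delay < 11) ? 1 : 20;
        }
        return result;
    }

    /** Schedules a forced close of the resource (e.g. a Socket) after delayMillis. */
    static void closeAfter(ScheduledExecutorService scheduler,
                           Closeable resource, long delayMillis) {
        scheduler.schedule(() -> {
            try {
                resource.close();
            } catch (IOException ignored) {
                // The connection may already be closed; nothing more to do.
            }
        }, delayMillis, TimeUnit.MILLISECONDS);
    }
}
```

A test would walk through delays(n) in order, calling closeAfter right after each connect(), so that every run breaks the connection at the same offsets.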
With my patch applied to 5.3.2, the JUnit test succeeds on my test server
(always, I think). With the unpatched 5.3.2 version, the JUnit test fails
(every time, I think).
On my test server (a good one with 2 CPUs), the network connector freezes when
the connection is closed between 1 and 3 ms after the connect() call. If the
connection is closed immediately, or more than 3 ms after connect(), the
network connector keeps working.
I can't test on 5.4 trunk because the SVN client on my test server has no
HTTPS support.
I hope this helps you test on 5.4 and validate my patch.
Eric-AWL
> Network of brokers : Multicast discovery stopped to work
> --------------------------------------------------------
>
> Key: AMQ-2774
> URL: https://issues.apache.org/activemq/browse/AMQ-2774
> Project: ActiveMQ
> Issue Type: Bug
> Affects Versions: 5.2.0
> Environment: Linux
> Reporter: Eric
> Assignee: Gary Tully
> Fix For: 5.4.1
>
> Attachments: AMQ2774.tar, JMAC-BEA-lastlog.log-20100315,
> SocketTstFactory.java
>
>
> Hi everybody
> I am experiencing a serious problem with the multicast discovery algorithm
> in a network of brokers topology.
> Under some conditions, a broker can't reestablish a remote connection even
> if the remote broker is restarted.
> I have log traces that should help identify the origin of the problem.
> When there is no discovery/connection error, I can see these 2 lines in the
> activemq log file:
> #08 Jun 2010 14:31:30,639 INFO [Multicast Discovery Agent Notifier]
> org.apache.activemq.network.DiscoveryNetworkConnector
> Establishing network connection between from vm://ACCLU-tpnocp04v to
> tcp://tpnocp09v-bus:13100?useLocalHost=false
> #08 Jun 2010 14:31:30,692 INFO [StartLocalBridge:
> localBroker=vm://ACCLU-tpnocp04v#26]
> org.apache.activemq.network.DemandForwardingBridge
> Network connection between vm://ACCLU-tpnocp04v#26 and
> tcp://tpnocp09v-bus/10.18.126.28:13100(MOM-tpnocp09v) has been established.
> When the connection is broken, I can see this line in the log.
> #11 Jun 2010 12:37:32,585 INFO [Multicast Discovery Agent Notifier]
> org.apache.activemq.network.DemandForwardingBridge
> ACCLU-tpnocp04v bridge to MOM-tpnocp09v stopped
> Then the current ACCLU-tpnocp04v broker tries to reestablish the connection :
> #11 Jun 2010 12:37:34,475 INFO [Multicast Discovery Agent Notifier]
> org.apache.activemq.network.DiscoveryNetworkConnector
> Establishing network connection between from vm://ACCLU-tpnocp04v to
> tcp://tpnocp09v-bus:13100?useLocalHost=false
> But here, the second log line ("has been established") doesn't appear in
> the log file, so I don't know whether the connection is up or not.
> Then the connection is broken again (note "Unknown" instead of
> "MOM-tpnocp09v"):
> #11 Jun 2010 13:33:58,655 WARN [ActiveMQ Transport:
> tcp://tpnocp09v-bus/10.18.126.28:13100]
> org.apache.activemq.network.DemandForwardingBridge
> Network connection between vm://ACCLU-tpnocp04v#58 and
> tcp://tpnocp09v-bus/10.18.126.28:13100 shutdown due to a remote error:
> java.net.SocketException: Connection reset
> #11 Jun 2010 13:33:58,657 INFO [NetworkBridge]
> org.apache.activemq.network.DemandForwardingBridge
> ACCLU-tpnocp04v bridge to Unknown stopped
> And now, even if I restart the remote broker (MOM-tpnocp09v), no line
> ("Establishing" / "has been established") appears, and no network connection
> is reestablished between ACCLU-tpnocp04v and MOM-tpnocp09v. It seems that
> the ACCLU-tpnocp04v broker can no longer establish a connection with the
> MOM-tpnocp09v broker.
> The production teams tell me that this problem does not seem to be resolved
> in the fuse-5.3.0.6 version.
> Eric-AWL