On Wed, Jan 19, 2011 at 6:17 AM, Alan Conway <[email protected]> wrote:
> On 01/18/2011 08:04 PM, Mark Moseley wrote:
>>
>> On Tue, Jan 18, 2011 at 12:53 PM, Alan Conway<[email protected]> wrote:
>>>
>>> On 01/10/2011 09:12 AM, Alan Conway wrote:
>>>>
>>>> On 01/07/2011 07:55 PM, Mark Moseley wrote:
>>>>>
>>>>> On Thu, Jan 6, 2011 at 12:47 PM, Alan Conway<[email protected]>
>>>>> wrote:
>>>>>>
>>>>>> On 12/29/2010 02:11 PM, Mark Moseley wrote:
>>>>>>>
>>>>>>> This might be the same as
>>>>>>> https://issues.apache.org/jira/browse/QPID-2982 but in case it's not,
>>>>>>> I'm dropping this email. If I connect to qpid-tool on member A of a
>>>>>>> cluster and do just about anything, e.g. list binding, list exchange,
>>>>>>> etc, the other node, B, blows up. In the logs below, exp01==A and
>>>>>>> exp02==B.
>>>>>>> [snip]
>>>>>
>>>>> I've commented on that JIRA. I hope my info is useful. It's getting
>>>>> kind of convoluted :)
>>>>
>>>> Thanks, I'll try it out and see if I can reproduce it. It will be very
>>>> helpful if I can.
>>>>
>>>
>>> I believe I've fixed https://issues.apache.org/jira/browse/QPID-2982 on
>>> trunk r1060568. Can you give it a spin and let me know how it goes?
>>
>> Just started testing a little while ago but so far I haven't seen a
>> single crash yet using the same steps I posted in the JIRA, so it
>> looks pretty good so far. I'll post again if I see any crashes.
>>
>
> That's good. Can you also re-test 2992 and 2993? I think they may also be
> fixed by this patch.

No dice on 2992 and 2993. They both still have the same issue, and 2993
can still kill off a cluster node. In the 2993 case, if I've restarted
B1/B2 and the federated route is gone when they come back up, adding it
back on B1 fairly regularly kills B1 with this:

2011-01-19 13:58:36 debug cluster(201.0.0.0:7701 READY) replicated connection HOSTA1:5672(202.0.0.0:18335-1 shadow)
2011-01-19 13:58:38 debug Exception constructed: Channel 1 is not attached (qpid/amqp_0_10/SessionHandler.cpp:39)
2011-01-19 13:58:38 error Channel exception: not-attached: Channel 1 is not attached (qpid/amqp_0_10/SessionHandler.cpp:39)
2011-01-19 13:58:38 debug cluster(201.0.0.0:7701 READY/error) channel error 710 on HOSTA1:5672(202.0.0.0:18335-1 shadow) must be resolved with: 201.0.0.0:7701 202.0.0.0:18335 : not-attached: Channel 1 is not attached (qpid/amqp_0_10/SessionHandler.cpp:39)
2011-01-19 13:58:38 debug cluster(201.0.0.0:7701 READY/error) error 710 resolved with 201.0.0.0:7701
2011-01-19 13:58:38 debug cluster(201.0.0.0:7701 READY/error) error 710 must be resolved with 202.0.0.0:18335
2011-01-19 13:58:38 critical cluster(201.0.0.0:7701 READY/error) local error 710 did not occur on member 202.0.0.0:18335: not-attached: Channel 1 is not attached (qpid/amqp_0_10/SessionHandler.cpp:39)
2011-01-19 13:58:38 debug Exception constructed: local error did not occur on all cluster members : not-attached: Channel 1 is not attached (qpid/amqp_0_10/SessionHandler.cpp:39) (qpid/cluster/ErrorCheck.cpp:89)
2011-01-19 13:58:38 critical Error delivering frames: local error did not occur on all cluster members : not-attached: Channel 1 is not attached (qpid/amqp_0_10/SessionHandler.cpp:39) (qpid/cluster/ErrorCheck.cpp:89)
2011-01-19 13:58:38 notice cluster(201.0.0.0:7701 LEFT/error) leaving cluster bosclust
2011-01-19 13:58:38 debug SEND raiseEvent (v1) class=org.apache.qpid.broker.clientDisconnect
2011-01-19 13:58:38 debug DISCONNECTED [10.1.58.3:41680]
2011-01-19 13:58:38 debug SEND raiseEvent (v1) class=org.apache.qpid.broker.clientDisconnect
2011-01-19 13:58:38 debug Shutting down CPG
2011-01-19 13:58:38 notice Shut down
2011-01-19 13:58:38 debug Journal "bosmyq1": Destroyed
2011-01-19 13:58:38 debug Journal "TplStore": Destroyed

For 2992, the route doesn't reappear, but I haven't seen it kill a
cluster node yet; that only happens in the 2993 case.

---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:[email protected]
