Hello Ted,

I confirm all my tests are GREEN at the head of the 0.6.x branch.

For reference:
Qpid Java Broker: 6.0.4
Qpid Proton: 0.12.2
Compiler: gcc 4.9.1
OS: Linux Red Hat

Regards,
Adel

________________________________
From: Adel Boutros <[email protected]>
Sent: Friday, September 30, 2016 3:07:56 PM
To: [email protected]
Subject: Re: Testing failover on dispatcher/java-broker cluster

Great! I have synched your changes and will run my tests. I will get back to you with the results as soon as possible.

Regards,
Adel

________________________________
From: Ted Ross <[email protected]>
Sent: Friday, September 30, 2016 2:39:51 PM
To: [email protected]
Subject: Re: Testing failover on dispatcher/java-broker cluster

Done. I've pushed the four cherry-picked commits to the 0.6.x branch if you'd like to give it a go.

-Ted

On 09/30/2016 05:47 AM, Adel Boutros wrote:
> Hello Ted,
>
> Following discussions here
> (http://qpid.2158936.n2.nabble.com/Dispatch-router-0-6-1-Configuration-bugs-td7651334.html),
> can DISPATCH-500 be included in the minor release?
>
> PS: It still hasn't solved my issue below, but I will continue the analysis on
> the other thread.
>
> Regards,
>
> Adel
>
> ________________________________
> From: Adel Boutros <[email protected]>
> Sent: Thursday, September 29, 2016 5:01:45 PM
> To: [email protected]
> Subject: Re: Testing failover on dispatcher/java-broker cluster
>
> I would expect what you have described however it doesn't seem to be the case.
>
> delete/recreate mobile address:
>
> qdmanage -b amqp://localhost:10501 delete --type=address --name haProxy.queue.addr
> qdmanage -b amqp://localhost:10501 create --type=address prefix=haProxy.queue waypoint=true name=haProxy.queue.addr
>
> The stats remain at a positive value (10 10). If I restart the dispatchers
> without the inter-router connection, I don't have the issue.
>
> Router Addresses
>   class   addr           phs  distrib   in-proc  local  remote  cntnr  in  out  thru  to-proc  from-proc
>   ======================================================================================================
>   mobile  haProxy.queue  1    balanced  0        0      0       0      0   0    0     0        0
>   mobile  haProxy.queue  0    balanced  0        1      0       0      10  10   0     0        0
>
> Adel
>
> ________________________________
> From: Ted Ross <[email protected]>
> Sent: Thursday, September 29, 2016 4:55 PM
> To: [email protected]
> Subject: Re: Testing failover on dispatcher/java-broker cluster
>
> On 09/29/2016 10:47 AM, Adel Boutros wrote:
>> They seem fair enough and quite related.
>>
>> As a side note, I have a bug with the dispatch router 0.6.1 but I haven't
>> submitted it yet because I haven't reduced the test case yet.
>>
>> In summary: I connect 2 dispatchers (inter-router) and then delete the
>> connector/listener of "inter-router". If I then delete and recreate a mobile
>> address which has received a message on one of the dispatchers, the "in"
>> and "out" stats do not reset to 0 when doing "qdstat -a" but remain
>> at the old values. However, they reset correctly on the other router.
>
> What exactly do you mean by "delete and recreate a mobile address"?
>
> If an address is removed from the table, the next time it appears, a new
> record will be created for that address. The new record will have
> zeroed statistics. What behavior are you expecting?
>
>>
>> Have you encountered something similar? Once I have a reduced test case, I
>> will post it in a different thread of course.
>>
>> Regards,
>>
>> Adel
>>
>> ________________________________
>> From: Ted Ross <[email protected]>
>> Sent: Thursday, September 29, 2016 4:38:26 PM
>> To: [email protected]
>> Subject: Re: Testing failover on dispatcher/java-broker cluster
>>
>> Sorry, those Jira numbers and descriptions are mismatched. Here's the
>> correct list:
>>
>> - DISPATCH-496 - Activation of an autolink does not result in issuing
>>   credit to a blocked sender
>> - DISPATCH-505 - Eventual loss of credit on inter-router control
>>   links when the topology changes
>> - DISPATCH-523 - Topology changes can cause in-flight deliveries to
>>   be stuck in the ingress router
>>
>> On 09/29/2016 10:35 AM, Ted Ross wrote:
>>>
>>> On 09/24/2016 05:32 AM, Adel Boutros wrote:
>>>> We are indeed in favor of a minor release as long as the latest
>>>> version is still 0.6.x and we are willing to re-launch our tests and
>>>> give feedback on the release candidate once provided (It shouldn't
>>>> take us more than a day to compile and test).
>>>> Do you have a list of fixes in mind?
>>>
>>> I've identified three fixes that look like good candidates for 0.6.2:
>>>
>>> - DISPATCH-496 - Topology changes can cause in-flight deliveries to
>>>   be stuck in the ingress router
>>> - DISPATCH-505 - Eventual loss of credit on inter-router control
>>>   links when the topology changes
>>> - DISPATCH-523 - Activation of an autolink does not result in issuing
>>>   credit to a blocked sender
>>>
>>> These are all stability-related issues.
>>>
>>> Thoughts?
>>>
>>> -Ted
>>>
>>>> Regards, Adel
>>>>
>>>>> Subject: Re: Testing failover on dispatcher/java-broker cluster
>>>>> To: [email protected]
>>>>> From: [email protected]
>>>>> Date: Fri, 23 Sep 2016 17:23:57 -0400
>>>>>
>>>>> Hi Adel,
>>>>>
>>>>> A minor release is always possible. It's up to us, the community, to
>>>>> decide whether and when to produce one. I'm in favor of releasing an
>>>>> 0.6.2 with some small backports to fix bugs for users that want to stay
>>>>> on Proton 0.12.
>>>>>
>>>>> -Ted
>>>>>
>>>>> On 09/23/2016 09:44 AM, Adel Boutros wrote:
>>>>>> Hello Ted,
>>>>>> Did you happen to have the time to check if a minor release is
>>>>>> possible?
>>>>>> Regards, Adel
>>>>>>
>>>>>>> From: [email protected]
>>>>>>> To: [email protected]
>>>>>>> Subject: RE: Testing failover on dispatcher/java-broker cluster
>>>>>>> Date: Tue, 20 Sep 2016 15:13:03 +0200
>>>>>>>
>>>>>>> Hello Ted,
>>>>>>>
>>>>>>> I confirm the fix solved the issue.
>>>>>>>
>>>>>>> Would it be possible to do a 0.6.2 release? We cannot compile newer
>>>>>>> versions of Proton (We currently use 0.12.2) due to lack of
>>>>>>> resources from our side and we really need this fix for our tests.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Adel
>>>>>>>
>>>>>>>> Subject: Re: Testing failover on dispatcher/java-broker cluster
>>>>>>>> To: [email protected]
>>>>>>>> From: [email protected]
>>>>>>>> Date: Mon, 19 Sep 2016 12:18:23 -0400
>>>>>>>>
>>>>>>>> Hi Adel,
>>>>>>>>
>>>>>>>> It's a one-liner and it applies cleanly to the 0.6.x branch.
>>>>>>>>
>>>>>>>> https://git-wip-us.apache.org/repos/asf?p=qpid-dispatch.git;h=41b7407
>>>>>>>>
>>>>>>>> -Ted
>>>>>>>>
>>>>>>>> On 09/19/2016 11:41 AM, Adel Boutros wrote:
>>>>>>>>> Hello Ted,
>>>>>>>>>
>>>>>>>>> Antoine is on vacation so I will be taking over this task.
>>>>>>>>>
>>>>>>>>> Does this fix have any dependencies? We would like to apply it on
>>>>>>>>> 0.6.1 without other fixes because it seems the master branch
>>>>>>>>> requires proton 0.13.0 minimum whereas we have currently 0.12.2
>>>>>>>>> and we cannot upgrade at the time being.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Adel
>>>>>>>>>
>>>>>>>>>> Subject: Re: Testing failover on dispatcher/java-broker cluster
>>>>>>>>>> To: [email protected]
>>>>>>>>>> From: [email protected]
>>>>>>>>>> Date: Fri, 16 Sep 2016 16:53:05 -0400
>>>>>>>>>>
>>>>>>>>>> Antoine,
>>>>>>>>>>
>>>>>>>>>> I think I know what that problem is. I believe you've stumbled upon
>>>>>>>>>> this issue:
>>>>>>>>>>
>>>>>>>>>> https://issues.apache.org/jira/browse/DISPATCH-496
>>>>>>>>>>
>>>>>>>>>> Your second delivery, the one resulting in a timeout, is causing the
>>>>>>>>>> inbound link to be blocked (i.e. it has undelivered messages). When the
>>>>>>>>>> broker reattaches, the blocked links are supposed to become unblocked
>>>>>>>>>> but they don't in the case of auto-links.
>>>>>>>>>>
>>>>>>>>>> This has been fixed on the master branch if you'd like to try applying
>>>>>>>>>> the patch.
>>>>>>>>>>
>>>>>>>>>> -Ted
>>>>>>>>>>
>>>>>>>>>> On 09/15/2016 04:56 AM, Antoine Chevin wrote:
>>>>>>>>>>> Hi Ted,
>>>>>>>>>>>
>>>>>>>>>>> You’re right, the connection close looked strange before stopping of the
>>>>>>>>>>> broker. I manually added the annotation (# stopping the broker) and was
>>>>>>>>>>> wrong about the position of this one. I replayed the test and the
>>>>>>>>>>> connection close happens *after* the broker stop. I assume it is the broker
>>>>>>>>>>> that initiates it.
>>>>>>>>>>>
>>>>>>>>>>> I found something interesting. In my test, I always sent a message when the
>>>>>>>>>>> broker is down, expecting to get a JmsSendTimedOutException (waiting for
>>>>>>>>>>> the disposition frame). I assumed this was harmless, but it turns out it
>>>>>>>>>>> is not. When I don’t do that, I can send a message after the broker
>>>>>>>>>>> restart. So to sum up the experiment I did:
>>>>>>>>>>>
>>>>>>>>>>> *I use Wireshark between the JMS client and the dispatcher.*
>>>>>>>>>>>
>>>>>>>>>>> 1) Using JMS I establish a connection to the dispatcher and create a
>>>>>>>>>>>    message producer (Wireshark: connection open -> attach)
>>>>>>>>>>> 2) I’m able to send a message to the broker through the dispatcher
>>>>>>>>>>>    (Wireshark: transfer -> disposition)
>>>>>>>>>>> 3) I stop the broker
>>>>>>>>>>> 4) With the same link, I send a message and I get a
>>>>>>>>>>>    JmsSendTimedOutException (waiting for the disposition frame)
>>>>>>>>>>>    (Wireshark: transfer)
>>>>>>>>>>> 5) I restart the broker
>>>>>>>>>>> 6) With the same link, I try to send a message and I get a
>>>>>>>>>>>    JmsSendTimedOutException for the same reason (waiting for the
>>>>>>>>>>>    disposition frame) (Wireshark: transfer)
>>>>>>>>>>>
>>>>>>>>>>> If I skip step (4), I cannot reproduce step (6) and my messages arrive
>>>>>>>>>>> (Wireshark: transfer -> disposition) to the restarted broker.
>>>>>>>>>>>
>>>>>>>>>>> I hope it makes it clearer for you. Sorry for my rookie mistakes :-).
>>>>>>>>>>>
>>>>>>>>>>> Note: My colleague and I ran a small experiment to identify if the problem
>>>>>>>>>>> comes from JMS or the AMQP protocol. He changed the code of the java broker
>>>>>>>>>>> to not send the disposition frame one time out of two.
>>>>>>>>>>>
>>>>>>>>>>> We got these results:
>>>>>>>>>>>
>>>>>>>>>>> *I use Wireshark between the JMS client and the patched broker.*
>>>>>>>>>>>
>>>>>>>>>>> 1) Using JMS I establish a connection to the patched broker and create a
>>>>>>>>>>>    message producer (Wireshark: connection open -> attach)
>>>>>>>>>>> 2) I send a message to the broker and it replies with the disposition
>>>>>>>>>>>    frame (Wireshark: transfer -> disposition)
>>>>>>>>>>> 3) I send a message to the broker which drops the disposition frame. I get
>>>>>>>>>>>    a send timeout in JMS (Wireshark: transfer)
>>>>>>>>>>> 4) I send a message to the broker and it replies with the disposition frame
>>>>>>>>>>>    (Wireshark: transfer -> disposition). It works fine.
>>>>>>>>>>>
>>>>>>>>>>> We assume that there is something going on in the dispatcher.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Antoine
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>>> To unsubscribe, e-mail: [email protected]
>>>>>>>>>> For additional commands, e-mail: [email protected]
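
For anyone who wants to script the "qdstat -a" check discussed in this thread (do the "in" and "out" counters go back to 0 after the address is deleted and recreated?), here is a minimal sketch in Python. It assumes the address table has the same columns as the output pasted above, which may differ between router versions. The function names parse_address_stats and stale_counters are made up for illustration and are not part of any Qpid tool, and the sample text is the table from this thread rather than live output.

```python
def parse_address_stats(qdstat_output):
    """Parse an address table in the layout shown in this thread.

    Assumption: one header row, then a row of '=' characters, then
    whitespace-separated data rows with no spaces inside a value.
    """
    lines = [ln.strip() for ln in qdstat_output.splitlines() if ln.strip()]
    # The header row is the line just before the '====' separator.
    sep = next(i for i, ln in enumerate(lines) if set(ln) == {"="})
    columns = lines[sep - 1].split()
    rows = []
    for ln in lines[sep + 1:]:
        values = ln.split()
        if len(values) != len(columns):
            continue  # not a data row in the assumed layout
        # Turn counters into ints; keep class/addr/distrib as strings.
        rows.append({
            col: int(val) if val.isdigit() else val
            for col, val in zip(columns, values)
        })
    return rows


def stale_counters(rows, addr):
    """Return the rows for `addr` whose 'in' or 'out' counter is non-zero."""
    return [r for r in rows if r["addr"] == addr and (r["in"] or r["out"])]


# Sample input: the table pasted earlier in this thread.
SAMPLE = """
Router Addresses
  class   addr           phs  distrib   in-proc  local  remote  cntnr  in  out  thru  to-proc  from-proc
  ======================================================================================================
  mobile  haProxy.queue  1    balanced  0        0      0       0      0   0    0     0        0
  mobile  haProxy.queue  0    balanced  0        1      0       0      10  10   0     0        0
"""

if __name__ == "__main__":
    for row in stale_counters(parse_address_stats(SAMPLE), "haProxy.queue"):
        print("phase", row["phs"], "in", row["in"], "out", row["out"])
```

In a test, the text would come from running "qdstat -a" against the router right after the delete/create pair; a non-empty result from stale_counters would then correspond to the behavior reported above.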
