Re: Testing failover on dispatcher/java-broker cluster
They seem fair enough and quite related.

As a side note, I have a bug with the dispatch router 0.6.1 but I haven't submitted it yet because I haven't reduced the test case. In summary: I connect 2 dispatchers (inter-router) and then delete the "inter-router" connector/listener. If I then delete and recreate a mobile address which has received a message on one of the dispatchers, the "in" and "out" stats shown by "qdstat -a" do not reset to 0 but remain at the old values. However, they reset correctly on the other router.

Have you encountered something similar? Once I have a reduced test case, I will post it in a different thread of course.

Regards,
Adel

From: Ted Ross
Sent: Thursday, September 29, 2016 4:38:26 PM
To: users@qpid.apache.org
Subject: Re: Testing failover on dispatcher/java-broker cluster

Sorry, those Jira numbers and descriptions are mismatched. Here's the correct list:

- DISPATCH-496 - Activation of an autolink does not result in issuing credit to a blocked sender
- DISPATCH-505 - Eventual loss of credit on inter-router control links when the topology changes
- DISPATCH-523 - Topology changes can cause in-flight deliveries to be stuck in the ingress router

On 09/29/2016 10:35 AM, Ted Ross wrote:
>
> On 09/24/2016 05:32 AM, Adel Boutros wrote:
>> We are indeed in favor of a minor release as long as the latest
>> version is still 0.6.x and we are willing to re-launch our tests and
>> give feedback on the release candidate once provided (It shouldn't
>> take us more than a day to compile and test).
>> Do you have a list of fixes in mind?
> > I've identified three fixes that look like good candidates for 0.6.2: > > - DISPATCH-496 - Topology changes can cause in-flight deliveries to >be stuck in the ingress router > - DISPATCH-505 - Eventual loss of credit on inter-router control >links when the topology changes > - DISPATCH-523 - Activation of an autolink does not result in issuing >credit to a blocked sender > > These are all stability-related issues. > > Thoughts? > > -Ted > >> Regards,Adel >> >>> Subject: Re: Testing failover on dispatcher/java-broker cluster >>> To: users@qpid.apache.org >>> From: tr...@redhat.com >>> Date: Fri, 23 Sep 2016 17:23:57 -0400 >>> >>> Hi Adel, >>> >>> A minor release is always possible. It's up to us, the community, to >>> decide whether and when to produce one. I'm in favor of releasing an >>> 0.6.2 with some small backports to fix bugs for users that want to stay >>> on Proton 0.12. >>> >>> -Ted >>> >>> On 09/23/2016 09:44 AM, Adel Boutros wrote: Hello Ted, Did you happen to have the time to check if a minor release is possible? Regards,Adel > From: adelbout...@live.com > To: users@qpid.apache.org > Subject: RE: Testing failover on dispatcher/java-broker cluster > Date: Tue, 20 Sep 2016 15:13:03 +0200 > > Hello Ted, > > I confirm the fix solved the issue. > > Would it be possible to do a 0.6.2 release? We cannot compile newer > versions of Proton (We currently use 0.12.2) due to lack of > resources from our side and we really need this fix for our tests. > > Regards, > Adel > >> Subject: Re: Testing failover on dispatcher/java-broker cluster >> To: users@qpid.apache.org >> From: tr...@redhat.com >> Date: Mon, 19 Sep 2016 12:18:23 -0400 >> >> Hi Adel, >> >> It's a one-liner and it applies cleanly to the 0.6.x branch. >> >> https://git-wip-us.apache.org/repos/asf?p=qpid-dispatch.git;h=41b7407 >> >> -Ted >> >> >> On 09/19/2016 11:41 AM, Adel Boutros wrote: >>> Hello Ted, >>> >>> Antoine is on vacation so I will be taking over this task. 
>>>
>>> Does this fix have any dependencies? We would like to apply it on
>>> 0.6.1 without other fixes because it seems the master branch
>>> requires proton 0.13.0 minimum whereas we have currently 0.12.2
>>> and we cannot upgrade for the time being.
>>>
>>> Regards,
>>> Adel
>>>
Subject: Re: Testing failover on dispatcher/java-broker cluster
To: users@qpid.apache.org
From: tr...@redhat.com
Date: Fri, 16 Sep 2016 16:53:05 -0400

Antoine,

I think I know what that problem is. I believe you've stumbled upon this issue:

https://issues.apache.org/jira/browse/DISPATCH-496

Your second delivery, the one resulting in a timeout, is causing the inbound link to be blocked (i.e. it has undelivered messages). When the broker reattaches, the blocked links are supposed to become unblocked but they don't in the case of auto-links.
Re: Testing failover on dispatcher/java-broker cluster
Sorry, those Jira numbers and descriptions are mismatched. Here's the correct list:

- DISPATCH-496 - Activation of an autolink does not result in issuing credit to a blocked sender
- DISPATCH-505 - Eventual loss of credit on inter-router control links when the topology changes
- DISPATCH-523 - Topology changes can cause in-flight deliveries to be stuck in the ingress router

On 09/29/2016 10:35 AM, Ted Ross wrote:

On 09/24/2016 05:32 AM, Adel Boutros wrote:

We are indeed in favor of a minor release as long as the latest version is still 0.6.x and we are willing to re-launch our tests and give feedback on the release candidate once provided (It shouldn't take us more than a day to compile and test). Do you have a list of fixes in mind?

I've identified three fixes that look like good candidates for 0.6.2:

- DISPATCH-496 - Topology changes can cause in-flight deliveries to be stuck in the ingress router
- DISPATCH-505 - Eventual loss of credit on inter-router control links when the topology changes
- DISPATCH-523 - Activation of an autolink does not result in issuing credit to a blocked sender

These are all stability-related issues.

Thoughts?

-Ted

Regards,
Adel

Subject: Re: Testing failover on dispatcher/java-broker cluster
To: users@qpid.apache.org
From: tr...@redhat.com
Date: Fri, 23 Sep 2016 17:23:57 -0400

Hi Adel,

A minor release is always possible. It's up to us, the community, to decide whether and when to produce one. I'm in favor of releasing an 0.6.2 with some small backports to fix bugs for users that want to stay on Proton 0.12.

-Ted

On 09/23/2016 09:44 AM, Adel Boutros wrote:

Hello Ted,

Did you happen to have the time to check if a minor release is possible?

Regards,
Adel

From: adelbout...@live.com
To: users@qpid.apache.org
Subject: RE: Testing failover on dispatcher/java-broker cluster
Date: Tue, 20 Sep 2016 15:13:03 +0200

Hello Ted,

I confirm the fix solved the issue.

Would it be possible to do a 0.6.2 release?
We cannot compile newer versions of Proton (We currently use 0.12.2) due to lack of resources from our side and we really need this fix for our tests.

Regards,
Adel

Subject: Re: Testing failover on dispatcher/java-broker cluster
To: users@qpid.apache.org
From: tr...@redhat.com
Date: Mon, 19 Sep 2016 12:18:23 -0400

Hi Adel,

It's a one-liner and it applies cleanly to the 0.6.x branch.

https://git-wip-us.apache.org/repos/asf?p=qpid-dispatch.git;h=41b7407

-Ted

On 09/19/2016 11:41 AM, Adel Boutros wrote:

Hello Ted,

Antoine is on vacation so I will be taking over this task.

Does this fix have any dependencies? We would like to apply it on 0.6.1 without other fixes because it seems the master branch requires proton 0.13.0 minimum whereas we have currently 0.12.2 and we cannot upgrade for the time being.

Regards,
Adel

Subject: Re: Testing failover on dispatcher/java-broker cluster
To: users@qpid.apache.org
From: tr...@redhat.com
Date: Fri, 16 Sep 2016 16:53:05 -0400

Antoine,

I think I know what that problem is. I believe you've stumbled upon this issue:

https://issues.apache.org/jira/browse/DISPATCH-496

Your second delivery, the one resulting in a timeout, is causing the inbound link to be blocked (i.e. it has undelivered messages). When the broker reattaches, the blocked links are supposed to become unblocked but they don't in the case of auto-links.

This has been fixed on the master branch if you'd like to try applying the patch.

-Ted

On 09/15/2016 04:56 AM, Antoine Chevin wrote:

Hi Ted,

You’re right, the connection close looked strange before stopping of the broker. I manually added the annotation (# stopping the broker) and was wrong about the position of this one. I replayed the test and the connection close happens *after* the broker stop. I assume it is the broker that initiates it.

I found something interesting. In my test, I always sent a message when the broker is down, expecting to get a JmsSendTimedOutException (waiting for the disposition frame).
I assumed this was harmless. But it turns out it is not. When I don’t do that, I can send a message after the broker restart.

So to sum up the experiment I did:

* I use Wireshark between the JMS client and the dispatcher. *

1) Using JMS I establish a connection to the dispatcher and create a message producer (Wireshark: connection open -> attach)
2) I’m able to send a message to the broker through the dispatcher (Wireshark: transfer -> disposition)
3) I stop the broker
4) With the same link, I send a message and I get a JmsSendTimedOutException (waiting for the disposition frame) (Wireshark: transfer)
5) I restart the broker
6) With the same link, I try to send a message and I get a JmsSendTimedOutException for the same reason (waiting for the disposition frame) (Wireshark: transfer)

If I skip step (4), I cannot reproduce
Re: Testing failover on dispatcher/java-broker cluster
On 09/29/2016 10:47 AM, Adel Boutros wrote:
> They seem fair enough and quite related.
>
> As a side note, I have a bug with the dispatch router 0.6.1 but I haven't submitted it yet because I haven't reduced the test case. In summary: I connect 2 dispatchers (inter-router) and then delete the "inter-router" connector/listener. If I then delete and recreate a mobile address which has received a message on one of the dispatchers, the "in" and "out" stats shown by "qdstat -a" do not reset to 0 but remain at the old values. However, they reset correctly on the other router.

What exactly do you mean by "delete and recreate a mobile address"? If an address is removed from the table, the next time it appears, a new record will be created for that address. The new record will have zeroed statistics. What behavior are you expecting?

> Have you encountered something similar? Once I have a reduced test case, I will post it in a different thread of course.
>
> Regards,
> Adel
>
> From: Ted Ross
> Sent: Thursday, September 29, 2016 4:38:26 PM
> To: users@qpid.apache.org
> Subject: Re: Testing failover on dispatcher/java-broker cluster
>
> Sorry, those Jira numbers and descriptions are mismatched. Here's the correct list:
>
> - DISPATCH-496 - Activation of an autolink does not result in issuing credit to a blocked sender
> - DISPATCH-505 - Eventual loss of credit on inter-router control links when the topology changes
> - DISPATCH-523 - Topology changes can cause in-flight deliveries to be stuck in the ingress router
>
> On 09/29/2016 10:35 AM, Ted Ross wrote:
>>
>> On 09/24/2016 05:32 AM, Adel Boutros wrote:
>>> We are indeed in favor of a minor release as long as the latest version is still 0.6.x and we are willing to re-launch our tests and give feedback on the release candidate once provided (It shouldn't take us more than a day to compile and test).
>>> Do you have a list of fixes in mind?
I've identified three fixes that look like good candidates for 0.6.2: - DISPATCH-496 - Topology changes can cause in-flight deliveries to be stuck in the ingress router - DISPATCH-505 - Eventual loss of credit on inter-router control links when the topology changes - DISPATCH-523 - Activation of an autolink does not result in issuing credit to a blocked sender These are all stability-related issues. Thoughts? -Ted Regards,Adel Subject: Re: Testing failover on dispatcher/java-broker cluster To: users@qpid.apache.org From: tr...@redhat.com Date: Fri, 23 Sep 2016 17:23:57 -0400 Hi Adel, A minor release is always possible. It's up to us, the community, to decide whether and when to produce one. I'm in favor of releasing an 0.6.2 with some small backports to fix bugs for users that want to stay on Proton 0.12. -Ted On 09/23/2016 09:44 AM, Adel Boutros wrote: Hello Ted, Did you happen to have the time to check if a minor release is possible? Regards,Adel From: adelbout...@live.com To: users@qpid.apache.org Subject: RE: Testing failover on dispatcher/java-broker cluster Date: Tue, 20 Sep 2016 15:13:03 +0200 Hello Ted, I confirm the fix solved the issue. Would it be possible to do a 0.6.2 release? We cannot compile newer versions of Proton (We currently use 0.12.2) due to lack of resources from our side and we really need this fix for our tests. Regards, Adel Subject: Re: Testing failover on dispatcher/java-broker cluster To: users@qpid.apache.org From: tr...@redhat.com Date: Mon, 19 Sep 2016 12:18:23 -0400 Hi Adel, It's a one-liner and it applies cleanly to the 0.6.x branch. https://git-wip-us.apache.org/repos/asf?p=qpid-dispatch.git;h=41b7407 -Ted On 09/19/2016 11:41 AM, Adel Boutros wrote: Hello Ted, Antoine is on vacation so I will be taking over this task. Does this fix have any dependencies? 
We would like to apply it on 0.6.1 without other fixes because it seems the master branch requires proton 0.13.0 minimum whereas we have currently 0.12.2 and we cannot upgrade for the time being.

Regards,
Adel

Subject: Re: Testing failover on dispatcher/java-broker cluster
To: users@qpid.apache.org
From: tr...@redhat.com
Date: Fri, 16 Sep 2016 16:53:05 -0400

Antoine,

I think I know what that problem is. I believe you've stumbled upon this issue:

https://issues.apache.org/jira/browse/DISPATCH-496

Your second delivery, the one resulting in a timeout, is causing the inbound link to be blocked (i.e. it has undelivered messages). When the broker reattaches, the blocked links are supposed to become unblocked but they don't in the case of auto-links.

This has been fixed on the master branch if you'd like to try applying the patch.

-Ted

On 09/15/2016 04:56 AM, Antoine Chevin wrote:

Hi Ted,

You’re right, the connection close looked strange before stopping of the broker. I manually added the annotation (# stopping the
link delivery priority
Hi,

I'm trying to set up two concurrent receiver links on the same broker: one to listen to messages from a queue (e.g. `test.queue`) as a consumer/worker, another to listen to the same queue but with a distribution mode of `copy` in order to passively track some information without corrupting the behavior of the first consumer. Is this possible?

It seems from preliminary testing that I can't guarantee the order that these links are created, and therefore if the first receiver picks up the message it will never be delivered to the passive copy receiver. I understand this is sort of a contrived scenario, but I'm not seeing in the spec how I otherwise might give priority to a given queue. Perhaps qpidd has some internal means of doing this?

Regards,
Matt
Re: Testing failover on dispatcher/java-broker cluster
On 09/24/2016 05:32 AM, Adel Boutros wrote: We are indeed in favor of a minor release as long as the latest version is still 0.6.x and we are willing to re-launch our tests and give feedback on the release candidate once provided (It shouldn't take us more than a day to compile and test). Do you have a list of fixes in mind? I've identified three fixes that look like good candidates for 0.6.2: - DISPATCH-496 - Topology changes can cause in-flight deliveries to be stuck in the ingress router - DISPATCH-505 - Eventual loss of credit on inter-router control links when the topology changes - DISPATCH-523 - Activation of an autolink does not result in issuing credit to a blocked sender These are all stability-related issues. Thoughts? -Ted Regards,Adel Subject: Re: Testing failover on dispatcher/java-broker cluster To: users@qpid.apache.org From: tr...@redhat.com Date: Fri, 23 Sep 2016 17:23:57 -0400 Hi Adel, A minor release is always possible. It's up to us, the community, to decide whether and when to produce one. I'm in favor of releasing an 0.6.2 with some small backports to fix bugs for users that want to stay on Proton 0.12. -Ted On 09/23/2016 09:44 AM, Adel Boutros wrote: Hello Ted, Did you happen to have the time to check if a minor release is possible? Regards,Adel From: adelbout...@live.com To: users@qpid.apache.org Subject: RE: Testing failover on dispatcher/java-broker cluster Date: Tue, 20 Sep 2016 15:13:03 +0200 Hello Ted, I confirm the fix solved the issue. Would it be possible to do a 0.6.2 release? We cannot compile newer versions of Proton (We currently use 0.12.2) due to lack of resources from our side and we really need this fix for our tests. Regards, Adel Subject: Re: Testing failover on dispatcher/java-broker cluster To: users@qpid.apache.org From: tr...@redhat.com Date: Mon, 19 Sep 2016 12:18:23 -0400 Hi Adel, It's a one-liner and it applies cleanly to the 0.6.x branch. 
https://git-wip-us.apache.org/repos/asf?p=qpid-dispatch.git;h=41b7407

-Ted

On 09/19/2016 11:41 AM, Adel Boutros wrote:

Hello Ted,

Antoine is on vacation so I will be taking over this task.

Does this fix have any dependencies? We would like to apply it on 0.6.1 without other fixes because it seems the master branch requires proton 0.13.0 minimum whereas we have currently 0.12.2 and we cannot upgrade for the time being.

Regards,
Adel

Subject: Re: Testing failover on dispatcher/java-broker cluster
To: users@qpid.apache.org
From: tr...@redhat.com
Date: Fri, 16 Sep 2016 16:53:05 -0400

Antoine,

I think I know what that problem is. I believe you've stumbled upon this issue:

https://issues.apache.org/jira/browse/DISPATCH-496

Your second delivery, the one resulting in a timeout, is causing the inbound link to be blocked (i.e. it has undelivered messages). When the broker reattaches, the blocked links are supposed to become unblocked but they don't in the case of auto-links.

This has been fixed on the master branch if you'd like to try applying the patch.

-Ted

On 09/15/2016 04:56 AM, Antoine Chevin wrote:

Hi Ted,

You’re right, the connection close looked strange before stopping of the broker. I manually added the annotation (# stopping the broker) and was wrong about the position of this one. I replayed the test and the connection close happens *after* the broker stop. I assume it is the broker that initiates it.

I found something interesting. In my test, I always sent a message when the broker is down, expecting to get a JmsSendTimedOutException (waiting for the disposition frame). I assumed this was harmless. But it turns out it is not. When I don’t do that, I can send a message after the broker restart.

So to sum up the experiment I did:

* I use Wireshark between the JMS client and the dispatcher. *

1) Using JMS I establish a connection to the dispatcher and create a message producer (Wireshark: connection open -> attach)
2) I’m able to send a message to the broker through the dispatcher (Wireshark: transfer -> disposition)
3) I stop the broker
4) With the same link, I send a message and I get a JmsSendTimedOutException (waiting for the disposition frame) (Wireshark: transfer)
5) I restart the broker
6) With the same link, I try to send a message and I get a JmsSendTimedOutException for the same reason (waiting for the disposition frame) (Wireshark: transfer)

If I skip step (4), I cannot reproduce step (6) and my messages arrive (Wireshark: transfer -> disposition) to the restarted broker.

I hope it makes it clearer for you. Sorry for my rookie mistakes :-).

Note: My colleague and I ran a small experiment to identify if the problem comes from JMS or the AMQP protocol. He changed the code of the java broker to not send the disposition frame one time out of two. We got these results:

* I use Wireshark between the JMS client and the patched broker. *

1) Using JMS I
Re: link delivery priority
On Thu, Sep 29, 2016 at 5:16 PM, Gordon Sim wrote:

> On 29/09/16 15:43, Matt Broadstone wrote:
>
>> I'm trying to set up two concurrent receiver links on the same broker: one
>> to listen to messages from a queue (e.g. `test.queue`) as a
>> consumer/worker, another to listen to the same queue but with a
>> distribution mode of `copy` in order to passively track some information
>> without corrupting the behavior of the first consumer. Is this possible?
>>
>> It seems from preliminary testing that I can't guarantee the order that
>> these links are created, and therefore if the first receiver picks up the
>> message it will never be delivered to the passive copy receiver. I
>> understand this is sort of a contrived scenario, but I'm not seeing in the
>> spec how I otherwise might give priority to a given queue. Perhaps qpidd
>> has some internal means of doing this?
>>
>
> Would it be possible to have messages that are published to test.queue go
> through an exchange, and then bind another queue to that exchange as well
> which would then contain the same set of messages?
>

That's exactly what I ended up doing :) Glad to see I wasn't on the completely wrong path here. It did seem pretty obvious that what I was asking for sounded a lot more like an exchange.

Thanks for the response Gordon, sorry for the noise

Matt

> -
> To unsubscribe, e-mail: users-unsubscr...@qpid.apache.org
> For additional commands, e-mail: users-h...@qpid.apache.org
Re: link delivery priority
On 29/09/16 15:43, Matt Broadstone wrote:
> I'm trying to set up two concurrent receiver links on the same broker: one to listen to messages from a queue (e.g. `test.queue`) as a consumer/worker, another to listen to the same queue but with a distribution mode of `copy` in order to passively track some information without corrupting the behavior of the first consumer. Is this possible?
>
> It seems from preliminary testing that I can't guarantee the order that these links are created, and therefore if the first receiver picks up the message it will never be delivered to the passive copy receiver. I understand this is sort of a contrived scenario, but I'm not seeing in the spec how I otherwise might give priority to a given queue. Perhaps qpidd has some internal means of doing this?

Would it be possible to have messages that are published to test.queue go through an exchange, and then bind another queue to that exchange as well which would then contain the same set of messages?
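
For readers following the thread, the exchange suggestion above can be pictured with a small toy model in Python. This is only a sketch of the idea and not qpidd code or any real client API: the `Queue` and `FanoutExchange` classes and the `test.audit` queue name are made up for illustration, and the assumption is simply that the exchange copies each published message to every bound queue.

```python
from collections import deque


class Queue:
    """Toy queue: each message is handed to exactly one consumer."""

    def __init__(self, name):
        self.name = name
        self.messages = deque()

    def consume(self):
        # Destructive read: once taken, no other consumer of this queue sees it.
        return self.messages.popleft() if self.messages else None


class FanoutExchange:
    """Toy exchange: copies every published message to all bound queues."""

    def __init__(self):
        self.bound = []

    def bind(self, queue):
        self.bound.append(queue)

    def publish(self, message):
        for queue in self.bound:
            queue.messages.append(message)


work = Queue("test.queue")
audit = Queue("test.audit")  # hypothetical name for the passive tracking queue
exchange = FanoutExchange()
exchange.bind(work)
exchange.bind(audit)

exchange.publish("job-1")
exchange.publish("job-2")

# Each receiver drains its own queue, so attach order no longer matters.
worker_got = [work.consume(), work.consume()]
tracker_got = [audit.consume(), audit.consume()]
print(worker_got, tracker_got)
```

Because the worker and the tracker each read from their own queue, neither can take a message away from the other, which is the property the two-links-on-one-queue setup could not guarantee.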