Hi Mike, On Mon, 6 Aug 2018 at 18:23, Dyslin, Mike <[email protected]> wrote:
> Rob, > > I created JIRA QPID-8225 with Title "Java Broker (7.0.6) stops delivering > queue/consumer messages after 4 GB data transfer". > > The log you requested is attached to the JIRA. > > The log contains the connection startup, 10 messages flowing (2 every 10 > seconds), then connection close. It does not contain when message flow > stopped (credit exhausted?). I can get that for you if you like. > > We did see this in the log: > > 2018-08-06 07:57:34,100 DEBUG [IO-/15.252.32.148:43266] > (o.a.q.s.p.v.ServerConnection) - RECV: [conn:6cd338b5] ch=1 > MessageSetFlowMode(destination=dsaqp15, flowMode=CREDIT) > 2018-08-06 07:57:34,102 DEBUG [IO-/15.252.32.148:43266] > (o.a.q.s.p.v.ServerConnection) - RECV: [conn:6cd338b5] ch=1 > MessageFlow(destination=dsaqp15, unit=MESSAGE, value=4294967295) > 2018-08-06 07:57:34,102 DEBUG [IO-/15.252.32.148:43266] > (o.a.q.s.p.v.ServerConnection) - RECV: [conn:6cd338b5] ch=1 > MessageFlow(destination=dsaqp15, unit=BYTE, value=4294967295) > > And we never saw anything that looked like credit was being reset by the > consumer. > > OK - this actually matches up with what I suspected looking more deeply at the code this morning. The value 4294967295 is supposed to be a magic number meaning "infinite" credit, however the Java Broker is mishandling this, expecting it to be passed as -1, whereas in fact a lower layer in the code is instead passing the value as 4294967295 in a signed 64bit value. I just committed a fix to master, but I'm not sure when that might be released as a new version. > So, perhaps the 4 GB credit simply is all used up and messages stop > flowing. > > How do we change FlowMode to OFF, or if that is not possible, how do we > reset our credit back to 4 GB in the QPID C++ API? > > So, I'm not familiar with the C++ API, if there were an obvious way to switch from "credit" mode to "window" mode (which is the protocol default) then that would likely work around the problem. > Since we don't see any of our code setting any kind of flow control > option, is the C++ API sending flow control default setting to the broker, > or is the broker setting flow control default because the producer has not > specified any flow control? If the latter, then perhaps the C++ and Java > brokers have different flow control defaults. If the former, then C++ > broker and the java broker are not implementing specified the flow control > in the same way. > > Again, our original code was only tested with C++ brokers. > > Yeah - sadly this is a Java Broker bug you've discovered. I've uploaded a jar file with the proposed fix into the JIRA (just replace the existing jar of the same name in lib/broker-plugins). Alternatively you can build this yourself by doing the following: git clone https://git-wip-us.apache.org/repos/asf/qpid-broker-j.git cd qpid-broker-j/ git checkout 7.0.6 git cherry-pick cf40fdea39d9633702ee286d94e950a19ec7be74 mvn package Apologies you've run into this :-( I think most clients are using window, or non-infinite credit - that's the only reason I can think that this has not been spotted before. If you give the jar a try and it works (or doesn't work) for you, please let us know. -- Rob Thanks, > Mike > > > -----Original Message----- > From: Rob Godfrey [mailto:[email protected]] > Sent: Friday, August 3, 2018 3:34 PM > To: [email protected] > Cc: Mears, David B <[email protected]>; Herren, Elaine < > [email protected]>; Rao, Shobha (NonStop) <[email protected] > > > Subject: Re: Java Broker (7.0.6) stops delivering queue/consumer messages > after 4 GB data transfer > > On Sat, 4 Aug 2018 at 00:11, Dyslin, Mike <[email protected]> wrote: > > > Rob, > > > > We do not know how our consumer is managing credit, or anything about > > credit flow, but have started looking for information on this. It > > sounds like the right place to look. If we could figure out how to > > turn off the flow control limits, that may do it. We inherited most > > of this code, and the original authors of the code are long gone, so > > we don't have the experience of putting it all together. Perhaps the > > defaults for this credit flow is different between the C++ and Java > brokers. > > > > OK, we'll try to figure out how to get protocol logging turned on and > > get a log file to look at. > > > > Thanks - I had a quick look at the Java Broker code for managing credit > and didn't immediately see any obvious errors that would hit a properly > functioning client, *however* it does look like that if the client is > improperly managing credit there would be an overflow error where a value > that should be an unsigned integral value (the amount of remaining credit) > will turn negative. The broker is storing the amount of outstanding credit > as a (signed) Java long value. AMQP 0-10 defines the message.flow command > as "This command controls the flow of message data to a given destination. > It is used by the recipient of messages to dynamically match the incoming > rate of message flow to its processing or forwarding capacity. Upon receipt > of this command, the sender must add "value" number of the specified unit > to the available credit balance for the specified destination." So if the > client is repeatedly adding more credit than it needs the stored "limit" in > the broker might overflow and turn negative. At this point the broker > would stop sending messages. > > -- Rob > > > > > > > Thanks, > > Mike > > > > -----Original Message----- > > From: Rob Godfrey [mailto:[email protected]] > > Sent: Friday, August 3, 2018 11:11 AM > > To: [email protected] > > Cc: Mears, David B <[email protected]>; Herren, Elaine < > > [email protected]>; Rao, Shobha (NonStop) > > <[email protected] > > > > > Subject: Re: Java Broker (7.0.6) stops delivering queue/consumer > > messages after 4 GB data transfer > > > > On Fri, 3 Aug 2018 at 19:18, Dyslin, Mike <[email protected]> wrote: > > > > > Rob, > > > > > > Our consumer client version is 1.37: > > > > > > bash 0 3: rpm -qa | grep -i qpid > > > qpid-cpp-client-devel-1.37.0-1.el7.x86_64 > > > python-qpid-1.37.0-1.el7.noarch > > > qpid-qmf-1.37.0-1.el7.x86_64 > > > qpid-cpp-client-1.37.0-1.el7.x86_64 > > > python-qpid-qmf-1.37.0-1.el7.x86_64 > > > qpid-cpp-server-1.37.0-1.el7.x86_64 > > > qpid-proton-c-0.18.1-1.el7.x86_64 > > > qpid-tools-1.37.0-1.el7.noarch > > > > > > Our producer client version is 6.1.5: > > > -rw-r--r-- 1 NSDA.NSDA NSDA 570873 Jul 26 15:42 > > > qpid-client-6.1.5.jar > > > -rw-r--r-- 1 NSDA.NSDA NSDA 864493 Jul 26 15:42 > > > qpid-common-6.1.5.jar > > > > > > Both producer and consumer clients use AMQP_0_10 protocol. > > > > > > > OK - so my assumption here is that there is some issue in the > > management of credit in the broker (or possibly in the client). AMQP > > 0-10 has two distinct credit flow modes "credit" and "window" and also > > allows the consumer to separately set limits for both "message" and > "byte" credit. > > I'm not very familiar with the C++ API, but do you know how your > > consumer is managing credit? > > > > One thing that would be very helpful in trying to diagnose this > > problem is getting protocol logging for the consumer (at least for the > > start of the consumer where it sets up the credit flow mode, and > > towards the end where it would be interesting to see the credit being > > allocated just before message flow stops). > > > > -- Rob > > > > > > > > > > Thanks for your assistance, > > > Mike > > > > > > FYI - There may be a typo on the past releases web page. I believe > > "2017" > > > should be "2018" in "Qpid JMS AMQP 0-x 6.3.2, July 2017". URL: > > > https://qpid.apache.org/releases/index.html#past-releases > > > > > > -----Original Message----- > > > From: Rob Godfrey [mailto:[email protected]] > > > Sent: Thursday, August 2, 2018 5:57 PM > > > To: [email protected] > > > Cc: Mears, David B <[email protected]>; Herren, Elaine < > > > [email protected]>; Rao, Shobha (NonStop) > > > <[email protected] > > > > > > > Subject: Re: Java Broker (7.0.6) stops delivering queue/consumer > > > messages after 4 GB data transfer > > > > > > Hi Mike, > > > > > > On Fri, 3 Aug 2018 at 01:25, Dyslin, Mike <[email protected]> wrote: > > > > > > > This is my first submit to this email group. Hopefully this is > > > > the correct place to post this problem. > > > > > > > > > > > This is exactly the right place to post this problem. > > > > > > > > > > We are running a continuous stream of message (about 5K each) from > > > > producer to consumer over a single java broker queue at a rate of > > > > about 600 messages/second. Outbound message flow stops after > > > > transferring 4 GB of message data (about 770,000 messages in 25 > > > > minutes). The Web Management Console page for our consumer > > > > connection > > > shows the total "Outbound Bytes" > > > > growing steadily until it reaches 4.0 GB and stops with "Last I/O > time" > > > > unchanging thereafter. > > > > > > > > After outbound messages stop: > > > > Inbound messages continue on the producer connection (well past > > > > 4.0 > > > > GB) and are kept in the queue until they expire with a > > > > time-to-live value of 3 minutes. The queue grows until is > > > > stabilizes with a steady > > > > 600 m/s inbound, and 600 m/s expiring and being deleted from the > > > > queue (as expected). The Web Management Console shows that the > > > > consumer connection remains open and is a consumer on the queue, > > > > and the queue shows the connection as a consumer on the queue. > > > > > > > > If I run the exact same test replacing the Java Broker with a C++ > > > > broker (1.37.0), message flow continues well past the 4 GB barrier. > > > > I kept it running for about 17 hours reaching about 37 million > > > > messages, about 180 GB data transferred on the queue. > > > > > > > > Since the only difference seems to be the broker, this seems to > > > > point to a problem with the Java Broker, and not issues with our > > > > producer, consumer or network issues. Could there be some problem > > > > with our java broker configuration that would explain this behaviour? > > > > > > > > > > Unfortunately this sounds like it may be a bug in the Java Broker > > > :-( > > > > > > > > > > > > > > Has anyone out there experienced more than 4 GB of outbound data > > > > on a single java broker connection or queue? > > > > > > > > > > > Can you confirm which client you are using, and which version of > > > AMQP is in use (as you have identified I don't expect this to be a > > > client problem, but knowing the client will help us track down the > > > issue in the > > broker)? > > > > > > > > > > Any help would be appreciated. > > > > > > > > Other comments/observations: > > > > > > > > I do not know if the 4 GB barrier is associated with the > > > > connection and/or the queue because all our message traffic is > > > > over one consumer connection and one queue. I could determine > > > > this by changing our consumer code to spread message traffic over > > > > one connection and multiple > > > queues. > > > > > > > > > > > > > > > > > > > > We are using the heartbeat feature with a 5 minute timeout. Since > > > the > > > > connection stays open beyond the 5 minute timeout after the > > > > messages stop, I assume the heartbeat messages are still being > > > > sent between consumer and broker, indicating that the consumer and > > > > broker are able to communicate over the socket. It has been > > > > awhile since I have tested that the heartbeat feature is working > correctly. > > > > > > > > If I close the consumer connection from the Web Management > > > > Console, the broker deletes the queue (I believe) and our consumer > > > > detects the closed connection, establishes a new connection and > > > > new queue, and messages start flowing again until . . . we reach > > > > the 4 GB barrier and messages stop being delivered once again. > > > > > > > > We have run with the Java Broker on both Linux (RHEL 7.4) and > > > > proprietary NonStop POSIX platform with the same results. > > > > Unfortunately, the C++ broker is not yet an option on the NonStop > > > > POSIX platform where we require the broker to be. > > > > > > > > > > > Hopefully we can quickly track down the issue in the Java Broker and > > > push out a fix, > > > > > > > > > > Thanks, > > > > Mike > > > > > > > > > > > Apologies you've run into this issue, Rob > > > > > > > > > > -------------------------------------------------------------------- > > > > - To unsubscribe, e-mail: [email protected] For > > > > additional commands, e-mail: [email protected] > > > > > > > > > > > > > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [email protected] > For additional commands, e-mail: [email protected] > >
