Hi Mike,
On Mon, 6 Aug 2018 at 18:23, Dyslin, Mike <[email protected]> wrote:

> Rob,
>
> I created JIRA QPID-8225 with Title "Java Broker (7.0.6) stops delivering
> queue/consumer messages after 4 GB data transfer".
>
> The log you requested is attached to the JIRA.
>
> The log contains the connection startup, 10 messages flowing (2 every 10
> seconds), then connection close.  It does not contain when message flow
> stopped (credit exhausted?).  I can get that for you if you like.
>
> We did see this in the log:
>
> 2018-08-06 07:57:34,100 DEBUG [IO-/15.252.32.148:43266]
> (o.a.q.s.p.v.ServerConnection) - RECV: [conn:6cd338b5] ch=1
> MessageSetFlowMode(destination=dsaqp15, flowMode=CREDIT)
> 2018-08-06 07:57:34,102 DEBUG [IO-/15.252.32.148:43266]
> (o.a.q.s.p.v.ServerConnection) - RECV: [conn:6cd338b5] ch=1
> MessageFlow(destination=dsaqp15, unit=MESSAGE, value=4294967295)
> 2018-08-06 07:57:34,102 DEBUG [IO-/15.252.32.148:43266]
> (o.a.q.s.p.v.ServerConnection) - RECV: [conn:6cd338b5] ch=1
> MessageFlow(destination=dsaqp15, unit=BYTE, value=4294967295)
>
> And we never saw anything that looked like credit was being reset by the
> consumer.
>
>
OK - this actually  matches up with what I suspected looking more deeply at
the code this morning.  The value 4294967295 is supposed to be a magic
number meaning "infinite" credit, however the Java Broker is mishandling
this, expecting it to be passed as -1, whereas in fact a lower layer in the
code is instead passing the value as 4294967295 in a signed 64bit value.  I
just committed a fix to master, but I'm not sure when that might be
released as a new version.


> So, perhaps the 4 GB credit simply is all used up and messages stop
> flowing.
>
> How do we change FlowMode to OFF, or if that is not possible, how do we
> reset our credit back to 4 GB in the QPID C++ API?
>
>
So, I'm not familiar with the C++ API, if there were an obvious way to
switch from "credit" mode to "window" mode (which is the protocol default)
then that would likely work around the problem.


> Since we don't see any of our code setting any kind of flow control
> option, is the C++ API sending flow control default setting to the broker,
> or is the broker setting flow control default because the producer has not
> specified any flow control?  If the latter, then perhaps the C++ and Java
> brokers have different flow control defaults.  If the former, then C++
> broker and the java broker are not implementing specified the flow control
> in the same way.
>
> Again, our original code was only tested with C++ brokers.
>
>
Yeah - sadly this is a Java Broker bug you've discovered.  I've uploaded a
jar file with the proposed fix into the JIRA (just replace the existing jar
of the same name in lib/broker-plugins).  Alternatively you can build this
yourself by doing the following:

git clone https://git-wip-us.apache.org/repos/asf/qpid-broker-j.git

cd qpid-broker-j/

git checkout 7.0.6

git cherry-pick cf40fdea39d9633702ee286d94e950a19ec7be74
mvn package

 Apologies you've run into this :-(  I think most clients are using window,
or non-infinite credit - that's the only reason I can think that this has
not been spotted before.

If you give the jar a try and it works (or doesn't work) for you, please
let us know.

-- Rob

Thanks,
> Mike
>
>
> -----Original Message-----
> From: Rob Godfrey [mailto:[email protected]]
> Sent: Friday, August 3, 2018 3:34 PM
> To: [email protected]
> Cc: Mears, David B <[email protected]>; Herren, Elaine <
> [email protected]>; Rao, Shobha (NonStop) <[email protected]
> >
> Subject: Re: Java Broker (7.0.6) stops delivering queue/consumer messages
> after 4 GB data transfer
>
> On Sat, 4 Aug 2018 at 00:11, Dyslin, Mike <[email protected]> wrote:
>
> > Rob,
> >
> > We do not know how our consumer is managing credit, or anything about
> > credit flow, but have started looking for information on this.  It
> > sounds like the right place to look.  If we could figure out how to
> > turn off the flow control limits, that may do it.  We inherited most
> > of this code, and the original authors of the code are long gone, so
> > we don't have the experience of putting it all together.  Perhaps the
> > defaults for this credit flow is different between the C++ and Java
> brokers.
> >
> > OK, we'll try to figure out how to get protocol logging turned on and
> > get a log file to look at.
> >
>
> Thanks - I had a quick look at the Java Broker code for managing credit
> and didn't immediately see any obvious errors that would hit a properly
> functioning client, *however* it does look like that if the client is
> improperly managing credit there would be an overflow error where a value
> that should be an unsigned integral value (the amount of remaining credit)
> will turn negative.  The broker is storing the amount of outstanding credit
> as a (signed) Java long value.  AMQP 0-10 defines the message.flow command
> as "This command controls the flow of message data to a given destination.
> It is used by the recipient of messages to dynamically match the incoming
> rate of message flow to its processing or forwarding capacity. Upon receipt
> of this command, the sender must add "value" number of the specified unit
> to the available credit balance for the specified destination."  So if the
> client is repeatedly adding more credit than it needs the stored "limit" in
> the broker might overflow and turn negative.  At this point the broker
> would stop sending messages.
>
> -- Rob
>
>
>
> >
> > Thanks,
> > Mike
> >
> > -----Original Message-----
> > From: Rob Godfrey [mailto:[email protected]]
> > Sent: Friday, August 3, 2018 11:11 AM
> > To: [email protected]
> > Cc: Mears, David B <[email protected]>; Herren, Elaine <
> > [email protected]>; Rao, Shobha (NonStop)
> > <[email protected]
> > >
> > Subject: Re: Java Broker (7.0.6) stops delivering queue/consumer
> > messages after 4 GB data transfer
> >
> > On Fri, 3 Aug 2018 at 19:18, Dyslin, Mike <[email protected]> wrote:
> >
> > > Rob,
> > >
> > > Our consumer client version is 1.37:
> > >
> > >    bash 0 3: rpm -qa | grep -i qpid
> > >    qpid-cpp-client-devel-1.37.0-1.el7.x86_64
> > >    python-qpid-1.37.0-1.el7.noarch
> > >    qpid-qmf-1.37.0-1.el7.x86_64
> > >    qpid-cpp-client-1.37.0-1.el7.x86_64
> > >    python-qpid-qmf-1.37.0-1.el7.x86_64
> > >    qpid-cpp-server-1.37.0-1.el7.x86_64
> > >    qpid-proton-c-0.18.1-1.el7.x86_64
> > >    qpid-tools-1.37.0-1.el7.noarch
> > >
> > > Our producer client version is 6.1.5:
> > >     -rw-r--r--    1 NSDA.NSDA          NSDA      570873 Jul 26 15:42
> > > qpid-client-6.1.5.jar
> > >     -rw-r--r--    1 NSDA.NSDA          NSDA      864493 Jul 26 15:42
> > > qpid-common-6.1.5.jar
> > >
> > > Both producer and consumer clients use AMQP_0_10 protocol.
> > >
> >
> > OK - so my assumption here is that there is some issue in the
> > management of credit in the broker (or possibly in the client).  AMQP
> > 0-10 has two distinct credit flow modes "credit" and "window" and also
> > allows the consumer to separately set limits for both "message" and
> "byte" credit.
> > I'm not very familiar with the C++ API, but do you know how your
> > consumer is managing credit?
> >
> > One thing that would be very helpful in trying to diagnose this
> > problem is getting protocol logging for the consumer (at least for the
> > start of the consumer where it sets up the credit flow mode, and
> > towards the end where it would be interesting to see the credit being
> > allocated just before message flow stops).
> >
> > -- Rob
> >
> >
> > >
> > > Thanks for your assistance,
> > > Mike
> > >
> > > FYI - There may be a typo on the past releases web page.  I believe
> > "2017"
> > > should be "2018" in "Qpid JMS AMQP 0-x 6.3.2, July 2017".  URL:
> > > https://qpid.apache.org/releases/index.html#past-releases
> > >
> > > -----Original Message-----
> > > From: Rob Godfrey [mailto:[email protected]]
> > > Sent: Thursday, August 2, 2018 5:57 PM
> > > To: [email protected]
> > > Cc: Mears, David B <[email protected]>; Herren, Elaine <
> > > [email protected]>; Rao, Shobha (NonStop)
> > > <[email protected]
> > > >
> > > Subject: Re: Java Broker (7.0.6) stops delivering queue/consumer
> > > messages after 4 GB data transfer
> > >
> > > Hi Mike,
> > >
> > > On Fri, 3 Aug 2018 at 01:25, Dyslin, Mike <[email protected]> wrote:
> > >
> > > > This is my first submit to this email group.  Hopefully this is
> > > > the correct place to post this problem.
> > > >
> > > >
> > > This is exactly the right place to post this problem.
> > >
> > >
> > > > We are running a continuous stream of message (about 5K each) from
> > > > producer to consumer over a single java broker queue at a rate of
> > > > about 600 messages/second.  Outbound message flow stops after
> > > > transferring 4 GB of message data (about 770,000 messages in 25
> > > > minutes).  The Web Management Console page for our consumer
> > > > connection
> > > shows the total "Outbound Bytes"
> > > > growing steadily until it reaches 4.0 GB and stops with "Last I/O
> time"
> > > > unchanging thereafter.
> > > >
> > > > After outbound messages stop:
> > > > Inbound messages continue on the producer connection (well past
> > > > 4.0
> > > > GB) and are kept in the queue until they expire with a
> > > > time-to-live value of 3 minutes.  The queue grows until is
> > > > stabilizes with a steady
> > > > 600 m/s inbound, and 600 m/s expiring and being deleted from the
> > > > queue (as expected).  The Web Management Console shows that the
> > > > consumer connection remains open and is a consumer on the queue,
> > > > and the queue shows the connection as a consumer on the queue.
> > > >
> > > > If I run the exact same test replacing the Java Broker with a C++
> > > > broker (1.37.0), message flow continues well past the 4 GB barrier.
> > > > I kept it running for about 17 hours reaching about 37 million
> > > > messages, about 180 GB data transferred on the queue.
> > > >
> > > > Since the only difference seems to be the broker, this seems to
> > > > point to a problem with the Java Broker, and not issues with our
> > > > producer, consumer or network issues.  Could there be some problem
> > > > with our java broker configuration that would explain this behaviour?
> > > >
> > >
> > > Unfortunately this sounds like it may be a bug in the Java Broker
> > > :-(
> > >
> > >
> > > >
> > > > Has anyone out there experienced more than 4 GB of outbound data
> > > > on a single java broker connection or queue?
> > > >
> > > >
> > > Can you confirm which client you are using, and which version of
> > > AMQP is in use (as you have identified I don't expect this to be a
> > > client problem, but knowing the client will help us track down the
> > > issue in the
> > broker)?
> > >
> > >
> > > > Any help would be appreciated.
> > > >
> > > > Other comments/observations:
> > > >
> > > > I do not know if the 4 GB barrier is associated with the
> > > > connection and/or the queue because all our message traffic is
> > > > over one consumer connection and one queue.  I could determine
> > > > this by changing our consumer code to spread message traffic over
> > > > one connection and multiple
> > > queues.
> > >
> > >
> > >
> > > >
> > > >
> > > We are using the heartbeat feature with a 5 minute timeout.  Since
> > > the
> > > > connection stays open beyond the 5 minute timeout after the
> > > > messages stop, I assume the heartbeat messages are still being
> > > > sent between consumer and broker, indicating that the consumer and
> > > > broker are able to communicate over the socket.  It has been
> > > > awhile since I have tested that the heartbeat feature is working
> correctly.
> > > >
> > > > If I close the consumer connection from the Web Management
> > > > Console, the broker deletes the queue (I believe) and our consumer
> > > > detects the closed connection, establishes a new connection and
> > > > new queue, and messages start flowing again until . . . we reach
> > > > the 4 GB barrier and messages stop being delivered once again.
> > > >
> > > > We have run with the Java Broker on both Linux (RHEL 7.4) and
> > > > proprietary NonStop POSIX platform with the same results.
> > > > Unfortunately, the C++ broker is not yet an option on the NonStop
> > > > POSIX platform where we require the broker to be.
> > > >
> > > >
> > > Hopefully we can quickly track down the issue in the Java Broker and
> > > push out a fix,
> > >
> > >
> > > > Thanks,
> > > > Mike
> > > >
> > > >
> > > Apologies you've run into this issue, Rob
> > >
> > >
> > > > --------------------------------------------------------------------
> > > > - To unsubscribe, e-mail: [email protected] For
> > > > additional commands, e-mail: [email protected]
> > > >
> > > >
> > >
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>
>

Reply via email to