On 04/29/2015 08:03 PM, Matt Broadstone wrote:
On Wed, Apr 29, 2015 at 3:01 PM, Matt Broadstone <[email protected]> wrote:

On Wed, Apr 29, 2015 at 2:55 PM, Gordon Sim <[email protected]> wrote:

On 04/29/2015 05:46 PM, Matt Broadstone wrote:

Hi,

I have a service using the C++ Messaging API which connects to a single
instance of qpidd (currently on the same machine). It seems to crash with
this exception every couple of days under moderate load:

qpidd[68257]: 2015-04-28 11:56:38 [Broker] error
qpid.192.168.2.225:5672-192.168.2.148:60492: resource-limit-exceeded:
Maximum depth exceeded on

b1386bee-a36c-449d-953f-c25f4842e76d_hive.guest.metadata_7bf9355b-524b-4853-89bd-1848366cd21f:
current=[count: 389438, size: 104857546], max=[size: 104857600]
(/build/buildd/qpid-cpp-0.28/src/qpid/broker/Queue.cpp:1575)
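(For reference: 104857600 bytes is 100 MiB, which matches qpidd's default
per-queue size limit (--default-queue-limit), so no explicit limit appears
to have been configured; and 104857546 / 389438 works out to an average
message size of roughly 269 bytes.)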

Using qpid-stat I never see the queue depth increase above 0 (and depth,
I gather from reading the code, is what triggers the exception); however,
I -do- notice that the "acquired" count increases with every message,
with no corresponding "release" (the release count is always 0).


That's actually 'expected', in terms of the code. It only increments the
released count when a message is released back to the queue, rather than
being acknowledged and dequeued. Also, there is nothing at present that
decrements the acquired count, so it would be expected to keep going up.


Okay, good to know; I just wanted to make sure I wasn't seeing a huge
problem with improperly handled messages here.


The exception above is indeed a result of the queue backing up,
apparently reaching a depth of 389438 messages. What address options, if
any, are used for the receiver consuming from that queue? Is there
anything to indicate whether that receiver was behaving normally just
before the point at which the error occurred?
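For example, since the queue in the error looks like a private
subscription queue for a topic, its limits could be raised through the
link's x-declare arguments in the receiver's address string. A minimal
sketch, assuming hive.guest.metadata is a topic exchange (the 200 MiB
limit and ring policy are illustrative, not a recommendation):

    #include <qpid/messaging/Connection.h>
    #include <qpid/messaging/Receiver.h>
    #include <qpid/messaging/Session.h>

    using namespace qpid::messaging;

    int main() {
        Connection connection("localhost:5672");
        connection.open();
        Session session = connection.createSession();
        // Subscription-queue arguments go in the link's x-declare section
        // when the source is an exchange; values here are illustrative.
        Receiver receiver = session.createReceiver(
            "hive.guest.metadata; {link: {x-declare: {arguments: "
            "{'qpid.max_size': 209715200, 'qpid.policy_type': ring}}}}");
        // ... consume as usual ...
        connection.close();
        return 0;
    }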


I'm using no address options at all. The two programs I submitted earlier
(mqget/mqsend) are reduced examples of what we're using, except that the
receiver in my case uses the "multiple receivers" pattern
(Session::nextReceiver() followed by fetch(), etc.). Aside from that it's
very "vanilla" right now. AFAICT everything was fine, until it wasn't.
The original bug occurred with version 0.28, so maybe there's an issue
with the fact that it was still using the legacy store? However,
everything I see here indicates nothing ever touched the disk (these are
just messages being published to a topic). As for the receiving side,
each receiver (and this one in particular) is set to a prefetch
(capacity) of 10.
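For completeness, a minimal sketch of that receive loop, assuming a
single receiver on the topic (broker address and topic name are taken
from this thread; error handling is reduced to the essentials):

    #include <iostream>
    #include <qpid/messaging/Connection.h>
    #include <qpid/messaging/Message.h>
    #include <qpid/messaging/Receiver.h>
    #include <qpid/messaging/Session.h>

    using namespace qpid::messaging;

    int main() {
        Connection connection("localhost:5672");
        try {
            connection.open();
            Session session = connection.createSession();
            Receiver receiver = session.createReceiver("hive.guest.metadata");
            receiver.setCapacity(10); // the prefetch of 10 described above
            while (true) {
                // nextReceiver() blocks until one of the session's
                // receivers has a message available; fetch() retrieves it.
                Message message = session.nextReceiver().fetch();
                std::cout << message.getContent() << std::endl;
                // Without acknowledge() the broker keeps messages on the
                // queue and the prefetch window is never replenished.
                session.acknowledge();
            }
        } catch (const std::exception& error) {
            std::cerr << error.what() << std::endl;
            connection.close();
            return 1;
        }
    }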

What seems particularly strange to me is that the backlog is hundreds of
thousands of messages; how could that even be possible? Right now we have
about 10 producers publishing every ~6 seconds.
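For what it's worth, if each of the ~10 producers publishes once every
~6 seconds, that is about 1.7 messages per second, or roughly 144,000 per
day; a backlog of 389,438 messages would then correspond to about 2.7
days of traffic with nothing being consumed, which lines up with the
failures every couple of days.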

Matt



Also, what is the recommended failover strategy for this situation?
Basically what happened for us is that this "situation" occurred, we then
no longer received ANY messages on that receiver, and it took our whole
system down. The "workaround" was to simply restart the qpidd process.

Did you restart the clients (do you use auto reconnect)? Did you run qpid-stat at the time the incident occurred? Did you try restarting the receiver before restarting qpidd?
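If the clients don't already use it, automatic reconnect is enabled
through connection options. A minimal sketch (the interval and retry
limit below are illustrative):

    #include <qpid/messaging/Connection.h>

    using namespace qpid::messaging;

    int main() {
        // reconnect_interval is in seconds; reconnect_limit caps retries.
        Connection connection("localhost:5672",
            "{reconnect: true, reconnect_interval: 5, reconnect_limit: 10}");
        connection.open();
        // ...
        connection.close();
        return 0;
    }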

