Sandy Pratt wrote:
I've been experimenting with failover on a cluster of two brokers, and
I often see this log item when a broker fails:

2009-mar-04 17:11:19 debug Exception constructed: Attempted size
underflow on dequeue(21): size: max=104857600, current=0; count:
unlimited; type=flow_to_disk (qpid/broker/QueuePolicy.cpp:54)

What does underflow mean here?

The policy maintains a running count of enqueued messages and their aggregate size. Underflow here means that more data was dequeued than was enqueued, which indicates a logic error.
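
To illustrate the bookkeeping described above, here is a minimal sketch in Python. It is not the broker's actual code (that lives in qpid/broker/QueuePolicy.cpp); the class and method names are hypothetical and it only tracks the two running totals. It shows how dequeuing the same 21-byte message twice would trip the check with current=0, as in the log line:

```python
class SizePolicy:
    """Hypothetical stand-in for the broker's queue policy bookkeeping."""

    def __init__(self, max_size):
        self.max_size = max_size  # configured size limit in bytes
        self.size = 0             # aggregate size of messages currently enqueued
        self.count = 0            # number of messages currently enqueued

    def enqueued(self, msg_size):
        # Record a message being added to the queue.
        self.size += msg_size
        self.count += 1

    def dequeued(self, msg_size):
        # Removing more bytes than are recorded means the totals are
        # out of step with the queue contents: an underflow.
        if msg_size > self.size:
            raise RuntimeError(
                "Attempted size underflow on dequeue(%d): size: max=%d, current=%d"
                % (msg_size, self.max_size, self.size))
        self.size -= msg_size
        self.count -= 1


policy = SizePolicy(max_size=104857600)
policy.enqueued(21)
policy.dequeued(21)        # fine: both totals return to zero
try:
    policy.dequeued(21)    # same message dequeued again: underflow
except RuntimeError as e:
    error_text = str(e)
```

In this sketch the error can only arise if a dequeue is recorded without a matching enqueue, which is why the message points at a bookkeeping bug rather than at anything the client did.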

What version of the code are you using? M4 or latest from trunk?

The broker seems to have died:

[prat...@hsvrhm5 qpidd]$ sudo /sbin/service qpidd status
qpidd dead but pid file exists

The test I was running failed over to the other broker and completed
after a timeout expired.

A subsequent test immediately failed over to the other broker and
completed (which makes sense because qpid on the first broker was
probably dead before it started).

In a general sense, what are the steps required to recover from a
broker failure?  What I am looking for is step #3 below:

Assume a cluster of two brokers, A and B:
1) A dies
2) clients fail over to B
3) do something to recover A without interrupting clients of B
4) A and B are again interchangeable

I've looked through the docs and haven't seen anything about this.
Apologies if I missed it.  I also tried simply restarting A, which
doesn't seem to work.

What errors/symptoms do you see when simply restarting A? (As Carl says, this _should_ be all that is required).



---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:[email protected]
