On 01/03/2012 08:18 PM, Rob Springer wrote:
Hi all,
In our application (we've tried both 0.5 and 0.12), we'd like for our
client programs to quickly recover in the case where a broker dies.
Currently, we're able to do this by dynamically allocating all our
Qpid-using code, and simply re-allocating should the broker die, but
that seems inelegant and feels...wrong.
If we attempt to reconnect and don't create a new Session (i.e., use the
old one), bad things happen (since Session doesn't yet support resume(),
I assume that's expected behavior).
When we then try to create a new Session, a new SubscriptionManager, and
a new Subscription, we get an assertion failure (backtrace at the end of
this message).
After reading the backtrace, I believe the following is happening:
1) In recovery, we attempt to assign a new Subscription to the previous
Subscription variable (i.e., "sub = subMgr->subscribe()")
2) That causes the refcount for the old Subscription to fall to 0,
causing it to be cleaned up.
3) As part of that cleanup, the associated SubscriptionImpl object
goes to destroy its (std::auto_ptr<ScopedDivert>) demuxRule member.
4) That demuxRule member maintains a reference to a Demux object,
demuxer, which exists inside the Session object.

Thus, we have a fatal circle - we need to create a new Session object to
be able to proceed, but when we do so, we render ourselves unable to
re-use Subscription variables.

Unfortunately, I can't think of an easy/simple fix, besides perhaps
adding reference counting to the Demux variable...although I haven't
thought that through at all.

As a workaround, can you first assign a 'null' Subscription to the subscription variable and only then recreate the Session and SubscriptionManager, then finally reassign the variable with the real Subscription?
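To make the ordering concrete, here is a minimal standalone model of the two recovery sequences. It does not use the Qpid headers: Demux, Session, ScopedDivert, Subscription, subscribe() and recover() below are all hypothetical stand-ins, and the weak_ptr is there only so the model can count a cleanup against a destroyed Demux rather than crash the way the real raw reference would.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the classes named in this thread; not the real API.
struct Demux {
    std::vector<std::string> rules;
};

struct Session {
    std::shared_ptr<Demux> demuxer = std::make_shared<Demux>();
};

// Models ScopedDivert: adds a rule on construction, removes it on destruction.
class ScopedDivert {
public:
    ScopedDivert(std::string name, const std::shared_ptr<Demux>& demux, int& dangling)
        : name_(std::move(name)), demux_(demux), dangling_(dangling) {
        demux->rules.push_back(name_);
    }
    ~ScopedDivert() {
        if (auto demux = demux_.lock()) {
            auto& r = demux->rules;
            r.erase(std::remove(r.begin(), r.end(), name_), r.end());
        } else {
            ++dangling_;  // the real code would be touching a destroyed Demux here
        }
    }
private:
    std::string name_;
    std::weak_ptr<Demux> demux_;
    int& dangling_;
};

// Models the reference-counted Subscription handle; an empty one is "null".
using Subscription = std::shared_ptr<ScopedDivert>;

Subscription subscribe(Session& s, const std::string& queue, int& dangling) {
    return std::make_shared<ScopedDivert>(queue, s.demuxer, dangling);
}

// Returns how many diverts were cleaned up after their Demux had been destroyed.
int recover(bool resetFirst) {
    int dangling = 0;
    std::unique_ptr<Session> session(new Session());
    Subscription sub = subscribe(*session, "my-queue", dangling);

    if (resetFirst) {
        sub = Subscription();  // workaround: release while the old Session still exists
    }
    session.reset(new Session());                      // old Session and its Demux go away
    sub = subscribe(*session, "my-queue", dangling);   // old subscription (if any) released here
    return dangling;
}
```

In this model recover(false) reports one cleanup against a dead Demux, and recover(true) reports none, which is the effect the 'null' assignment is meant to have in the real client.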

For an actual fix, perhaps a destructor in SubscriptionManagerImpl that calls cancelDiversion() on all its Subscription instances would suffice(?).
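A rough sketch of that idea, again as a standalone model rather than a patch against the real sources. The class names mirror the ones in this thread, but the bodies, the weak_ptr bookkeeping and the diverted() helper are assumptions made for illustration, and it presumes the old SubscriptionManager is destroyed before the old Session.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the Demux that lives inside the Session.
struct Demux {
    std::set<std::string> rules;
};

class SubscriptionImpl {
public:
    SubscriptionImpl(std::string name, Demux& demux)
        : name_(std::move(name)), demux_(&demux) {
        demux.rules.insert(name_);
    }
    ~SubscriptionImpl() { cancelDiversion(); }

    // Detach from the Demux; safe to call more than once.
    void cancelDiversion() {
        if (demux_) {
            demux_->rules.erase(name_);
            demux_ = nullptr;
        }
    }
    bool diverted() const { return demux_ != nullptr; }

private:
    std::string name_;
    Demux* demux_;
};

using Subscription = std::shared_ptr<SubscriptionImpl>;

class SubscriptionManagerImpl {
public:
    explicit SubscriptionManagerImpl(Demux& demux) : demux_(demux) {}

    // The proposed fix: detach every subscription that is still alive.
    ~SubscriptionManagerImpl() {
        for (auto& weak : subscriptions_) {
            if (auto s = weak.lock()) s->cancelDiversion();
        }
    }

    Subscription subscribe(const std::string& queue) {
        auto s = std::make_shared<SubscriptionImpl>(queue, demux_);
        subscriptions_.push_back(s);
        return s;
    }

private:
    Demux& demux_;
    std::vector<std::weak_ptr<SubscriptionImpl>> subscriptions_;
};

// Recovery without the 'null' assignment; true if nothing touched a dead Demux.
bool survivesNaiveRecovery() {
    std::unique_ptr<Demux> demux(new Demux());
    std::unique_ptr<SubscriptionManagerImpl> mgr(new SubscriptionManagerImpl(*demux));
    Subscription sub = mgr->subscribe("my-queue");

    mgr.reset();                       // destructor cancels the diversion
    bool detached = !sub->diverted();  // old handle no longer points at the Demux
    demux.reset();                     // old "Session" is gone

    demux.reset(new Demux());
    mgr.reset(new SubscriptionManagerImpl(*demux));
    sub = mgr->subscribe("my-queue");  // old impl destroyed with nothing left to clean up
    return detached && sub->diverted() && demux->rules.count("my-queue") == 1;
}
```

With the diversion cancelled in the manager's destructor, reassigning over the stale Subscription no longer reaches into the destroyed Demux, at least in this model.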

I was wondering if you were aware of this sort of issue, and if so, if
there were plans to resolve it or ideas on how to resolve it.

I wasn't aware of this specific issue. We've been encouraging people to use the newer messaging API instead of this older client API. The messaging API offers a cleaner, higher level abstraction that makes migration to newer versions of the protocol simpler and also makes it simpler to provide richer functionality behind the API (such as auto-reconnect).


---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:[email protected]
