Håkan Johansson created QPID-7051:
-------------------------------------
Summary: Crash after reconnect with transactional session (with patch)
Key: QPID-7051
URL: https://issues.apache.org/jira/browse/QPID-7051
Project: Qpid
Issue Type: Bug
Components: C++ Client
Affects Versions: qpid-cpp-0.34
Environment: Red Hat Enterprise Linux Server release 6.7 (Santiago)
The broker is ActiveMQ 5.13.0.
The protocol used is AMQP 1.0.
Reporter: Håkan Johansson
I have a test program (see the "consumer.cc" attachment) that creates a
connection with "reconnect" enabled.
It then creates a transactional session and a receiver to some queue from that
session.
It then reads all messages from the queue and prints out their content.
A sleep between reads leaves enough time to restart the broker during the test.
While the broker is down, the program tries to reconnect to it.
As soon as the reconnect succeeds, the fetch call throws an exception because the
transaction has become invalid.
The exception is caught and the read loop exits.
The test function then exits, causing the _Receiver_, _Session_, and
_Connection_ objects to be destructed.
The crash happens while destructing the _Connection_ object.
It took some digging, but I managed to find the reason for the crash.
When the _Connection_ object is destructed it automatically destructs its
_ConnectionHandle_ object, which in turn destructs its _ConnectionContext_
object. Nothing strange here.
The _ConnectionContext_ destructor makes a call to its own _close_ method,
which tries to shut down all its sessions.
The problem is that the session has been made invalid by the disconnect, which
causes the call to _syncLH_ to throw an exception. The exception is not caught
anywhere, so it indirectly escapes from the _ConnectionContext_ destructor.
This is a big no-no in C++.
A side effect of this is that the transport object is not closed before it is
destructed, which means that it is still listening for events. The crash happens
when the next pending event tries to use the destructed transport object.
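The failure mode described above can be illustrated with a minimal, self-contained sketch. It does not use the Qpid sources: _SketchSession_, _SketchContext_ and _closeThrows_ are hypothetical stand-ins, and the sketch assumes only that the real _close_ method syncs each session before closing the transport. It shows how one throwing sync skips all the cleanup that follows it.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a session whose sync fails after a disconnect.
struct SketchSession {
    bool valid;
    bool closed;
    explicit SketchSession(bool v) : valid(v), closed(false) {}
    void sync() {
        if (!valid) throw std::runtime_error("session invalidated by disconnect");
    }
};

// Hypothetical stand-in for the connection context.
struct SketchContext {
    std::vector<SketchSession> sessions;
    bool transportClosed;
    SketchContext() : transportClosed(false) {}

    // Unguarded close: the first failing sync() aborts the whole cleanup.
    void close() {
        for (std::size_t i = 0; i < sessions.size(); ++i) {
            sessions[i].sync();       // may throw
            sessions[i].closed = true;
        }
        transportClosed = true;       // never reached if sync() threw
    }
};

// Returns true if close() let an exception escape.
bool closeThrows(SketchContext& ctx) {
    try {
        ctx.close();
    } catch (const std::exception&) {
        return true;
    }
    return false;
}
```

With one invalid session in the list, _closeThrows_ returns true and _transportClosed_ stays false. If the same call were made from a destructor, the exception would escape the destructor; since C++11 destructors are implicitly noexcept, so that would end in std::terminate instead.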
The solution, in my humble opinion, is to catch the exception thrown by the
_syncLH_ call in the _ConnectionContext::close_ method.
This way we can try to close all sessions even if one or more of them are
invalidated for some reason.
The rest of the cleanup process will also be done properly.
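A minimal sketch of the proposed approach follows. This is not the attached patch: _GuardedSession_ and _GuardedContext_ are hypothetical stand-ins, and counting failures in _failedSyncs_ is an assumption made for illustration (real code would probably log instead). The idea shown is only that each sync is wrapped in its own try/catch, so that the remaining sessions and the transport are still closed.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a session whose sync fails after a disconnect.
struct GuardedSession {
    bool valid;
    bool closed;
    explicit GuardedSession(bool v) : valid(v), closed(false) {}
    void sync() {
        if (!valid) throw std::runtime_error("session invalidated by disconnect");
    }
};

// Hypothetical stand-in for the connection context, with a guarded close().
struct GuardedContext {
    std::vector<GuardedSession> sessions;
    bool transportClosed;
    std::size_t failedSyncs;   // assumption: a counter instead of logging
    GuardedContext() : transportClosed(false), failedSyncs(0) {}

    void close() {
        if (transportClosed) return;   // already closed, nothing to do
        for (std::size_t i = 0; i < sessions.size(); ++i) {
            try {
                sessions[i].sync();
            } catch (const std::exception&) {
                ++failedSyncs;         // swallow the error and keep cleaning up
            }
            sessions[i].closed = true; // assumption: drop the session either way
        }
        transportClosed = true;        // always reached
    }

    // close() no longer lets exceptions from sync() escape, so the destructor is safe.
    ~GuardedContext() { close(); }
};
```

With one invalid and one valid session, _close_ returns normally, both sessions are marked closed, and the transport is closed before the object is destructed.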
How to run the test program:
* Compile both "producer.cc" and "consumer.cc". They both need to be linked to
the "qpidmessaging" library.
* Run "producer" once. This will add ten messages to the "apa.bepa" queue on
the broker.
* Start "consumer".
* When the consumer starts to print out the messages, shut down and restart the
broker.