[ 
https://issues.apache.org/jira/browse/QPID-8056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16285701#comment-16285701
 ] 

Håkan Johansson commented on QPID-8056:
---------------------------------------

I have done some more digging.
The issue that the patch does not fix is that the poller thread can call 
callbacks on {{TcpTransport}} objects after they have been deleted.
This can happen when there is more than one connection. In that case the 
poller thread is not deleted when the first connection is closed, but the 
callbacks of the closed connection are still registered.

My patch only fixes the case where there is only a single connection.
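
To illustrate the multi-connection case, here is a minimal sketch of one possible remedy: each transport removes its own callback from the shared poller before it is destroyed. This is not the Qpid code and not the attached patch. {{Poller}}, {{Transport}}, {{simulate}} and all member names are hypothetical stand-ins, and the dispatch is called directly instead of from a real poller thread.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a poller shared by several connections.
class Poller {
public:
    using Callback = std::function<void()>;

    // Register a callback and return a handle used to remove it later.
    int add(Callback cb) {
        std::lock_guard<std::mutex> lock(mutex_);
        int id = nextId_++;
        callbacks_[id] = std::move(cb);
        return id;
    }

    // Remove a registration. Blocks while dispatch() holds the lock,
    // so it cannot return in the middle of a running callback.
    void remove(int id) {
        std::lock_guard<std::mutex> lock(mutex_);
        callbacks_.erase(id);
    }

    // Fire every callback that is still registered.
    // Assumption: callbacks never call add() or remove() themselves,
    // otherwise this simple locking scheme would deadlock.
    void dispatch() {
        std::lock_guard<std::mutex> lock(mutex_);
        for (auto& entry : callbacks_) entry.second();
    }

private:
    std::mutex mutex_;
    std::map<int, Callback> callbacks_;
    int nextId_ = 0;
};

// Hypothetical stand-in for a transport that owns one poller registration.
class Transport {
public:
    Transport(Poller& poller, std::vector<std::string>& log, std::string name)
        : poller_(poller), log_(log), name_(std::move(name)),
          id_(poller_.add([this] { disconnected(); })) {}

    // Deregister first, so the poller can no longer reach this object.
    ~Transport() { poller_.remove(id_); }

    Transport(const Transport&) = delete;
    Transport& operator=(const Transport&) = delete;

private:
    void disconnected() { log_.push_back(name_ + " disconnected"); }

    Poller& poller_;
    std::vector<std::string>& log_;
    std::string name_;
    int id_;
};

// Two transports share one poller; the first is destroyed before dispatch.
std::vector<std::string> simulate() {
    Poller poller;
    std::vector<std::string> log;
    auto first = std::make_unique<Transport>(poller, log, "first");
    auto second = std::make_unique<Transport>(poller, log, "second");
    first.reset();      // close the first connection; the poller lives on
    poller.dispatch();  // only the surviving transport is called
    return log;
}
```

Calling {{simulate()}} should return a log with a single entry for the second transport. Without the {{remove}} call in the destructor, the dispatch would invoke a callback that captured a pointer to the already deleted first transport, which is the situation described above.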

> qpid::messaging::ConnectionContext crash after network disconnect (with patch)
> ------------------------------------------------------------------------------
>
>                 Key: QPID-8056
>                 URL: https://issues.apache.org/jira/browse/QPID-8056
>             Project: Qpid
>          Issue Type: Bug
>          Components: C++ Client
>    Affects Versions: qpid-cpp-1.36.0
>         Environment: RedHat Enterprise Linux 6
>            Reporter: Håkan Johansson
>         Attachments: connection_context.diff, valgrind.txt
>
>
> When doing HA testing we found that our application often crashed inside the 
> Qpid Messaging library.
> Our test:
> * One ActiveMQ broker.
> * Two proxies connecting to the AMQP port on the broker. At the start, only 
> one of the proxies is running.
> * Test program configured to use failover between the two proxies. Protocol 
> is "amqp1.0". It reads messages in a loop using a transactional session. On 
> error it closes the connection and opens a new one.
> * Send some messages and let the test program process them.
> * Stop proxy1 and start proxy2.
> * Send some more messages and let the test program process them.
> * Stop proxy2 and start proxy1.
> * And so on...
> After a couple of switches the test program crashes, but not always. The 
> crash is timing-dependent.
> A typical error message that we see before the crash:
> {noformat}
> Exception when trying to close the qpid connection: Transaction outcome 
> unknown: transport failure
> {noformat}
> The reason for the crash is that the poller thread is still active when the 
> connection is being deleted. The destructor of the 
> {{qpid::messaging::ConnectionContext}} class deletes the {{TcpTransport}} 
> instance at the same time as, or right before, the poller thread is calling a 
> callback on it ({{qpid::messaging::amqp::TcpTransport::disconnected}}).
> I have attached a patch to solve the issue, at least for this use case.
> I cannot test this on {{1.37.0}} because I cannot build that version on 
> RHEL6: RHEL6 uses Python 2.6, which is no longer supported in {{1.37.0}}. 
> The code in question is identical in {{1.36.0}} and {{1.37.0}} though.


