[ https://issues.apache.org/jira/browse/AMQCPP-753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Katherine Pully updated AMQCPP-753:
-----------------------------------
    Description: 
When an ActiveMQConnection fails before an ActiveMQSessionKernel can deliver a 
message acknowledgement to the broker, only decaf-type exceptions are caught. 
However, ActiveMQSessionKernel::acknowledge can indirectly throw a CMS-type 
exception, which means that the consumer read lock may not get released. This 
will result in a deadlock when the session tries to acquire the consumer write 
lock (for example, when cleaning up the ActiveMQConnection).
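
To make the failure mode concrete, here is a minimal, single-threaded sketch of the pattern described above. It is not the ActiveMQ-CPP source: ToyReadWriteLock, DecafLikeException, CmsLikeException, sendAck and acknowledgeLeaky are hypothetical stand-ins, and the toy lock only counts readers instead of blocking. It illustrates that an exception from a family the handler does not name skips the unlock, so a later writer can never get in.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-ins for two unrelated exception families.
struct DecafLikeException : std::runtime_error {
    using std::runtime_error::runtime_error;
};
struct CmsLikeException : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Toy model of a read/write lock: it only counts readers and never blocks.
struct ToyReadWriteLock {
    int readers = 0;
    void lockRead() { ++readers; }
    void unlockRead() { assert(readers > 0); --readers; }
    // A writer can only proceed when no reader still holds the lock.
    bool writeLockAvailable() const { return readers == 0; }
};

// Assumed behaviour: a failed connection surfaces as the CMS-like type.
void sendAck(bool connectionFailed) {
    if (connectionFailed) {
        throw CmsLikeException("connection failed");
    }
}

// Manual lock/unlock where only one exception family is handled.
void acknowledgeLeaky(ToyReadWriteLock& lock, bool connectionFailed) {
    lock.lockRead();
    try {
        sendAck(connectionFailed);
        lock.unlockRead();
    } catch (const DecafLikeException&) {
        lock.unlockRead();
        throw;
    }
    // A CmsLikeException bypasses both unlockRead() calls above.
}

// Reports whether a writer could still acquire the lock after an ack attempt.
bool writeLockAvailableAfterAck(ToyReadWriteLock& lock, bool connectionFailed) {
    try {
        acknowledgeLeaky(lock, connectionFailed);
    } catch (const std::exception&) {
        // Swallowed here only so the lock state can be inspected afterwards.
    }
    return lock.writeLockAvailable();
}
```

In this model, calling writeLockAvailableAfterAck with connectionFailed set to true leaves one reader registered, which corresponds to the dispose() call in the real client waiting forever for the write lock.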

I have attached a stack trace from such a deadlock, which occurs when the 
ActiveMQConnection gets cleaned up. The relevant portion (edited for brevity 
and clarity; the attachment contains the original stack) is:
{code:java}
0  decaf::util::concurrent::ExecutorKernel::Worker::run() ThreadPoolExecutor.cpp:184
1  decaf::util::concurrent::ExecutorKernel::runWorker(decaf::util::concurrent::ExecutorKernel::Worker*) ThreadPoolExecutor.cpp:738
2  activemq::core::OnExceptionRunnable::run() ActiveMQConnection.cpp:439
3  activemq::core::ActiveMQConnection::cleanup() ActiveMQConnection.cpp:839
4  activemq::core::kernels::ActiveMQSessionKernel::dispose() ActiveMQSessionKernel.cpp:371
5  decaf::util::concurrent::locks::AbstractQueuedSynchronizer::acquire(int) AbstractQueuedSynchronizer.cpp:1565
6  decaf::util::concurrent::locks::SynchronizerState::acquireQueued((anonymous namespace)::Node*, int) AbstractQueuedSynchronizer.cpp:711
7  decaf::util::concurrent::locks::LockSupport::park() LockSupport.cpp:54
8  decaf::internal::util::concurrent::Threading::park(decaf::lang::Thread*) Threading.cpp:1345
9  decaf::internal::util::concurrent::PlatformThread::interruptibleWaitOnCondition(_opaque_pthread_cond_t*, _opaque_pthread_mutex_t*, decaf::internal::util::concurrent::CompletionCondition&) PlatformThread.cpp:210
10 _pthread_cond_wait
11 __psynch_cvwait
{code}
This issue can be reproduced by using a client-acknowledge strategy and adding 
a substantial (10+ seconds) call to sleep before acknowledging the message. If 
the connection fails during that sleep, a CMS exception will be thrown when the 
thread handling onMessage wakes up and tries to acknowledge the message.

The exception is originally thrown by 
[ActiveMQConnection::checkClosedOrFailed|#L1329]. These exceptions become 
ActiveMQ exceptions [here|#L1257], and then CMS Exceptions [here|#L1426]. The 
only exceptions caught in 
[ActiveMQSessionKernel::acknowledge|https://github.com/apache/activemq-cpp/blob/master/activemq-cpp/src/main/activemq/core/kernels/ActiveMQSessionKernel.cpp#L508]
are 
[decaf exceptions|https://github.com/apache/activemq-cpp/blob/master/activemq-cpp/src/main/activemq/core/kernels/ActiveMQSessionKernel.cpp#L518];
when a CMS exception is thrown, the 
[consumer read lock|https://github.com/apache/activemq-cpp/blob/master/activemq-cpp/src/main/activemq/core/kernels/ActiveMQSessionKernel.cpp#L510]
is not released.

This issue can be fixed by catching the CMS exceptions in 
ActiveMQSessionKernel::acknowledge.
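
As a rough illustration of the proposed fix, the sketch below shows two ways the read lock could be released on every path. It is a toy model, not a patch against ActiveMQ-CPP: ToyReadWriteLock, DecafLikeException, CmsLikeException, sendAck, ReadGuard and both acknowledge variants are hypothetical names. The first variant adds a handler for the second exception family, as suggested above; the second is an alternative that uses a scope guard so that the unlock does not depend on which exception type is thrown.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-ins for two unrelated exception families.
struct DecafLikeException : std::runtime_error {
    using std::runtime_error::runtime_error;
};
struct CmsLikeException : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Toy model of a read/write lock: it only counts readers and never blocks.
struct ToyReadWriteLock {
    int readers = 0;
    void lockRead() { ++readers; }
    void unlockRead() { assert(readers > 0); --readers; }
};

// Assumed behaviour: a failed connection surfaces as the CMS-like type.
void sendAck(bool connectionFailed) {
    if (connectionFailed) {
        throw CmsLikeException("connection failed");
    }
}

// Variant 1: handle the second exception family explicitly.
void acknowledgeWithExtraCatch(ToyReadWriteLock& lock, bool connectionFailed) {
    lock.lockRead();
    try {
        sendAck(connectionFailed);
        lock.unlockRead();
    } catch (const DecafLikeException&) {
        lock.unlockRead();
        throw;
    } catch (const CmsLikeException&) {
        lock.unlockRead();
        throw;
    }
}

// Variant 2: a scope guard releases the lock during stack unwinding,
// whatever the exception type is.
class ReadGuard {
public:
    explicit ReadGuard(ToyReadWriteLock& lock) : lock_(lock) { lock_.lockRead(); }
    ~ReadGuard() { lock_.unlockRead(); }
    ReadGuard(const ReadGuard&) = delete;
    ReadGuard& operator=(const ReadGuard&) = delete;
private:
    ToyReadWriteLock& lock_;
};

void acknowledgeWithGuard(ToyReadWriteLock& lock, bool connectionFailed) {
    ReadGuard guard(lock);
    sendAck(connectionFailed);
}

// Runs a failing acknowledge and reports how many readers are still registered.
int readersLeftAfterFailedAck(void (*ack)(ToyReadWriteLock&, bool)) {
    ToyReadWriteLock lock;
    try {
        ack(lock, true);
    } catch (const std::exception&) {
        // Swallowed here only so the lock state can be inspected afterwards.
    }
    return lock.readers;
}
```

Both variants leave no reader registered after a failed acknowledge. The guard version has the advantage that it keeps working if yet another exception type is introduced later.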

 

  was:
When an ActiveMQConnection fails before an ActiveMQSessionKernel can deliver a 
message acknowledgement to the broker, only decaf-type exceptions are caught. 
However, ActiveMQSessionKernel::acknowledge can indirectly throw a CMS-type 
exception, which means that the consumer read lock may not get released. This 
will result in a deadlock when the session tries to acquire the consumer write 
lock (for example, when cleaning up the ActiveMQConnection).

I have attached a stack trace from such a deadlock, which occurs when the 
ActiveMQConnection gets cleaned up. The relevant portion (edited for brevity 
and clarity, though the attached is the original stack), is:
{code:java}
0  decaf::util::concurrent::ExecutorKernel::Worker::run() ThreadPoolExecutor.cpp:184
1  decaf::util::concurrent::ExecutorKernel::runWorker(decaf::util::concurrent::ExecutorKernel::Worker*) ThreadPoolExecutor.cpp:738
2  activemq::core::OnExceptionRunnable::run() ActiveMQConnection.cpp:439
3  activemq::core::ActiveMQConnection::cleanup() ActiveMQConnection.cpp:839
4  activemq::core::kernels::ActiveMQSessionKernel::dispose() ActiveMQSessionKernel.cpp:371
5  decaf::util::concurrent::locks::AbstractQueuedSynchronizer::acquire(int) AbstractQueuedSynchronizer.cpp:1565
6  decaf::util::concurrent::locks::SynchronizerState::acquireQueued((anonymous namespace)::Node*, int) AbstractQueuedSynchronizer.cpp:711
7  decaf::util::concurrent::locks::LockSupport::park() LockSupport.cpp:54
8  decaf::internal::util::concurrent::Threading::park(decaf::lang::Thread*) Threading.cpp:1345
9  decaf::internal::util::concurrent::PlatformThread::interruptibleWaitOnCondition(_opaque_pthread_cond_t*, _opaque_pthread_mutex_t*, decaf::internal::util::concurrent::CompletionCondition&) PlatformThread.cpp:210
10 _pthread_cond_wait
11 __psynch_cvwait
{code}
This issue can be reproduced by using a client-acknowledge strategy and adding 
a substantial (10+ seconds) call to sleep before acknowledging the message. If 
the connection fails during that sleep, a CMS exception will be thrown when the 
thread handling onMessage wakes up and tries to acknowledge the message.

The exception is originally thrown by 
[ActiveMQConnection::checkClosedOrFailed|#L1329]. These exceptions become 
ActiveMQ exceptions [here|#L1257], and then CMS Exceptions [here|#L1426]. The 
only exceptions caught in 
[ActiveMQSessionKernel::acknowledge|https://github.com/apache/activemq-cpp/blob/master/activemq-cpp/src/main/activemq/core/kernels/ActiveMQSessionKernel.cpp#L508]
are 
[decaf exceptions|https://github.com/apache/activemq-cpp/blob/master/activemq-cpp/src/main/activemq/core/kernels/ActiveMQSessionKernel.cpp#L518];
when a CMS exception is caught, the 
[consumer read lock|https://github.com/apache/activemq-cpp/blob/master/activemq-cpp/src/main/activemq/core/kernels/ActiveMQSessionKernel.cpp#L510]
is not released.

This issue can be fixed by catching the CMS exceptions in 
ActiveMQSessionKernel::acknowledge.

 


> Deadlock when ActiveMQConnection fails before ActiveMQSessionKernel can 
> deliver message acknowledgement
> ------------------------------------------------------------------------------------------------------
>
>                 Key: AMQCPP-753
>                 URL: https://issues.apache.org/jira/browse/AMQCPP-753
>             Project: ActiveMQ C++ Client
>          Issue Type: Bug
>          Components: Decaf
>    Affects Versions: 3.9.5
>         Environment: Unix/Linux
>            Reporter: Katherine Pully
>            Assignee: Timothy A. Bish
>            Priority: Major
>         Attachments: recreated_hang.txt



--
This message was sent by Atlassian Jira
(v8.20.10#820010)