[
https://issues.apache.org/jira/browse/AMQCPP-753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Katherine Pully updated AMQCPP-753:
-----------------------------------
Summary: Deadlock when ActiveMQConnection fails before ActiveMQSessionKernel
can deliver message acknowledgement (was: Deadlock when ActiveMQConnection
fails before message acknowledgement can be delivered)
> Deadlock when ActiveMQConnection fails before ActiveMQSessionKernel can
> deliver message acknowledgement
> ------------------------------------------------------------------------------------------------------
>
> Key: AMQCPP-753
> URL: https://issues.apache.org/jira/browse/AMQCPP-753
> Project: ActiveMQ C++ Client
> Issue Type: Bug
> Components: Decaf
> Affects Versions: 3.9.5
> Environment: Unix/Linux
> Reporter: Katherine Pully
> Assignee: Timothy A. Bish
> Priority: Major
> Attachments: recreated_hang.txt
>
>
> When an ActiveMQConnection fails before an ActiveMQSessionKernel can deliver
> a message acknowledgement to the broker, ActiveMQSessionKernel::acknowledge
> catches only decaf-type exceptions. However, it can indirectly throw a
> CMS-type exception, in which case the consumer read lock is not released.
> This results in a deadlock when the session later tries to acquire the
> consumer write lock (for example, when cleaning up the
> ActiveMQConnection).
> I have attached a stack trace from such a deadlock, which occurs when the
> ActiveMQConnection gets cleaned up. The relevant portion (edited for brevity
> and clarity, though the attached is the original stack), is:
> {code:java}
> 0 decaf::util::concurrent::ExecutorKernel::Worker::run() ThreadPoolExecutor.cpp:184
> 1 decaf::util::concurrent::ExecutorKernel::runWorker(decaf::util::concurrent::ExecutorKernel::Worker*) ThreadPoolExecutor.cpp:738
> 2 activemq::core::OnExceptionRunnable::run() ActiveMQConnection.cpp:439
> 3 activemq::core::ActiveMQConnection::cleanup() ActiveMQConnection.cpp:839
> 4 activemq::core::kernels::ActiveMQSessionKernel::dispose() ActiveMQSessionKernel.cpp:371
> 5 decaf::util::concurrent::locks::AbstractQueuedSynchronizer::acquire(int) AbstractQueuedSynchronizer.cpp:1565
> 6 decaf::util::concurrent::locks::SynchronizerState::acquireQueued((anonymous namespace)::Node*, int) AbstractQueuedSynchronizer.cpp:711
> 7 decaf::util::concurrent::locks::LockSupport::park() LockSupport.cpp:54
> 8 decaf::internal::util::concurrent::Threading::park(decaf::lang::Thread*) Threading.cpp:1345
> 9 decaf::internal::util::concurrent::PlatformThread::interruptibleWaitOnCondition(_opaque_pthread_cond_t*, _opaque_pthread_mutex_t*, decaf::internal::util::concurrent::CompletionCondition&) PlatformThread.cpp:210
> 10 _pthread_cond_wait
> 11 __psynch_cvwait
> {code}
> This issue can be reproduced by using a client-acknowledge strategy and
> adding a substantial (10+ seconds) call to sleep before acknowledging the
> message, and then breaking the connection during that sleep.
> The exception is originally thrown by
> [ActiveMQConnection::checkClosedOrFailed|#L1329]. These exceptions are
> converted to ActiveMQ exceptions [here|#L1257], and then to CMS exceptions
> [here|#L1426]. The only exceptions caught in
> [ActiveMQSessionKernel::acknowledge|https://github.com/apache/activemq-cpp/blob/master/activemq-cpp/src/main/activemq/core/kernels/ActiveMQSessionKernel.cpp#L508]
> are [decaf
> exceptions|https://github.com/apache/activemq-cpp/blob/master/activemq-cpp/src/main/activemq/core/kernels/ActiveMQSessionKernel.cpp#L518];
> when a CMS exception is thrown instead, the [consumer read
> lock|https://github.com/apache/activemq-cpp/blob/master/activemq-cpp/src/main/activemq/core/kernels/ActiveMQSessionKernel.cpp#L510]
> is not released.
> This issue can be fixed by catching the CMS exceptions in
> ActiveMQSessionKernel::acknowledge.
>
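The failure mode described in the report can be sketched in isolation. This is a simplified model only, not the ActiveMQ-CPP source: `ToyReadWriteLock`, `DecafStyleError`, `CmsStyleError`, `sendAck`, and both `acknowledge*` functions are hypothetical stand-ins, and the manual lock/unlock layout is an assumption about the general pattern rather than a copy of ActiveMQSessionKernel::acknowledge. The toy lock only counts readers, so the "deadlock" shows up as a writer that could never proceed.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-ins for the two unrelated exception families.
struct DecafStyleError : std::runtime_error { using std::runtime_error::runtime_error; };
struct CmsStyleError : std::runtime_error { using std::runtime_error::runtime_error; };

// Toy single-threaded model of the consumer read/write lock: it only counts readers.
class ToyReadWriteLock {
public:
    void lockRead() { ++readers; }
    void unlockRead() { assert(readers > 0); --readers; }
    // A writer (e.g. session disposal) can proceed only when no reader holds the lock.
    bool writerCouldAcquire() const { return readers == 0; }
private:
    int readers = 0;
};

// Simulates delivering an ack over a connection that has already failed.
void sendAck() { throw CmsStyleError("connection failed"); }

// Reported pattern: only one exception family releases the read lock.
void acknowledgeLeaky(ToyReadWriteLock& lock) {
    lock.lockRead();
    try {
        sendAck();
    } catch (const DecafStyleError&) {
        lock.unlockRead();
        throw;
    }
    lock.unlockRead();  // skipped when a CmsStyleError propagates
}

// Proposed fix: also release the read lock when the CMS-style exception is thrown.
void acknowledgeFixed(ToyReadWriteLock& lock) {
    lock.lockRead();
    try {
        sendAck();
    } catch (const DecafStyleError&) {
        lock.unlockRead();
        throw;
    } catch (const CmsStyleError&) {
        lock.unlockRead();
        throw;
    }
    lock.unlockRead();
}

// Runs one variant and reports whether a writer could take the lock afterwards.
template <typename Fn>
bool writerCanProceedAfter(Fn acknowledge) {
    ToyReadWriteLock lock;
    try {
        acknowledge(lock);
    } catch (const std::runtime_error&) {
        // The caller sees the failure either way; only the lock state differs.
    }
    return lock.writerCouldAcquire();
}
```

Calling `writerCanProceedAfter(acknowledgeLeaky)` leaves a reader registered, which corresponds to the parked dispose() frame in the stack trace, while `writerCanProceedAfter(acknowledgeFixed)` leaves the lock free. A scoped guard that releases the read lock in its destructor would cover every exception type at once; the sketch sticks to the extra catch handler because that is the fix the report proposes.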
--
This message was sent by Atlassian Jira
(v8.20.10#820010)