Chris A. Evans created AMQCPP-685:
-------------------------------------
Summary: Recurring Segfault bubbles up through
ActiveMQProducer::send(); possibly related to FailoverTransport
Key: AMQCPP-685
URL: https://issues.apache.org/jira/browse/AMQCPP-685
Project: ActiveMQ C++ Client
Issue Type: Bug
Components: CMS Impl
Affects Versions: 3.9.4
Environment: * Virtual Machine on VMware ESXi 6.7
* RHEL 7.9
* Kernel 3.10.0-1160.36.2.el7.x86_64
* ActiveMQ-CPP 3.9.4 manually compiled 32-bit from source
Reporter: Chris A. Evans
Assignee: Timothy A. Bish
We have encountered a regularly occurring issue when using activemq-cpp
3.9.4. Our application acts as a producer to a message queue. We connect
to an ActiveMQ 5.16.0 broker using the failover transport.
We have not been able to rule out other configurations as a factor because,
although the crash occurs regularly, we have not yet found a pattern that
lets us reproduce it reliably.
I am a bit out of my element here, so let me know if any additional context or
supporting material is needed here to assist.
The following internal code of ours invokes
{{activemq::core::ActiveMQProducer::send()}}:
bool AMQueue::send(Message* message) {
    try {
        if (_producer) {
            _producer->send(message);
            return true;
        }
    } catch (CMSException& e) {
        log4cxx::LoggerPtr logger = log4cxx::Logger::getLogger("ActiveMQ");
        LOG4CXX_ERROR(logger, e.getMessage());
    }
    return false;
}
This code works just fine 99+% of the time. Eventually, however, our
application will segfault. Since the failure originates inside decaf (see the
trace below), our CMSException catch doesn't stop it.
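To illustrate why a handler for one exception type does not help here, the following is a minimal sketch using stand-in types. FakeCmsException, FakeLibraryException, SendResult, and guardedSend are hypothetical names for illustration only and are not part of activemq-cpp. It shows that a typed handler does not intercept an unrelated exception type, while a broader std::exception handler does.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <utility>

// Hypothetical stand-in for the exception type the wrapper already catches.
struct FakeCmsException : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical stand-in for an unrelated exception raised inside the library.
struct FakeLibraryException : std::logic_error {
    using std::logic_error::logic_error;
};

enum class SendResult { Ok, CmsError, OtherError };

// Runs sendFn and reports which handler, if any, intercepted an exception.
template <typename SendFn>
SendResult guardedSend(SendFn&& sendFn) {
    try {
        std::forward<SendFn>(sendFn)();
        return SendResult::Ok;
    } catch (const FakeCmsException&) {
        // Only this type (and types derived from it) lands here.
        return SendResult::CmsError;
    } catch (const std::exception&) {
        // Broader fallback for anything else derived from std::exception.
        return SendResult::OtherError;
    }
}
```

Neither handler can intercept a signal such as SIGSEGV, which is what the core dump shows, so a broader catch would at best cover the cases where an exception actually propagates.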
The following stack trace is always present in the core dump:
{code:java}
(gdb) bt
#0 0xf76fd430 in __kernel_vsyscall ()
#1 0xf1ace257 in raise () from /lib/libc.so.6
#2 0xf49ed113 in ?? ()
#3 0xf5337645 in ?? ()
#4 0xf49ed9f6 in ?? ()
#5 <signal handler called>
#6 0xf1a8328b in apr_pvsprintf () from /usr/local/apr/lib/libapr-1.so.0
#7 0xf7267679 in decaf::lang::Exception::buildMessage(char const*, char*&) () from /usr/local/lib/libactivemq-cpp.so.19
#8 0xf72c21c5 in decaf::util::NoSuchElementException::NoSuchElementException(char const*, int, char const*, ...) () from /usr/local/lib/libactivemq-cpp.so.19
#9 0xf70d2512 in decaf::util::HashMap<unsigned int, decaf::lang::Pointer<activemq::transport::FutureResponse, decaf::util::concurrent::atomic::AtomicRefCounter>, decaf::util::HashCode<unsigned int> >::remove(unsigned int const&) () from /usr/local/lib/libactivemq-cpp.so.19
#10 0xf70cfbc0 in (anonymous namespace)::ResponseFinalizer::~ResponseFinalizer() () from /usr/local/lib/libactivemq-cpp.so.19
#11 0xf70d0d69 in activemq::transport::correlator::ResponseCorrelator::request(decaf::lang::Pointer<activemq::commands::Command, decaf::util::concurrent::atomic::AtomicRefCounter>) () from /usr/local/lib/libactivemq-cpp.so.19
#12 0xf6eb374e in activemq::core::ActiveMQConnection::syncRequest(decaf::lang::Pointer<activemq::commands::Command, decaf::util::concurrent::atomic::AtomicRefCounter>, unsigned int) () from /usr/local/lib/libactivemq-cpp.so.19
#13 0xf6eb3c8c in activemq::core::ActiveMQConnection::asyncRequest(decaf::lang::Pointer<activemq::commands::Command, decaf::util::concurrent::atomic::AtomicRefCounter>, cms::AsyncCallback*) () from /usr/local/lib/libactivemq-cpp.so.19
#14 0xf6ff2d7b in activemq::core::kernels::ActiveMQSessionKernel::send(activemq::core::kernels::ActiveMQProducerKernel*, decaf::lang::Pointer<activemq::commands::ActiveMQDestination, decaf::util::concurrent::atomic::AtomicRefCounter>, cms::Message*, int, int, long long, activemq::util::MemoryUsage*, long long, cms::AsyncCallback*) () from /usr/local/lib/libactivemq-cpp.so.19
#15 0xf6fd733d in activemq::core::kernels::ActiveMQProducerKernel::send(cms::Destination const*, cms::Message*, int, int, long long, cms::AsyncCallback*) () from /usr/local/lib/libactivemq-cpp.so.19
#16 0xf6fcfe54 in activemq::core::kernels::ActiveMQProducerKernel::send(cms::Message*) () from /usr/local/lib/libactivemq-cpp.so.19
#17 0xf6f3c186 in activemq::core::ActiveMQProducer::send(cms::Message*) () from /usr/local/lib/libactivemq-cpp.so.19
{code}
Our only theory at this time is that the crash may occur only after a very
high number of messages has been produced, on the order of 8+ million. This
type of crash has never happened early in the lifetime of an application run.
The title of this issue suggests a possible relation to the FailoverTransport.
I came to this conclusion because we haven't had this issue during the entire
lifetime of our app – it is a recent-ish phenomenon, which loosely matches up
with us switching from the tcp:// URI syntax to failover://.
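For context, the difference between the two URI forms can be sketched as follows. The helper makeFailoverUri and any broker host names passed to it are hypothetical; the actual broker URIs are not given in this report. The sketch assumes the composite failover:(uri1,uri2) form, which wraps one or more plain tcp:// URIs.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: wraps plain broker URIs (for example tcp://host:port)
// in a composite failover URI of the form failover:(uri1,uri2,...).
std::string makeFailoverUri(const std::vector<std::string>& brokerUris) {
    std::string uri = "failover:(";
    for (std::size_t i = 0; i < brokerUris.size(); ++i) {
        if (i != 0) {
            uri += ",";  // separate multiple brokers with commas
        }
        uri += brokerUris[i];
    }
    uri += ")";
    return uri;
}
```

With a single broker the result is the same tcp:// URI as before, only wrapped, so the switch changes which transport layer sits underneath the producer rather than the broker being addressed.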
I also noticed that the HashMap in the {{ResponseFinalizer}} object is placed
there from a {{&this->impl->requestMap}} call in
{{ResponseCorrelator::request}}. A quick search of the repo leads me to notice
that requestMap is only present in FailoverTransport.cpp (unless I missed it
elsewhere, which is possible).
The NoSuchElementException is thrown when the {{ResponseFinalizer}}
destructor tries to call {{map->remove(commandId);}}.
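The pattern described above, a cleanup object whose destructor removes a key that may already be gone, can be sketched in isolation. This is not the library's code: ThrowingMap and SafeFinalizer are hypothetical stand-ins, the map stores plain ints, and the locking a shared map would need is omitted. The sketch assumes only what the trace shows, namely that remove() throws when the key is absent, and shows one defensive variant in which the destructor checks for the key first and never lets an exception escape.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <unordered_map>

// Hypothetical stand-in for a map whose remove() throws on a missing key.
class ThrowingMap {
public:
    void put(unsigned int key, int value) { entries_[key] = value; }

    bool containsKey(unsigned int key) const { return entries_.count(key) != 0; }

    int remove(unsigned int key) {
        auto it = entries_.find(key);
        if (it == entries_.end()) {
            throw std::out_of_range("no such element");
        }
        int value = it->second;
        entries_.erase(it);
        return value;
    }

    std::size_t size() const { return entries_.size(); }

private:
    std::unordered_map<unsigned int, int> entries_;
};

// Hypothetical cleanup guard: removes its entry on scope exit, if still present.
class SafeFinalizer {
public:
    SafeFinalizer(ThrowingMap* map, unsigned int commandId)
        : map_(map), commandId_(commandId) {}

    SafeFinalizer(const SafeFinalizer&) = delete;
    SafeFinalizer& operator=(const SafeFinalizer&) = delete;

    ~SafeFinalizer() {
        try {
            // Skip the removal when another code path already removed the key.
            if (map_->containsKey(commandId_)) {
                map_->remove(commandId_);
            }
        } catch (...) {
            // A destructor should not let exceptions escape; swallow them here.
        }
    }

private:
    ThrowingMap* map_;
    unsigned int commandId_;
};
```

Whether the entry is removed elsewhere before the destructor runs (for example, when the transport is interrupted) is an open question in this report; the sketch only shows how the cleanup step could tolerate that case.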