Gordon put in a fix for isBound():

http://svn.apache.org/viewvc?rev=790164&view=rev


I'll look at the core, but you can also test with a revision later than the one above and see if that works for you.

Carl.



rspringer wrote:
Carl - Sorry for the delay, I was out of the office for most of last week.  I
have what I THINK is an accurate core (my binary has been re-created since I
generated it, but looking over it, the relevant sections appear accurate)
and the associated dumps.  If you'd like, let me know and I can
revert my changes and regenerate the core.

Below are the backtraces - thanks for checking this out (thread 1 is the
interesting one)!
-Rob

#0  0x0000003ebb25c213 in std::_Rb_tree_increment () from
/usr/lib64/libstdc++.so.6
(gdb) thread apply all bt

Thread 8 (process 20504):
#0  0x0000003eb85c9c5c in epoll_wait () from /lib64/tls/libc.so.6
#1  0x0000002a95c56b67 in qpid::sys::Poller::wait (this=0x533250,
timeout={nanosecs = 9223372036854775807})
at ../../../src/qpid-0.5/cpp/src/qpid/sys/epoll/EpollPoller.cpp:432
#2  0x0000002a95c56a07 in qpid::sys::Poller::run (this=0x533250) at
../../../src/qpid-0.5/cpp/src/qpid/sys/epoll/EpollPoller.cpp:398
#3  0x0000002a95cb21da in qpid::sys::Dispatcher::run (this=0x7fbffff0e0) at
../../../src/qpid-0.5/cpp/src/qpid/sys/Dispatcher.cpp:37
#4  0x0000002a9573cd05 in qpid::broker::Broker::run (this=0x534160) at
../../../src/qpid-0.5/cpp/src/qpid/broker/Broker.cpp:319
#5  0x00000000004102fb in QpiddBroker::execute (this=0x7fbffff3b7,
options=0x52bfa0) at ../../../src/qpid-0.5/cpp/src/posix/QpiddBroker.cpp:165
#6  0x000000000040d9af in main (argc=3, argv=0x7fbffff6f8) at
../../../src/qpid-0.5/cpp/src/qpidd.cpp:77
Thread 7 (process 20505):
#0  0x0000002a9575cd76 in
boost::intrusive_ptr<qpid::broker::Message>::operator-> (this=0x409efe80)
at /home/rspringer/opt/include/boost/smart_ptr/intrusive_ptr.hpp:149
#1  0x0000002a95814ff1 in qpid::management::ManagementBroker::sendBuffer
(this=0x2a96741010, b...@0x409eff50, length=103, exchange=
{px = 0x538a20, pn = {pi_ = 0x538ce0}}, routingKey= {static npos = 18446744073709551615, _M_dataplus =
{<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>> = {<No data
fields>}, <No data fields>}, _M_p = 0x2a976ada38
"console.obj.1.0.org.apache.qpid.broker.binding"}})
    at ../../../src/qpid-0.5/cpp/src/qpid/management/ManagementBroker.cpp:304
#2  0x0000002a95815a20 in qpid::management::ManagementBroker::periodicProcessing
(this=0x2a96741010)
    at ../../../src/qpid-0.5/cpp/src/qpid/management/ManagementBroker.cpp:388
#3  0x0000002a95814b21 in Periodic (this=0x0, _brok...@0x2a96741148, _seconds=42)
    at ../../../src/qpid-0.5/cpp/src/qpid/management/ManagementBroker.cpp:245
#4  0x0000002a95804f74 in qpid::broker::Timer::run (this=0x2a96741140) at
../../../src/qpid-0.5/cpp/src/qpid/broker/Timer.cpp:67
#5  0x0000002a95c510b0 in qpid::sys::(anonymous namespace)::runRunnable
(p=0x2a96741140) at ../../../src/qpid-0.5/cpp/src/qpid/sys/posix/Thread.cpp:35
#6  0x0000003eb8c06137 in start_thread () from /lib64/tls/libpthread.so.0
#7  0x0000003eb85c9883 in clone () from /lib64/tls/libc.so.6
#8  0x0000000000000000 in ?? ()
Thread 6 (process 20506):
#0  0x0000003eb8c08d2f in pthread_cond_timedwait@@GLIBC_2.3.2 () from
/lib64/tls/libpthread.so.0
#1  0x0000002a958053d7 in qpid::sys::Condition::wait (this=0x534558,
mut...@0x534530, absoluteti...@0x2a97ea5048)
at ../../../src/qpid-0.5/cpp/src/qpid/sys/posix/Condition.h:69
#2  0x0000002a9580538f in qpid::sys::Monitor::wait (this=0x534530,
absoluteti...@0x2a97ea5048) at
../../../src/qpid-0.5/cpp/src/qpid/sys/Monitor.h:45
#3  0x0000002a95804fb2 in qpid::broker::Timer::run (this=0x534528) at
../../../src/qpid-0.5/cpp/src/qpid/broker/Timer.cpp:69
#4  0x0000002a95c510b0 in qpid::sys::(anonymous namespace)::runRunnable
(p=0x534528) at ../../../src/qpid-0.5/cpp/src/qpid/sys/posix/Thread.cpp:35
#5  0x0000003eb8c06137 in start_thread () from /lib64/tls/libpthread.so.0
#6  0x0000003eb85c9883 in clone () from /lib64/tls/libc.so.6
#7  0x0000000000000000 in ?? ()
Thread 5 (process 20507):
#0  0x0000003eb8c08d2f in pthread_cond_timedwait@@GLIBC_2.3.2 () from
/lib64/tls/libpthread.so.0
#1  0x0000002a958053d7 in qpid::sys::Condition::wait (this=0x534620,
mut...@0x5345f8, absoluteti...@0x538d88)
at ../../../src/qpid-0.5/cpp/src/qpid/sys/posix/Condition.h:69
#2  0x0000002a9580538f in qpid::sys::Monitor::wait (this=0x5345f8,
absoluteti...@0x538d88) at
../../../src/qpid-0.5/cpp/src/qpid/sys/Monitor.h:45
#3  0x0000002a95804fb2 in qpid::broker::Timer::run (this=0x5345f0) at
../../../src/qpid-0.5/cpp/src/qpid/broker/Timer.cpp:69
#4  0x0000002a95c510b0 in qpid::sys::(anonymous namespace)::runRunnable
(p=0x5345f0) at ../../../src/qpid-0.5/cpp/src/qpid/sys/posix/Thread.cpp:35
#5  0x0000003eb8c06137 in start_thread () from /lib64/tls/libpthread.so.0
#6  0x0000003eb85c9883 in clone () from /lib64/tls/libc.so.6
#7  0x0000000000000000 in ?? ()
Thread 4 (process 20508):
#0  0x0000003eb85c9c5c in epoll_wait () from /lib64/tls/libc.so.6
#1  0x0000002a95c56b67 in qpid::sys::Poller::wait (this=0x533250,
timeout={nanosecs = 9223372036854775807})
at ../../../src/qpid-0.5/cpp/src/qpid/sys/epoll/EpollPoller.cpp:432
#2  0x0000002a95c56a07 in qpid::sys::Poller::run (this=0x533250) at
../../../src/qpid-0.5/cpp/src/qpid/sys/epoll/EpollPoller.cpp:398
#3  0x0000002a95cb21da in qpid::sys::Dispatcher::run (this=0x7fbffff0e0) at
../../../src/qpid-0.5/cpp/src/qpid/sys/Dispatcher.cpp:37
#4  0x0000002a95c510b0 in qpid::sys::(anonymous namespace)::runRunnable
(p=0x7fbffff0e0) at ../../../src/qpid-0.5/cpp/src/qpid/sys/posix/Thread.cpp:35
#5  0x0000003eb8c06137 in start_thread () from /lib64/tls/libpthread.so.0
#6  0x0000003eb85c9883 in clone () from /lib64/tls/libc.so.6
#7  0x0000000000000000 in ?? ()
Thread 3 (process 20509):
#0  0x0000003eb85c9c5c in epoll_wait () from /lib64/tls/libc.so.6
#1  0x0000002a95c56b67 in qpid::sys::Poller::wait (this=0x533250,
timeout={nanosecs = 9223372036854775807})
at ../../../src/qpid-0.5/cpp/src/qpid/sys/epoll/EpollPoller.cpp:432
#2  0x0000002a95c56a07 in qpid::sys::Poller::run (this=0x533250) at
../../../src/qpid-0.5/cpp/src/qpid/sys/epoll/EpollPoller.cpp:398
#3  0x0000002a95cb21da in qpid::sys::Dispatcher::run (this=0x7fbffff0e0) at
../../../src/qpid-0.5/cpp/src/qpid/sys/Dispatcher.cpp:37
#4  0x0000002a95c510b0 in qpid::sys::(anonymous namespace)::runRunnable
(p=0x7fbffff0e0) at ../../../src/qpid-0.5/cpp/src/qpid/sys/posix/Thread.cpp:35
#5  0x0000003eb8c06137 in start_thread () from /lib64/tls/libpthread.so.0
#6  0x0000003eb85c9883 in clone () from /lib64/tls/libc.so.6
#7  0x0000000000000000 in ?? ()
Thread 2 (process 20511):
#0  0x0000003eb85c9c5c in epoll_wait () from /lib64/tls/libc.so.6
#1  0x0000002a95c56b67 in qpid::sys::Poller::wait (this=0x533250,
timeout={nanosecs = 9223372036854775807})
at ../../../src/qpid-0.5/cpp/src/qpid/sys/epoll/EpollPoller.cpp:432
#2  0x0000002a95c56a07 in qpid::sys::Poller::run (this=0x533250) at
../../../src/qpid-0.5/cpp/src/qpid/sys/epoll/EpollPoller.cpp:398
#3  0x0000002a95cb21da in qpid::sys::Dispatcher::run (this=0x7fbffff0e0) at
../../../src/qpid-0.5/cpp/src/qpid/sys/Dispatcher.cpp:37
#4  0x0000002a95c510b0 in qpid::sys::(anonymous namespace)::runRunnable
(p=0x7fbffff0e0) at ../../../src/qpid-0.5/cpp/src/qpid/sys/posix/Thread.cpp:35
#5  0x0000003eb8c06137 in start_thread () from /lib64/tls/libpthread.so.0
#6  0x0000003eb85c9883 in clone () from /lib64/tls/libc.so.6
#7  0x0000000000000000 in ?? ()
Thread 1 (process 20510):
#0  0x0000003ebb25c213 in std::_Rb_tree_increment () from
/usr/lib64/libstdc++.so.6
#1  0x0000002a9580a583 in
std::_Rb_tree_iterator<std::pair<qpid::broker::TopicPattern const,
qpid::broker::TopicExchange::BoundKey> >::operator++
    (this=0x2a9580a583) at
/usr/lib/gcc/x86_64-redhat-linux/3.4.6/../../../../include/c++/3.4.6/bits/stl_tree.h:187
#2  0x0000002a95808d58 in qpid::broker::TopicExchange::isBound (this=0x537ee0,
queue={px = 0x2a979719f0, pn = {pi_ = 0x2a97971a18}}, routingKey=0x2a9791eb90)
    at ../../../src/qpid-0.5/cpp/src/qpid/broker/TopicExchange.cpp:297
#3  0x0000002a957eb4fd in
qpid::broker::SessionAdapter::ExchangeHandlerImpl::bound (this=0xbe90f8,
exchangena...@0x2a9791eb80, queuena...@0x2a9791eb88, k...@0x2a9791eb90,
ar...@0x2a9791eb98)
    at ../../../src/qpid-0.5/cpp/src/qpid/broker/SessionAdapter.cpp:256
#4  0x0000002a95c2f6ee in
qpid::framing::ExchangeBoundBody::invoke<qpid::framing::AMQP_ServerOperations::ExchangeHandler>
(this=0x2a9791eb70, invocab...@0xbe90f8) at
gen/qpid/framing/ExchangeBoundBody.h:88
#5  0x0000002a95c2dce3 in
qpid::framing::AMQP_ServerOperations::ExchangeHandler::Invoker::visit
(this=0x43c03bc0, bo...@0x2a9791eb70) at
gen/qpid/framing/ServerInvoker.cpp:650
#6  0x0000002a95c0e278 in qpid::framing::ExchangeBoundBody::accept
(this=0x2a9791eb70, v...@0x43c03bc0) at
gen/qpid/framing/ExchangeBoundBody.h:92
#7  0x0000002a95c2c685 in
qpid::framing::AMQP_ServerOperations::Invoker::visit (this=0x43c03c40,
bo...@0x2a9791eb70) at gen/qpid/framing/ServerInvoker.cpp:363
#8  0x0000002a95c0e278 in qpid::framing::ExchangeBoundBody::accept
(this=0x2a9791eb70, v...@0x43c03c40) at
gen/qpid/framing/ExchangeBoundBody.h:92
#9  0x0000002a957fc3dc in
qpid::framing::invoke<qpid::broker::SessionAdapter> (targ...@0xbe90e0,
bo...@0x2a9791eb70) at ../../../src/qpid-0.5/cpp/src/qpid/framing/Invoker.h:67
#10 0x0000002a957f9a78 in qpid::broker::SessionState::handleCommand
(this=0xbe8db0, method=0x2a9791eb70, i...@0x43c03ec0)
    at ../../../src/qpid-0.5/cpp/src/qpid/broker/SessionState.cpp:189
#11 0x0000002a957faaf7 in qpid::broker::SessionState::handleIn (this=0xbe8db0,
fra...@0x43c04420)
    at ../../../src/qpid-0.5/cpp/src/qpid/broker/SessionState.cpp:323
#12 0x0000002a957fda37 in
qpid::framing::Handler<qpid::framing::AMQFrame&>::MemFunRef<qpid::framing::Handler<qpid::framing::AMQFrame&>::InOutHandlerInterface,
&(qpid::framing::Handler<qpid::framing::AMQFrame&>::InOutHandlerInterface::handleIn(qpid::framing::AMQFrame&))>::handle
( this=0xbe8f40, t...@0x43c04420) at ../../../src/qpid-0.5/cpp/src/qpid/framing/Handler.h:67
#13 0x0000002a95c7ca57 in qpid::amqp_0_10::SessionHandler::handleIn
(this=0xd55f20, f...@0x43c04420)
    at ../../../src/qpid-0.5/cpp/src/qpid/amqp_0_10/SessionHandler.cpp:86
#14 0x0000002a957fda37 in
qpid::framing::Handler<qpid::framing::AMQFrame&>::MemFunRef<qpid::framing::Handler<qpid::framing::AMQFrame&>::InOutHandlerInterface,
&(qpid::framing::Handler<qpid::framing::AMQFrame&>::InOutHandlerInterface::handleIn(qpid::framing::AMQFrame&))>::handle
( this=0xd55f30, t...@0x43c04420) at ../../../src/qpid-0.5/cpp/src/qpid/framing/Handler.h:67
#15 0x0000002a9577a10e in
qpid::framing::Handler<qpid::framing::AMQFrame&>::operator() (this=0xd55f30,
t...@0x43c04420) at ../../../src/qpid-0.5/cpp/src/qpid/framing/Handler.h:42
#16 0x0000002a95777854 in qpid::broker::Connection::received (this=0xca54d0,
fra...@0x43c04420)
    at ../../../src/qpid-0.5/cpp/src/qpid/broker/Connection.cpp:106
#17 0x0000002a95733199 in qpid::amqp_0_10::Connection::decode (this=0xb9a170,
buffer=0xc61eb0 "\017\001", size=166)
    at ../../../src/qpid-0.5/cpp/src/qpid/amqp_0_10/Connection.cpp:55
#18 0x0000002a957d9ecc in qpid::broker::SecureConnection::decode (this=0xd33100,
buffer=0xc61eb0 "\017\001", size=166)
    at ../../../src/qpid-0.5/cpp/src/qpid/broker/SecureConnection.cpp:42
#19 0x0000002a95cb0a7d in qpid::sys::AsynchIOHandler::readbuff (this=0xc9e520,
buff=0xbee030)
    at ../../../src/qpid-0.5/cpp/src/qpid/sys/AsynchIOHandler.cpp:104
#20 0x0000002a95825674 in
boost::detail::function::functor_manager_common<boost::_bi::bind_t<bool,
boost::_mfi::mf2<bool, qpid::sys::AsynchIOHandler, qpid::sys::AsynchIO&,
qpid::sys::AsynchIOBufferBase*>,
boost::_bi::list3<boost::_bi::value<qpid::sys::AsynchIOHandler*>,
boost::arg<1> (*)(), boost::arg<2> (*)()> > >::manage_small
(in_buff...@0xc9e520, out_buff...@0xbe5160,
op=boost::detail::function::clone_functor_tag)
    at /home/rspringer/opt/include/boost/function/function_base.hpp:307
#21 0x0000002a95824dff in
boost::detail::function::functor_manager<boost::_bi::bind_t<void,
boost::_mfi::mf4<void, qpid::sys::AsynchIOProtocolFactory,
boost::shared_ptr<qpid::sys::Poller>, qpid::sys::Socket const&,
qpid::sys::ConnectionCodec::Factory*, bool>,
boost::_bi::list5<boost::_bi::value<qpid::sys::AsynchIOProtocolFactory*>,
boost::_bi::value<boost::shared_ptr<qpid::sys::Poller> >, boost::arg<1>
(*)(), boost::_bi::value<qpid::sys::ConnectionCodec::Factory*>,
boost::_bi::value<bool> > > >::manager (in_buff...@0x3eb8c0d340,
out_buff...@0xbe5168,
    op=boost::detail::function::clone_functor_tag) at
/home/rspringer/opt/include/boost/function/function_base.hpp:395
#22 0x0000002a95824995 in storage5 (this=0x43c04af8, a1={t_ = 0x3eb8c0d300},
a2={t_ = {px = 0x2a96c00128, pn = {pi_ = 0x2a97c4df10}}},
    a3=0x43c04af8, a4={t_ = 0xbe5160}, a5={t_ = false}) at
/home/rspringer/opt/include/boost/bind/storage.hpp:227
#23 0x0000002a9582450b in bind_t (this=0xbe5270, f={f_ = 0xbe5270, this
adjustment 12509232}, l...@0x2a9582450b)
    at /home/rspringer/opt/include/boost/bind/bind.hpp:859
#24 0x0000002a95c4b756 in boost::function2<bool, qpid::sys::AsynchIO&,
qpid::sys::AsynchIOBufferBase*>::operator() (this=0xbe5268, a...@0xbe5160,
    a1=0xbee030) at
/home/rspringer/opt/include/boost/function/function_template.hpp:988
#25 0x0000002a95c49548 in qpid::sys::posix::AsynchIO::readable
(this=0xbe5160, h...@0xbe5168)
    at ../../../src/qpid-0.5/cpp/src/qpid/sys/posix/AsynchIO.cpp:446
#26 0x0000002a95c4efd8 in boost::_mfi::mf1<void, qpid::sys::posix::AsynchIO,
qpid::sys::DispatchHandle&>::operator() (this=0xbe5180, p=0xbe5160,
    a...@0xbe5168) at
/home/rspringer/opt/include/boost/bind/mem_fn_template.hpp:162
#27 0x0000002a95c4e609 in
boost::_bi::list2<boost::_bi::value<qpid::sys::posix::AsynchIO*>,
boost::arg<1> (*)()>::operator()<boost::_mfi::mf1<void,
qpid::sys::posix::AsynchIO, qpid::sys::DispatchHandle&>,
boost::_bi::list1<qpid::sys::DispatchHandle&> > (this=0xbe5190, f...@0xbe5180,
    a...@0x43c04e20) at /home/rspringer/opt/include/boost/bind/bind.hpp:306
#28 0x0000002a95c4dcdd in boost::_bi::bind_t<void, boost::_mfi::mf1<void,
qpid::sys::posix::AsynchIO, qpid::sys::DispatchHandle&>,
boost::_bi::list2<boost::_bi::value<qpid::sys::posix::AsynchIO*>,
boost::arg<1> (*)()> >::operator()<qpid::sys::DispatchHandle>
(this=0xbe5180, a...@0xbe5168)
    at /home/rspringer/opt/include/boost/bind/bind_template.hpp:32
#29 0x0000002a95c4d095 in
boost::detail::function::void_function_obj_invoker1<boost::_bi::bind_t<void,
boost::_mfi::mf1<void, qpid::sys::posix::AsynchIO,
qpid::sys::DispatchHandle&>,
boost::_bi::list2<boost::_bi::value<qpid::sys::posix::AsynchIO*>,
boost::arg<1> (*)()> >, void, qpid::sys::DispatchHandle&>::invoke
(function_obj_p...@0xbe5180, a...@0xbe5168) at
/home/rspringer/opt/include/boost/function/function_template.hpp:152
#30 0x0000002a95cb432e in boost::function1<void,
qpid::sys::DispatchHandle&>::operator() (this=0xbe5178, a...@0xbe5168)
    at /home/rspringer/opt/include/boost/function/function_template.hpp:988
#31 0x0000002a95cb3b96 in qpid::sys::DispatchHandle::processEvent
(this=0xbe5168, type=qpid::sys::Poller::READABLE)
    at ../../../src/qpid-0.5/cpp/src/qpid/sys/DispatchHandle.cpp:428
#32 0x0000002a95c57d43 in qpid::sys::Poller::Event::process
(this=0x43c05030) at ../../../src/qpid-0.5/cpp/src/qpid/sys/Poller.h:122
#33 0x0000002a95c56a2b in qpid::sys::Poller::run (this=0x533250) at
../../../src/qpid-0.5/cpp/src/qpid/sys/epoll/EpollPoller.cpp:402
#34 0x0000002a95cb21da in qpid::sys::Dispatcher::run (this=0x7fbffff0e0) at
../../../src/qpid-0.5/cpp/src/qpid/sys/Dispatcher.cpp:37
#35 0x0000002a95c510b0 in qpid::sys::(anonymous namespace)::runRunnable
(p=0x7fbffff0e0)
    at ../../../src/qpid-0.5/cpp/src/qpid/sys/posix/Thread.cpp:35
#36 0x0000003eb8c06137 in start_thread () from /lib64/tls/libpthread.so.0
#37 0x0000003eb85c9883 in clone () from /lib64/tls/libc.so.6
#38 0x0000000000000000 in ?? ()




Carl Trieloff wrote:
Rob,

Are you able to do a

'thread apply all bt'

on the core? That will allow us to reason through your theory with you.

Carl.


[email protected] wrote:
All,
We've been using Qpid M4 and 0.5 internally, both with C++, and have been
seeing occasional segfaults within the broker.  They all seem to occur
inside TopicExchange::isBound() (the one with 3 parameters, not 2).

Looking at the access patterns in this function, and judging by the
core dumps, it appears that multiple threads may access and modify
the class member "bindings" simultaneously, which seems like it
has the potential to cause a segfault.

Since this is a race condition (if we're right), it's hard to
say for sure that the problem has been resolved, but since doing
the below, we've not had a segfault (roughly 5 hours so far,
rather than every 2-3 hours or so).
 - We added locks to all accesses of bindings throughout the
   file.  I believe the new ones were limited to isBound.
 - This may not be necessary at all, but to be safe (while
   we were investigating the problem), we changed all lock
   types from RWlock::ScopedLocks to Mutex::ScopedLocks, to avoid
   being tripped up by pthread (read/write) mutex semantics.
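
To illustrate the kind of change described above, here is a minimal sketch,
not the actual patch. BindingTable is a hypothetical stand-in for the
exchange class, and std::mutex (C++11) is an assumption standing in for the
broker's own lock classes. The point is only that a read-only lookup such as
isBound() walks the same tree that bind/unbind modify, so it has to take the
same lock as the writers.

```cpp
#include <map>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for an exchange's binding table; NOT Qpid's actual
// TopicExchange. std::mutex is an assumption (Qpid has its own lock classes).
class BindingTable {
public:
    void bind(const std::string& pattern, const std::string& queue) {
        std::lock_guard<std::mutex> guard(lock_);
        bindings_.insert(std::make_pair(pattern, queue));
    }

    void unbind(const std::string& pattern, const std::string& queue) {
        std::lock_guard<std::mutex> guard(lock_);
        auto range = bindings_.equal_range(pattern);
        for (auto i = range.first; i != range.second; ++i) {
            if (i->second == queue) {
                bindings_.erase(i);  // invalidates i, so stop iterating
                return;
            }
        }
    }

    // Lookups iterate the tree as well, so they hold the same lock as the
    // writers; an unlocked iterator increment here can race with an erase.
    bool isBound(const std::string& pattern, const std::string& queue) const {
        std::lock_guard<std::mutex> guard(lock_);
        auto range = bindings_.equal_range(pattern);
        for (auto i = range.first; i != range.second; ++i) {
            if (i->second == queue) return true;
        }
        return false;
    }

private:
    mutable std::mutex lock_;  // mutable so const lookups can lock
    std::multimap<std::string, std::string> bindings_;
};
```

A plain mutex serializes readers as well as writers; a read/write lock would
let lookups run concurrently, at the cost of the extra semantics mentioned in
the second point above.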

We wanted to get the opinion of the experienced Qpid developers
on this list before opening a bug, but again, so far it seems to
be much more stable (we're hammering on Session::exchangeBound()
in our testing, which may be why we're seeing it in the first place).

I'm not sure if the other exchange types have the same problem or not.

Thanks!
-Rob



---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:[email protected]
