Re: [External] Re: "Session detached by peer"; extra connections on broker, lock-up on send

Toralf Lund Wed, 18 Dec 2019 04:34:13 -0800

On 18/12/2019 12:35, Gordon Sim wrote:

On 18/12/2019 10:42 am, Toralf Lund wrote:
On 18/12/2019 10:40, Gordon Sim wrote:
On 18/12/2019 8:45 am, Toralf Lund wrote:
An additional, more serious issue is that the system has also locked up a couple of times following an exception during Sender::send(). Slightly simplified stack trace from one of the cases:
Thread 5 (Thread 0x7fb93bc7c700 (LWP 134270)):
#0  0x00007fb9400bd113 in epoll_wait () from /usr/lib64/libc.so.6
#1  0x00007fb93f9c7677 in qpid::sys::Poller::wait(qpid::sys::Duration) ()
    from /usr/lib64/libqpidcommon.so.2
#2  0x00007fb93f9c9d5f in qpid::sys::Poller::run() ()
    from /usr/lib64/libqpidcommon.so.2
#3  0x00007fb93f9b8e4a in ?? ()
    from /usr/lib64/libqpidcommon.so.2
#4  0x00007fb941663dd5 in start_thread () from /usr/lib64/libpthread.so.0
#5  0x00007fb9400bcb3d in clone () from /usr/lib64/libc.so.6

Thread 4 (Thread 0x7fb93b47b700 (LWP 134271)):
#0  0x00007fb941667cf2 in pthread_cond_timedwait@@GLIBC_2.3.2 ()
    from /usr/lib64/libpthread.so.0
#1  0x00007fb93fa35b35 in qpid::sys::Timer::run() ()
    from /usr/lib64/libqpidcommon.so.2
#2  0x00007fb93f9b8e4a in ?? ()
    from /usr/lib64/libqpidcommon.so.2
#3  0x00007fb941663dd5 in start_thread () from /usr/lib64/libpthread.so.0
#4  0x00007fb9400bcb3d in clone () from /usr/lib64/libc.so.6

Thread 3 (Thread 0x7fb93987f700 (LWP 134344)):
[ ... ]
Thread 3's trace didn't make it through. Do you still have that?
Thread 2 (Thread 0x7fb93907e700 (LWP 395509)):
#0  0x00007fb9400bd113 in epoll_wait () from /usr/lib64/libc.so.6
#1  0x00007fb93f9c7677 in qpid::sys::Poller::wait(qpid::sys::Duration) ()
    from /usr/lib64/libqpidcommon.so.2
#2  0x00007fb93f9c9d5f in qpid::sys::Poller::run() ()
    from /usr/lib64/libqpidcommon.so.2
#3  0x00007fb93f9b8e4a in ?? ()
    from /usr/lib64/libqpidcommon.so.2
#4  0x00007fb941663dd5 in start_thread () from /usr/lib64/libpthread.so.0
#5  0x00007fb9400bcb3d in clone () from /usr/lib64/libc.so.6

Thread 1 (Thread 0x7fb942671300 (LWP 134233)):
#0  0x00007fb94039850f in ?? () from /usr/lib64/libgcc_s.so.1
#1  0x00007fb940399f5f in ?? () from /usr/lib64/libgcc_s.so.1
#2  0x00007fb94039a8ca in ?? () from /usr/lib64/libgcc_s.so.1
#3  0x00007fb94039add7 in _Unwind_Resume () from /usr/lib64/libgcc_s.so.1
#4  0x00007fb93fd8b927 in qpid::client::SessionImpl::sendCommand(qpid::framing::AMQBody const&, qpid::framing::MethodContent const*) ()
    from /usr/lib64/libqpidclient.so.2
#5  0x00007fb93fd8b97b in qpid::client::SessionImpl::send(qpid::framing::AMQBody const&) ()
    from /usr/lib64/libqpidclient.so.2
#6  0x00007fb93fd876a6 in qpid::client::SessionBase_0_10::sync() ()
    from /usr/lib64/libqpidclient.so.2
#7  0x00007fb9422403f8 in ?? ()
    from /usr/lib64/libqpidmessaging.so.2
#8  0x00007fb942241e6a in ?? ()
    from /usr/lib64/libqpidmessaging.so.2
#9  0x00000000004f2dc9 in ?? ()
It looks like the send is stuck trying to unwind after an exception (possibly on the checkOpen()), due to not being able to unlock the sendLock semaphore. Is something in thread 3 holding any locks?
It may be holding locks, but not anything related to QPid. Specifically, it's probably waiting for a condition via pthread_cond_wait(), which implies that it's also holding a mutex associated with the condition variable.
Ok, thread 1 may of course not be stuck at all, and it was just captured at that point. Getting pstack output separated by a few seconds would shed more light.
Does the frequency of the 'locking up' match the frequency of the session-busy exceptions?

I get the exact behaviour I'm seeing now only in a version that was deployed about a week ago, and since then, between 60 and 70 session-busy exceptions have been registered. Most of these came from Session::nextReceiver() or Receiver::fetch(), but exactly 2 were raised by Sender::send(). In both these cases, the application locked up more or less immediately; the session error was at least the last item logged when the process was in the locked state. It has not got stuck in any other situations.

Due to the way exception handling is implemented, there is a chance that there is another attempt at sending data after an exception from Sender::send(), without resetting the session or anything. On the other hand, a receive/nextReceiver() exception is in practice always followed directly by


  if(session.isValid() && session.hasError()) {
    Util::debug(1, "Reset AMQP session with error...");
    session.close();
    session=qpid::messaging::Session();
  }

The same code is also executed eventually in the send case, but there may be other intervening Sender::send() calls, and also operations like Session::getSender() or even (at least in theory) Session::createSender().
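
To illustrate the difference between the two paths, here is a minimal sketch of what closing that gap on the send side could look like: the reset check runs in the same place the send exception is caught, so no further send() can reach the failed session. FakeSession, FakeSender and sendOrReset() are hypothetical stand-ins written for this example, not Qpid classes; only isValid(), hasError(), close() and send() mirror the calls mentioned above. It also assumes the exception can be caught as std::exception, and it says nothing about whether close() itself can block in the locked-up case.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for qpid::messaging::Session, reduced to the calls
// used in the reset snippet above. A default-constructed handle is invalid.
struct FakeSession {
    bool valid = false;
    bool error = false;

    static FakeSession open() { FakeSession s; s.valid = true; return s; }
    bool isValid() const { return valid; }
    bool hasError() const { return error; }
    void close() { valid = false; }
};

// Hypothetical stand-in for a sender bound to a session. send() throws when
// told to fail, or when its session is invalid or already in error.
struct FakeSender {
    FakeSession* session;
    bool failNext;
    int sent;

    explicit FakeSender(FakeSession* s) : session(s), failNext(false), sent(0) {}

    void send(const std::string&) {
        if (failNext || !session->isValid() || session->hasError()) {
            failNext = false;
            session->error = true;
            throw std::runtime_error("session-busy");
        }
        ++sent;
    }
};

// Send one message. If send() throws, apply the same reset check as in the
// receive path right away, and report failure to the caller.
template <typename Session, typename Sender, typename Message>
bool sendOrReset(Session& session, Sender& sender, const Message& msg) {
    try {
        sender.send(msg);
        return true;
    } catch (const std::exception&) {
        if (session.isValid() && session.hasError()) {
            session.close();
            session = Session();
        }
        return false;
    }
}
```

A caller that gets false back would have to set up a new session and sender before sending again, rather than reuse the old ones.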


- Toralf



---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@qpid.apache.org
For additional commands, e-mail: users-h...@qpid.apache.org


