I seem to remember that at some point there was a bug in the reactor that
caused events to get stuck and repeated in on_stop. In that case it was
related to scheduled events, but this sounds similar.
It is very odd that it works differently on different platforms. Perhaps
there are differences in the behavior of select, or in error handling when
closing file descriptors? Internally, stop() works by writing to an FD to
wake up the select loop, and the process of shutting down closes that FD,
so perhaps there's a bug or race condition that is handled differently on
each platform and causes one of them to get stuck in a loop.
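
To make the mechanism concrete, here is a minimal sketch of the
wake-up-by-FD pattern in Python. It is not Proton's actual code: the class
WakeableLoop, its stop_events counter and the single wake-up byte are all
made up for illustration, and it assumes a POSIX platform where select
works on pipes. The point is only to show where such a loop can go wrong
if the FD is closed or the stop flag is re-examined at the wrong moment.

```python
import os
import select
import threading


class WakeableLoop:
    """Hypothetical select loop woken through a pipe (not Proton's code)."""

    def __init__(self):
        # The read end is watched by select; the write end is used by stop().
        self._rfd, self._wfd = os.pipe()
        self._stopping = threading.Event()
        self.stop_events = 0  # how many times "on_stop" was delivered

    def stop(self):
        # Set the flag first, then make the read end readable so that the
        # blocked select call in run() returns.
        self._stopping.set()
        os.write(self._wfd, b"x")

    def run(self):
        try:
            while True:
                # Blocks until the wake-up pipe becomes readable.
                readable, _, _ = select.select([self._rfd], [], [])
                if self._rfd in readable:
                    os.read(self._rfd, 1024)  # drain the wake-up bytes
                if self._stopping.is_set():
                    self.stop_events += 1  # deliver "on_stop" once
                    break  # leave the loop before the FDs are closed
        finally:
            # The FDs are closed only by the loop thread, after the loop exits.
            os.close(self._rfd)
            os.close(self._wfd)


loop = WakeableLoop()
worker = threading.Thread(target=loop.run)
worker.start()
loop.stop()
worker.join(timeout=10)
print("thread finished:", not worker.is_alive())
print("on_stop count:", loop.stop_events)
```

If the loop in this sketch did not break after seeing the flag, or kept
selecting on an FD that had already been closed, it could keep reporting
the stop event, which is roughly the symptom in the trace below. Whether
that is what actually happens in Proton is a guess on my part.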

This deserves a JIRA, I think; it sounds like there's a problem somewhere
in Proton.

On Fri, Feb 23, 2018 at 3:52 AM, andi welchlin <andi.welch...@gmail.com>
wrote:

> When I jump with gdb into clean_stop.py while it hangs:
>
> (gdb) where
> #0  0x00007f586bada827 in futex_abstimed_wait_cancelable (private=0,
> abstime=0x0, expected=0, futex_word=0x7f5864000c10) at
> ../sysdeps/unix/sysv/linux/futex-internal.h:205
> #1  do_futex_wait (sem=sem@entry=0x7f5864000c10, abstime=0x0) at
> sem_waitcommon.c:111
> #2  0x00007f586bada8d4 in __new_sem_wait_slow (sem=0x7f5864000c10,
> abstime=0x0) at sem_waitcommon.c:181
> #3  0x00007f586bada97a in __new_sem_wait (sem=<optimized out>) at
> sem_wait.c:29
> #4  0x0000000000514195 in PyThread_acquire_lock_timed ()
> #5  0x0000000000517f66 in ?? ()
> #6  0x00000000004e9ba7 in PyCFunction_Call ()
> #7  0x00000000005372f4 in PyEval_EvalFrameEx ()
> #8  0x0000000000540199 in ?? ()
>
>
> And I can confirm that with Debian it does not hang.
>
> Does anyone have an idea what the reason could be?
>
> On Thu, Feb 22, 2018 at 4:49 PM, andi welchlin <andi.welch...@gmail.com>
> wrote:
>
> > My output with tracing:
> >
> > run container
> > on start
> > press enter[0x7f2f6c008f90]:  -> SASL
> > [0x7f2f6c008f90]:  <- SASL
> > [0x7f2f6c008f90]:0 <- @sasl-mechanisms(64)
> > [sasl-server-mechanisms=@PN_SYMBOL[:ANONYMOUS, :AMQPLAIN, :PLAIN]]
> > [0x7f2f6c008f90]:0 -> @sasl-init(65) [mechanism=:ANONYMOUS,
> > initial-response=b"anonymous"]
> > [0x7f2f6c008f90]:0 <- @sasl-outcome(68) [code=0]
> > [0x7f2f6c008f90]:  -> AMQP
> > [0x7f2f6c008f90]:0 -> @open(16)
> > [container-id="7c856344-8010-44dc-ac56-bbdcddb52acd",
> > hostname="localhost:amqp", channel-max=32767]
> > [0x7f2f6c008f90]:0 -> @begin(17) [next-outgoing-id=0,
> > incoming-window=2147483647, outgoing-window=2147483647]
> > [0x7f2f6c008f90]:0 -> @attach(18)
> > [name="7c856344-8010-44dc-ac56-bbdcddb52acd-test.awe.queue", handle=0,
> > role=true, snd-settle-mode=2, rcv-settle-mode=0, source=@source(40)
> > [address="test.awe.queue", durable=0, timeout=0, dynamic=false],
> > target=@target(41) [durable=0, timeout=0, dynamic=false],
> > initial-delivery-count=0]
> > [0x7f2f6c008f90]:0 -> @flow(19) [incoming-window=2147483647,
> > next-outgoing-id=0, outgoing-window=2147483647,
> > handle=0, delivery-count=0, link-credit=50, drain=false]
> > [0x7f2f6c008f90]:  <- AMQP
> > [0x7f2f6c008f90]:0 <- @open(16)
> > [container-id="rabbit@andreas-VirtualBox", channel-max=32767,
> > idle-time-out=60000,
> > properties={:"cluster_name"="rabbit@andreas-VirtualBox",
> > :copyright="Copyright (C) 2007-2015 Pivotal Software, Inc.",
> > :information="Licensed under the MPL.  See http://www.rabbitmq.com/",
> > :platform="Erlang/OTP", :product="RabbitMQ", :version="3.5.7"}]
> > [0x7f2f6c008f90]:0 <- @begin(17) [remote-channel=0, next-outgoing-id=0,
> > incoming-window=65535, outgoing-window=65535, handle-max=4294967295]
> > [0x7f2f6c008f90]:0 <- @attach(18)
> > [name="7c856344-8010-44dc-ac56-bbdcddb52acd-test.awe.queue", handle=0,
> > role=false, snd-settle-mode=0, rcv-settle-mode=0, source=@source(40)
> > [address="test.awe.queue", durable=0, timeout=0, dynamic=false,
> > default-outcome=@released(38) [],
> > outcomes=@PN_SYMBOL[:"amqp:accepted:list", :"amqp:rejected:list",
> > :"amqp:released:list"]], initial-delivery-count=0]
> > *[0x7f2f6c008f90]:0 <- @flow(19) [next-incoming-id=0,
> > incoming-window=65535, next-outgoing-id=0, outgoing-window=65535,
> > handle=0, delivery-count=0, link-credit=50, available=0, drain=false]*
> >
> > call stop
> > call join
> > on_stop
> > on_stop
> > on_stop
> > on_stop
> > .... [ a lot of on_stop lines deleted ] ...
> > on_stop
> > on_stop
> > on_stop
> > on_stop
> > Exception ignored in: <object repr() failed>
> > Traceback (most recent call last):
> >   File "/usr/lib/python3/dist-packages/proton/wrapper.py", line 95, in
> > __del__
> >     pn_decref(self._impl)
> >   File "/usr/lib/python3/dist-packages/proton/wrapper.py", line 63, in
> > __getattr__
> >     attrs = self.__dict__["_attrs"]
> > KeyError: ('_attrs',)
> > ^CTraceback (most recent call last):
> >   File "./clean_stop.py", line 67, in <module>
> >     handler._stop()
> >   File "./clean_stop.py", line 44, in _stop
> >     self.thread.join()
> >   File "/usr/lib/python3.5/threading.py", line 1054, in join
> >     self._wait_for_tstate_lock()
> >   File "/usr/lib/python3.5/threading.py", line 1070, in
> > _wait_for_tstate_lock
> >     elif lock.acquire(block, timeout):
> > KeyboardInterrupt
> >
> >
> > On Thu, Feb 22, 2018 at 4:26 PM, Gordon Sim <g...@redhat.com> wrote:
> >
> >> On 22/02/18 15:22, andi welchlin wrote:
> >>
> >>> Hi Gordon,
> >>>
> >>> I saw that your first line is:
> >>>
> >>> #!/usr/bin/env python
> >>>
> >>>
> >>> Does clean_stop.py also work for you when you change it to:
> >>>
> >>> #!/usr/bin/env python3
> >>>
> >>> ?
> >>>
> >>>
> >> Yes (output with tracing on below):
> >>
> >> $ PN_TRACE_FRM=1 ./clean_stop.py run container
> >>> press enteron start
> >>> [0x7f39ac009120]:  -> SASL
> >>> [0x7f39ac009120]:  <- SASL
> >>> [0x7f39ac009120]:0 <- @sasl-mechanisms(64)
> >>> [sasl-server-mechanisms=@PN_SYMBOL[:ANONYMOUS]]
> >>> [0x7f39ac009120]:0 -> @sasl-init(65) [mechanism=:ANONYMOUS,
> >>> initial-response=b"anonymous@localhost.localdomain"]
> >>> [0x7f39ac009120]:0 <- @sasl-outcome(68) [code=0]
> >>> [0x7f39ac009120]:  -> AMQP
> >>> [0x7f39ac009120]:0 -> @open(16)
> >>> [container-id="dc3bcfe2-23b0-47d0-bcaa-848d290b347c",
> >>> hostname="localhost", channel-max=32767]
> >>> [0x7f39ac009120]:0 -> @begin(17) [next-outgoing-id=0, incoming-window=
> >>> 2147483647, outgoing-window=2147483647]
> >>> [0x7f39ac009120]:0 -> @attach(18)
> >>> [name="dc3bcfe2-23b0-47d0-bcaa-848d290b347c-test.awe.queue",
> >>> handle=0, role=true, snd-settle-mode=2, rcv-settle-mode=0,
> >>> source=@source(40) [address="test.awe.queue", durable=0, timeout=0,
> >>> dynamic=false], target=@target(41) [durable=0, timeout=0,
> >>> dynamic=false], initial-delivery-count=0, max-message-size=0]
> >>> [0x7f39ac009120]:0 -> @flow(19) [incoming-window=2147483647,
> >>> next-outgoing-id=0, outgoing-window=2147483647, handle=0,
> >>> delivery-count=0, link-credit=50, drain=false]
> >>> [0x7f39ac009120]:  <- AMQP
> >>> [0x7f39ac009120]:0 <- @open(16) [container-id="Router.A",
> >>> max-frame-size=16384, channel-max=32767, idle-time-out=8000,
> >>> offered-capabilities=:"ANONYMOUS-RELAY",
> >>> properties={:product="qpid-dispatch-router", :version="1.0.0"}]
> >>> [0x7f39ac009120]:0 <- @begin(17) [remote-channel=0, next-outgoing-id=0,
> >>> incoming-window=2147483647, outgoing-window=2147483647]
> >>> [0x7f39ac009120]:0 <- @attach(18)
> >>> [name="dc3bcfe2-23b0-47d0-bcaa-848d290b347c-test.awe.queue",
> >>> handle=0, role=false, snd-settle-mode=2, rcv-settle-mode=0,
> >>> source=@source(40) [address="test.awe.queue", durable=0, timeout=0,
> >>> dynamic=false], target=@target(41) [durable=0, timeout=0,
> >>> dynamic=false], initial-delivery-count=0, max-message-size=0]
> >>>
> >>> call stop
> >>> call join
> >>> on_stop
> >>> after container.run()
> >>> done stop
> >>>
> >>
> >>
> >
>
