[ https://issues.apache.org/jira/browse/PROTON-2422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Justin Ross reassigned PROTON-2422: ----------------------------------- Assignee: Clifford Jansen > Proton will sometimes fail to send empty frame if the idle timeout ratio > between peers is greater than 2. > ---------------------------------------------------------------------------------------------------------- > > Key: PROTON-2422 > URL: https://issues.apache.org/jira/browse/PROTON-2422 > Project: Qpid Proton > Issue Type: Bug > Components: cpp-binding, proton-c > Affects Versions: proton-c-0.33.0 > Environment: RHEL 8 > Reporter: Jesse Hulsizer > Assignee: Clifford Jansen > Priority: Minor > Attachments: instrument.patch, reproducer.cpp > > > When a connection is made to a proton listener with both sides having > different idle timeout intervals, the epoll_timer can fail to trigger, > resulting in no empty frames being sent, and the connection dropped with a > 'amqp:resource-limit-exceeded: local-idle-timeout expired' exception. > Instrumentation of the proton library showed that when the an epoll timer > deadline was rolled back and the timer resequenced due to the peer idle > timeout being shorter than the local, the new timer is pushed on the timer > manager heap incorrectly. The timer deadline object should be pushed on the > timer heap in order by deadline, by in fact the timer is pushed on the head > by timer deadline object address. This causes the invalidated timer to be > first on the list, and the proactor timer set incorrectly. When enough time > has elapsed, the remote peer will close the connection due to inactivity. > Note that if the newly created resequenced timer deadline object has an > address lower than the old invalidated timer deadline object, proton will > work correctly. > I've attached a reproducer as well as a patch for the instrumentation. > Annotated proton logging from the reproducer is below. > This issue does not occur prior to 0.33.0 > {code:java} > [builder@SE-RHEL8-ITCM-TEST-01 qpid-proton-idle-timeout-repo $ ] > PN_LOG='frame info+' ./a.out > listening on 9030 > # The initial connection > [0x7fdb3c001be0]: SASL:FRAME: -> SASL > [0x7fdb44002620]: SASL:FRAME: <- SASL > [0x7fdb44002620]: SASL:FRAME: -> SASL > [0x7fdb44002620]: AMQP:FRAME:0 -> @sasl-mechanisms(64) > [sasl-server-mechanisms=@PN_SYMBOL[:ANONYMOUS]] > [0x7fdb3c001be0]: SASL:FRAME: <- SASL > [0x7fdb3c001be0]: AMQP:FRAME:0 <- @sasl-mechanisms(64) > [sasl-server-mechanisms=@PN_SYMBOL[:ANONYMOUS]] > [0x7fdb4ca21e20]:EVENT: INFO:In pni_timer_set - timer* 0x7fdb3c008570, > deadline 5189836829, proactor_timer* 0x10B9660 > [0x7fdb4ca21e20]:EVENT: INFO:Start of timer heap dump > [0x7fdb4ca21e20]:EVENT: INFO:Stop of timer heap dump > [0x7fdb4ca21e20]:EVENT: INFO:Start of timer heap dump post > [0x7fdb4ca21e20]:EVENT: INFO:Heap position 0: td=0x0167fdb3c0085b0, > td->list_deadline=5189836829, td->timer=0x00007fdb3c008570, > td->resequenced=false > [0x7fdb4ca21e20]:EVENT: INFO:Stop of timer heap dump > [0x7fdb3c001be0]: AMQP:FRAME:0 -> @sasl-init(65) [mechanism=:ANONYMOUS, > initial-response=b"anonymous@SE-RHEL8-ITCM-TEST-01"] > [0x7fdb44002620]: AMQP:FRAME:0 <- @sasl-init(65) [mechanism=:ANONYMOUS, > initial-response=b"anonymous@SE-RHEL8-ITCM-TEST-01"] > [0x7fdb44002620]: SASL: INFO:Authenticated user: anonymous for anonymous with > mechanism ANONYMOUS > [0x7fdb44002620]: AMQP:FRAME:0 -> @sasl-outcome(68) [code=0] > [0x7fdb3c001be0]: AMQP:FRAME:0 <- @sasl-outcome(68) [code=0] > [0x7fdb4ca21e20]:EVENT: INFO:In pni_timer_set - timer* 0x7fdb3c008570, > deadline 5189836829, proactor_timer* 0x10B9660 > [0x7fdb4ca21e20]:EVENT: INFO:Start of timer heap dump > [0x7fdb4ca21e20]:EVENT: INFO:Heap position 0: td=0x0167fdb3c0085b0, > td->list_deadline=5189836829, td->timer=0x00007fdb3c008570, > td->resequenced=false > [0x7fdb4ca21e20]:EVENT: INFO:Stop of timer heap dump > [0x7fdb3c001be0]: AMQP:FRAME: -> AMQP > [0x7fdb3c001be0]: AMQP:FRAME:0 -> @open(16) > [container-id="cf87e911-f46b-471a-a664-e34de8a57b6b", hostname="127.0.0.1", > channel-max=32767, idle-time-out=20000] > [0x7fdb44002620]: AMQP:FRAME: <- AMQP > [0x7fdb44002620]: AMQP:FRAME:0 <- @open(16) > [container-id="cf87e911-f46b-471a-a664-e34de8a57b6b", hostname="127.0.0.1", > channel-max=32767, idle-time-out=20000] > [0x7fdb4ca21e20]:EVENT: INFO:In pni_timer_set - timer* 0x7fdb44008fb0, > deadline 5189806830, proactor_timer* 0x10B7D30 > [0x7fdb4ca21e20]:EVENT: INFO:Start of timer heap dump > [0x7fdb4ca21e20]:EVENT: INFO:Stop of timer heap dump > [0x7fdb4ca21e20]:EVENT: INFO:Start of timer heap dump post > [0x7fdb4ca21e20]:EVENT: INFO:Heap position 0: td=0x0167fdb44008ff0, > td->list_deadline=5189806830, td->timer=0x00007fdb44008fb0, > td->resequenced=false > [0x7fdb4ca21e20]:EVENT: INFO:Stop of timer heap dump > [0x7fdb44002620]: AMQP:FRAME: -> AMQP > [0x7fdb44002620]: AMQP:FRAME:0 -> @open(16) > [container-id="6fb1e956-e9ca-4922-95a7-072fc249dc24", channel-max=32767, > idle-time-out=5000] > [0x7fdb3c001be0]: AMQP:FRAME: <- AMQP > [0x7fdb3c001be0]: AMQP:FRAME:0 <- @open(16) > [container-id="6fb1e956-e9ca-4922-95a7-072fc249dc24", channel-max=32767, > idle-time-out=5000] > # Here the client has connected to the listener. The client timeout is 40 > seconds, and the listener idle timeout is 10 seconds. The proactor timer for > this connection > # is re-sequenced and the new timer deadline object is placed on the heap. > This is placed in address order, and not list_deadline order as it should be. > [0x7fdb4ca21e20]:EVENT: INFO:In pni_timer_set - timer* 0x7fdb3c008570, > deadline 5189799330, proactor_timer* 0x10B9660 > [0x7fdb4ca21e20]:EVENT: INFO:Start of timer heap dump > [0x7fdb4ca21e20]:EVENT: INFO:Heap position 0: td=0x0167fdb3c0085b0, > td->list_deadline=5189836829, td->timer=0x00007fdb3c008570, > td->resequenced=false > [0x7fdb4ca21e20]:EVENT: INFO:Stop of timer heap dump > [0x7fdb4ca21e20]:EVENT: INFO:Start of timer heap dump post > [0x7fdb4ca21e20]:EVENT: INFO:Heap position 0: td=0x0167fdb3c0085b0, > td->list_deadline=5189836829, td->timer=0x0000000000000000, > td->resequenced=false > [0x7fdb4ca21e20]:EVENT: INFO:Heap position 1: td=0x0167fdb3c00a930, > td->list_deadline=5189799330, td->timer=0x00007fdb3c008570, > td->resequenced=true > [0x7fdb4ca21e20]:EVENT: INFO:Stop of timer heap dump > # After 10 seconds, the listener server has closed the connection due to > inactivity. > [0x7fdb44002620]: AMQP:ERROR:amqp:resource-limit-exceeded local-idle-timeout > expired > [0x7fdb44002620]: AMQP:FRAME:0 -> @close(24) [error=@error(29) > [condition=:"amqp:resource-limit-exceeded", description="local-idle-timeout > expired"]] > [0x7fdb4ca21e20]:EVENT: INFO:In pni_timer_set - timer* 0x7fdb44008fb0, > deadline 5189816830, proactor_timer* 0x10B7D30 > [0x7fdb4ca21e20]:EVENT: INFO:Start of timer heap dump > [0x7fdb4ca21e20]:EVENT: INFO:Stop of timer heap dump > [0x7fdb4ca21e20]:EVENT: INFO:Start of timer heap dump post > [0x7fdb4ca21e20]:EVENT: INFO:Heap position 0: td=0x0167fdb44008ff0, > td->list_deadline=5189816830, td->timer=0x00007fdb44008fb0, > td->resequenced=false > [0x7fdb4ca21e20]:EVENT: INFO:Stop of timer heap dump > [0x7fdb44002620]: IO:FRAME: -> EOS > [0x7fdb3c001be0]: AMQP:FRAME:0 <- @close(24) [error=@error(29) > [condition=:"amqp:resource-limit-exceeded", description="local-idle-timeout > expired"]] > [0x7fdb3c001be0]: IO:FRAME: <- EOS > [0x7fdb3c001be0]: AMQP:FRAME:0 -> (EMPTY FRAME) > [0x7fdb4ca21e20]:EVENT: INFO:In pni_timer_set - timer* 0x7fdb3c008570, > deadline 5189809330, proactor_timer* 0x10B9660 > [0x7fdb4ca21e20]:EVENT: INFO:Start of timer heap dump > [0x7fdb4ca21e20]:EVENT: INFO:Heap position 0: td=0x0167fdb3c0085b0, > td->list_deadline=5189836829, td->timer=0x0000000000000000, > td->resequenced=false > [0x7fdb4ca21e20]:EVENT: INFO:Heap position 1: td=0x0167fdb3c00a930, > td->list_deadline=5189799330, td->timer=0x00007fdb3c008570, > td->resequenced=true > [0x7fdb4ca21e20]:EVENT: INFO:Stop of timer heap dump > [0x7fdb4ca21e20]:EVENT: INFO:Start of timer heap dump post > [0x7fdb4ca21e20]:EVENT: INFO:Heap position 0: td=0x0167fdb3c0085b0, > td->list_deadline=5189836829, td->timer=0x0000000000000000, > td->resequenced=false > [0x7fdb4ca21e20]:EVENT: INFO:Heap position 1: td=0x0167fdb3c00a930, > td->list_deadline=5189799330, td->timer=0x00007fdb3c008570, > td->resequenced=true > [0x7fdb4ca21e20]:EVENT: INFO:Stop of timer heap dump > amqp:resource-limit-exceeded: local-idle-timeout expired > amqp:resource-limit-exceeded: local-idle-timeout expired > Sender was disconnected. Test Failed. > {code} > > > -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org For additional commands, e-mail: dev-h...@qpid.apache.org