[ 
https://issues.apache.org/jira/browse/PROTON-2422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Justin Ross reassigned PROTON-2422:
-----------------------------------

    Assignee: Clifford Jansen

> Proton will sometimes fail to send empty frame if the idle timeout ratio 
> between peers is greater than 2. 
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: PROTON-2422
>                 URL: https://issues.apache.org/jira/browse/PROTON-2422
>             Project: Qpid Proton
>          Issue Type: Bug
>          Components: cpp-binding, proton-c
>    Affects Versions: proton-c-0.33.0
>         Environment: RHEL 8
>            Reporter: Jesse Hulsizer
>            Assignee: Clifford Jansen
>            Priority: Minor
>         Attachments: instrument.patch, reproducer.cpp
>
>
> When a connection is made to a proton listener with both sides having 
> different idle timeout intervals, the epoll_timer can fail to trigger, 
> resulting in no empty frames being sent, and the connection dropped with a 
> 'amqp:resource-limit-exceeded: local-idle-timeout expired' exception.
> Instrumentation of the proton library showed that when the an epoll timer 
> deadline was rolled back and the timer resequenced due to the peer idle 
> timeout being shorter than the local, the new timer is pushed on the timer 
> manager heap incorrectly. The timer deadline object should be pushed on the 
> timer heap in order by deadline, by in fact the timer is pushed on the head 
> by timer deadline object address. This causes the invalidated timer to be 
> first on the list, and the proactor timer set incorrectly. When enough time 
> has elapsed, the remote peer will close the connection due to inactivity. 
> Note that if the newly created resequenced timer deadline object has an 
> address lower than the old invalidated timer deadline object, proton will 
> work correctly.
> I've attached a reproducer as well as a patch for the instrumentation. 
> Annotated proton logging from the reproducer is below.
> This issue does not occur prior to 0.33.0
> {code:java}
> [builder@SE-RHEL8-ITCM-TEST-01 qpid-proton-idle-timeout-repo $ ] 
> PN_LOG='frame info+' ./a.out 
> listening on 9030
> # The initial connection
> [0x7fdb3c001be0]: SASL:FRAME: -> SASL
> [0x7fdb44002620]: SASL:FRAME: <- SASL
> [0x7fdb44002620]: SASL:FRAME: -> SASL
> [0x7fdb44002620]: AMQP:FRAME:0 -> @sasl-mechanisms(64) 
> [sasl-server-mechanisms=@PN_SYMBOL[:ANONYMOUS]]
> [0x7fdb3c001be0]: SASL:FRAME: <- SASL
> [0x7fdb3c001be0]: AMQP:FRAME:0 <- @sasl-mechanisms(64) 
> [sasl-server-mechanisms=@PN_SYMBOL[:ANONYMOUS]]
> [0x7fdb4ca21e20]:EVENT: INFO:In pni_timer_set - timer* 0x7fdb3c008570, 
> deadline 5189836829, proactor_timer* 0x10B9660
> [0x7fdb4ca21e20]:EVENT: INFO:Start of timer heap dump
> [0x7fdb4ca21e20]:EVENT: INFO:Stop of timer heap dump
> [0x7fdb4ca21e20]:EVENT: INFO:Start of timer heap dump post
> [0x7fdb4ca21e20]:EVENT: INFO:Heap position 0: td=0x0167fdb3c0085b0, 
> td->list_deadline=5189836829, td->timer=0x00007fdb3c008570, 
> td->resequenced=false
> [0x7fdb4ca21e20]:EVENT: INFO:Stop of timer heap dump
> [0x7fdb3c001be0]: AMQP:FRAME:0 -> @sasl-init(65) [mechanism=:ANONYMOUS, 
> initial-response=b"anonymous@SE-RHEL8-ITCM-TEST-01"]
> [0x7fdb44002620]: AMQP:FRAME:0 <- @sasl-init(65) [mechanism=:ANONYMOUS, 
> initial-response=b"anonymous@SE-RHEL8-ITCM-TEST-01"]
> [0x7fdb44002620]: SASL: INFO:Authenticated user: anonymous for anonymous with 
> mechanism ANONYMOUS
> [0x7fdb44002620]: AMQP:FRAME:0 -> @sasl-outcome(68) [code=0]
> [0x7fdb3c001be0]: AMQP:FRAME:0 <- @sasl-outcome(68) [code=0]
> [0x7fdb4ca21e20]:EVENT: INFO:In pni_timer_set - timer* 0x7fdb3c008570, 
> deadline 5189836829, proactor_timer* 0x10B9660
> [0x7fdb4ca21e20]:EVENT: INFO:Start of timer heap dump
> [0x7fdb4ca21e20]:EVENT: INFO:Heap position 0: td=0x0167fdb3c0085b0, 
> td->list_deadline=5189836829, td->timer=0x00007fdb3c008570, 
> td->resequenced=false
> [0x7fdb4ca21e20]:EVENT: INFO:Stop of timer heap dump
> [0x7fdb3c001be0]: AMQP:FRAME: -> AMQP
> [0x7fdb3c001be0]: AMQP:FRAME:0 -> @open(16) 
> [container-id="cf87e911-f46b-471a-a664-e34de8a57b6b", hostname="127.0.0.1", 
> channel-max=32767, idle-time-out=20000]
> [0x7fdb44002620]: AMQP:FRAME: <- AMQP
> [0x7fdb44002620]: AMQP:FRAME:0 <- @open(16) 
> [container-id="cf87e911-f46b-471a-a664-e34de8a57b6b", hostname="127.0.0.1", 
> channel-max=32767, idle-time-out=20000]
> [0x7fdb4ca21e20]:EVENT: INFO:In pni_timer_set - timer* 0x7fdb44008fb0, 
> deadline 5189806830, proactor_timer* 0x10B7D30
> [0x7fdb4ca21e20]:EVENT: INFO:Start of timer heap dump
> [0x7fdb4ca21e20]:EVENT: INFO:Stop of timer heap dump
> [0x7fdb4ca21e20]:EVENT: INFO:Start of timer heap dump post
> [0x7fdb4ca21e20]:EVENT: INFO:Heap position 0: td=0x0167fdb44008ff0, 
> td->list_deadline=5189806830, td->timer=0x00007fdb44008fb0, 
> td->resequenced=false
> [0x7fdb4ca21e20]:EVENT: INFO:Stop of timer heap dump
> [0x7fdb44002620]: AMQP:FRAME: -> AMQP
> [0x7fdb44002620]: AMQP:FRAME:0 -> @open(16) 
> [container-id="6fb1e956-e9ca-4922-95a7-072fc249dc24", channel-max=32767, 
> idle-time-out=5000]
> [0x7fdb3c001be0]: AMQP:FRAME: <- AMQP
> [0x7fdb3c001be0]: AMQP:FRAME:0 <- @open(16) 
> [container-id="6fb1e956-e9ca-4922-95a7-072fc249dc24", channel-max=32767, 
> idle-time-out=5000]
> # Here the client has connected to the listener. The client timeout is 40 
> seconds, and the listener idle timeout is 10 seconds. The proactor timer for 
> this connection
> # is re-sequenced and the new timer deadline object is placed on the heap. 
> This is placed in address order, and not list_deadline order as it should be.
> [0x7fdb4ca21e20]:EVENT: INFO:In pni_timer_set - timer* 0x7fdb3c008570, 
> deadline 5189799330, proactor_timer* 0x10B9660
> [0x7fdb4ca21e20]:EVENT: INFO:Start of timer heap dump
> [0x7fdb4ca21e20]:EVENT: INFO:Heap position 0: td=0x0167fdb3c0085b0, 
> td->list_deadline=5189836829, td->timer=0x00007fdb3c008570, 
> td->resequenced=false
> [0x7fdb4ca21e20]:EVENT: INFO:Stop of timer heap dump
> [0x7fdb4ca21e20]:EVENT: INFO:Start of timer heap dump post
> [0x7fdb4ca21e20]:EVENT: INFO:Heap position 0: td=0x0167fdb3c0085b0, 
> td->list_deadline=5189836829, td->timer=0x0000000000000000, 
> td->resequenced=false
> [0x7fdb4ca21e20]:EVENT: INFO:Heap position 1: td=0x0167fdb3c00a930, 
> td->list_deadline=5189799330, td->timer=0x00007fdb3c008570, 
> td->resequenced=true
> [0x7fdb4ca21e20]:EVENT: INFO:Stop of timer heap dump
> # After 10 seconds, the listener server has closed the connection due to 
> inactivity.
> [0x7fdb44002620]: AMQP:ERROR:amqp:resource-limit-exceeded local-idle-timeout 
> expired
> [0x7fdb44002620]: AMQP:FRAME:0 -> @close(24) [error=@error(29) 
> [condition=:"amqp:resource-limit-exceeded", description="local-idle-timeout 
> expired"]]
> [0x7fdb4ca21e20]:EVENT: INFO:In pni_timer_set - timer* 0x7fdb44008fb0, 
> deadline 5189816830, proactor_timer* 0x10B7D30
> [0x7fdb4ca21e20]:EVENT: INFO:Start of timer heap dump
> [0x7fdb4ca21e20]:EVENT: INFO:Stop of timer heap dump
> [0x7fdb4ca21e20]:EVENT: INFO:Start of timer heap dump post
> [0x7fdb4ca21e20]:EVENT: INFO:Heap position 0: td=0x0167fdb44008ff0, 
> td->list_deadline=5189816830, td->timer=0x00007fdb44008fb0, 
> td->resequenced=false
> [0x7fdb4ca21e20]:EVENT: INFO:Stop of timer heap dump
> [0x7fdb44002620]: IO:FRAME: -> EOS
> [0x7fdb3c001be0]: AMQP:FRAME:0 <- @close(24) [error=@error(29) 
> [condition=:"amqp:resource-limit-exceeded", description="local-idle-timeout 
> expired"]]
> [0x7fdb3c001be0]: IO:FRAME: <- EOS
> [0x7fdb3c001be0]: AMQP:FRAME:0 -> (EMPTY FRAME)
> [0x7fdb4ca21e20]:EVENT: INFO:In pni_timer_set - timer* 0x7fdb3c008570, 
> deadline 5189809330, proactor_timer* 0x10B9660
> [0x7fdb4ca21e20]:EVENT: INFO:Start of timer heap dump
> [0x7fdb4ca21e20]:EVENT: INFO:Heap position 0: td=0x0167fdb3c0085b0, 
> td->list_deadline=5189836829, td->timer=0x0000000000000000, 
> td->resequenced=false
> [0x7fdb4ca21e20]:EVENT: INFO:Heap position 1: td=0x0167fdb3c00a930, 
> td->list_deadline=5189799330, td->timer=0x00007fdb3c008570, 
> td->resequenced=true
> [0x7fdb4ca21e20]:EVENT: INFO:Stop of timer heap dump
> [0x7fdb4ca21e20]:EVENT: INFO:Start of timer heap dump post
> [0x7fdb4ca21e20]:EVENT: INFO:Heap position 0: td=0x0167fdb3c0085b0, 
> td->list_deadline=5189836829, td->timer=0x0000000000000000, 
> td->resequenced=false
> [0x7fdb4ca21e20]:EVENT: INFO:Heap position 1: td=0x0167fdb3c00a930, 
> td->list_deadline=5189799330, td->timer=0x00007fdb3c008570, 
> td->resequenced=true
> [0x7fdb4ca21e20]:EVENT: INFO:Stop of timer heap dump
> amqp:resource-limit-exceeded: local-idle-timeout expired
> amqp:resource-limit-exceeded: local-idle-timeout expired
> Sender was disconnected. Test Failed.
> {code}
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org
For additional commands, e-mail: dev-h...@qpid.apache.org

Reply via email to