[ 
https://issues.apache.org/jira/browse/TS-4916?focusedWorklogId=30610&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-30610
 ]

ASF GitHub Bot logged work on TS-4916:
--------------------------------------

                Author: ASF GitHub Bot
            Created on: 13/Oct/16 19:29
            Start Date: 13/Oct/16 19:29
    Worklog Time Spent: 10m 
      Work Description: Github user shinrich commented on the issue:

    https://github.com/apache/trafficserver/pull/1100
  
    It is not clear that we want to move this to 7.0.  The main changes have 
already been applied (checking whether the stream is still in the list before 
deleting it, and appropriately signaling that the stream should be removed from 
the Session/ConnState stream list).  We may want to look at the backport of 
TS-4507 to 6.2.
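
For readers unfamiliar with the pattern, the "check whether the stream is still 
in the list before deleting" idea can be sketched roughly as follows. This is 
only an illustration under stated assumptions: Stream, StreamList, contains, 
push and remove_if_present are hypothetical names, not the actual Http2Stream 
class or the list template used in Traffic Server, and the real change may 
differ.

```cpp
#include <cassert>

// Hypothetical stand-in for an HTTP/2 stream; not the real Http2Stream.
struct Stream {
  int id;
  Stream *next = nullptr;
  Stream *prev = nullptr;
  explicit Stream(int i) : id(i) {}
};

// Minimal intrusive doubly linked list (illustrative only).
struct StreamList {
  Stream *head = nullptr;

  // Linear scan; true when s is currently linked into this list.
  // Assumes the list is not already corrupted (no cycles).
  bool contains(const Stream *s) const {
    for (const Stream *cur = head; cur != nullptr; cur = cur->next) {
      if (cur == s) {
        return true;
      }
    }
    return false;
  }

  // Push at the head. Refuses a stream that is already linked, because
  // re-linking a node is one way to damage next/prev pointers.
  bool push(Stream *s) {
    assert(s != nullptr);
    if (contains(s)) {
      return false;
    }
    s->prev = nullptr;
    s->next = head;
    if (head != nullptr) {
      head->prev = s;
    }
    head = s;
    return true;
  }

  // Unlink s only if it is still in the list; returns whether it was removed.
  bool remove_if_present(Stream *s) {
    assert(s != nullptr);
    if (!contains(s)) {
      return false;
    }
    if (s->prev != nullptr) {
      s->prev->next = s->next;
    } else {
      head = s->next;
    }
    if (s->next != nullptr) {
      s->next->prev = s->prev;
    }
    s->next = nullptr;
    s->prev = nullptr;
    return true;
  }
};
```

With this guard, a second remove_if_present call on the same stream returns 
false instead of touching the neighbours' pointers again. The linear scan 
trades some speed for safety; a per-stream "in list" flag would be a cheaper 
alternative.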


Issue Time Tracking
-------------------

    Worklog Id:     (was: 30610)
    Time Spent: 4h 10m  (was: 4h)

> Http2ConnectionState::restart_streams infinite loop causes deadlock 
> --------------------------------------------------------------------
>
>                 Key: TS-4916
>                 URL: https://issues.apache.org/jira/browse/TS-4916
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: Core, HTTP/2
>            Reporter: Gancho Tenev
>            Assignee: Gancho Tenev
>            Priority: Blocker
>             Fix For: 7.1.0
>
>          Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> Http2ConnectionState::restart_streams falls into an infinite loop while 
> holding a lock, which causes cache updates to start failing.
> The infinite loop occurs while traversing a list whose last element's “next” 
> pointer points to the element itself, so the traversal never finishes.
> {code}
> Thread 51 (Thread 0x2aaab3d04700 (LWP 34270)):
> #0  0x00002aaaaacf3fee in Http2ConnectionState::restart_streams 
> (this=0x2ae6ba5284c8) at Http2ConnectionState.cc:913
> #1  rcv_window_update_frame (cstate=..., frame=...) at 
> Http2ConnectionState.cc:627
> #2  0x00002aaaaacf9738 in Http2ConnectionState::main_event_handler 
> (this=0x2ae6ba5284c8, event=<optimized out>, edata=<optimized out>) at 
> Http2ConnectionState.cc:823
> #3  0x00002aaaaacef1c3 in Continuation::handleEvent (data=0x2aaab3d039a0, 
> event=2253, this=0x2ae6ba5284c8) at 
> ../../iocore/eventsystem/I_Continuation.h:153
> #4  send_connection_event (cont=cont@entry=0x2ae6ba5284c8, 
> event=event@entry=2253, edata=edata@entry=0x2aaab3d039a0) at 
> Http2ClientSession.cc:58
> #5  0x00002aaaaacef462 in Http2ClientSession::state_complete_frame_read 
> (this=0x2ae6ba528290, event=<optimized out>, edata=0x2aab7b237f18) at 
> Http2ClientSession.cc:426
> #6  0x00002aaaaacf0982 in Continuation::handleEvent (data=0x2aab7b237f18, 
> event=100, this=0x2ae6ba528290) at 
> ../../iocore/eventsystem/I_Continuation.h:153
> #7  Http2ClientSession::state_start_frame_read (this=0x2ae6ba528290, 
> event=<optimized out>, edata=0x2aab7b237f18) at Http2ClientSession.cc:399
> #8  0x00002aaaaacef5a3 in Continuation::handleEvent (data=0x2aab7b237f18, 
> event=100, this=0x2ae6ba528290) at 
> ../../iocore/eventsystem/I_Continuation.h:153
> #9  Http2ClientSession::state_complete_frame_read (this=0x2ae6ba528290, 
> event=<optimized out>, edata=0x2aab7b237f18) at Http2ClientSession.cc:431
> #10 0x00002aaaaacf0982 in Continuation::handleEvent (data=0x2aab7b237f18, 
> event=100, this=0x2ae6ba528290) at 
> ../../iocore/eventsystem/I_Continuation.h:153
> #11 Http2ClientSession::state_start_frame_read (this=0x2ae6ba528290, 
> event=<optimized out>, edata=0x2aab7b237f18) at Http2ClientSession.cc:399
> #12 0x00002aaaaae67e2b in Continuation::handleEvent (data=0x2aab7b237f18, 
> event=100, this=<optimized out>) at 
> ../../iocore/eventsystem/I_Continuation.h:153
> #13 read_signal_and_update (vc=0x2aab7b237e00, vc@entry=0x1, 
> event=event@entry=100) at UnixNetVConnection.cc:153
> #14 UnixNetVConnection::readSignalAndUpdate (this=this@entry=0x2aab7b237e00, 
> event=event@entry=100) at UnixNetVConnection.cc:1036
> #15 0x00002aaaaae47653 in SSLNetVConnection::net_read_io 
> (this=0x2aab7b237e00, nh=0x2aaab2409cc0, lthread=0x2aaab2406000) at 
> SSLNetVConnection.cc:595
> #16 0x00002aaaaae5558c in NetHandler::mainNetEvent (this=0x2aaab2409cc0, 
> event=<optimized out>, e=<optimized out>) at UnixNet.cc:513
> #17 0x00002aaaaae8d2e6 in Continuation::handleEvent (data=0x2aaab0bfa700, 
> event=5, this=<optimized out>) at I_Continuation.h:153
> #18 EThread::process_event (calling_code=5, e=0x2aaab0bfa700, 
> this=0x2aaab2406000) at UnixEThread.cc:148
> #19 EThread::execute (this=0x2aaab2406000) at UnixEThread.cc:275
> #20 0x00002aaaaae8c0e6 in spawn_thread_internal (a=0x2aaab0b25bb0) at 
> Thread.cc:86
> #21 0x00002aaaad6b3aa1 in start_thread (arg=0x2aaab3d04700) at 
> pthread_create.c:301
> #22 0x00002aaaae8bc93d in clone () at 
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
> {code}
> Here is the stream_list trace.
> {code}
> (gdb) thread 51
> [Switching to thread 51 (Thread 0x2aaab3d04700 (LWP 34270))]
> #0  0x00002aaaaacf3fee in Http2ConnectionState::restart_streams 
> (this=0x2ae6ba5284c8) at Http2ConnectionState.cc:913
> (gdb) trace_list stream_list
> ------- count=0 -------
> id=29
> this=0x2ae673f0c840
> next=0x2aaac05d8900
> prev=(nil)
> ------- count=1 -------
> id=27
> this=0x2aaac05d8900
> next=0x2ae5b6bbec00
> prev=0x2ae673f0c840
> ------- count=2 -------
> id=19
> this=0x2ae5b6bbec00
> next=0x2ae5b6bbec00
> prev=0x2aaac05d8900
> ------- count=3 -------
> id=19
> this=0x2ae5b6bbec00
> next=0x2ae5b6bbec00
> prev=0x2aaac05d8900
> . . . 
> ------- count=5560 -------
> id=19
> this=0x2ae5b6bbec00
> next=0x2ae5b6bbec00
> prev=0x2aaac05d8900
> . . .
> {code}
> I am currently working on finding out how the list got into this 
> “impossible” (broken) state and, eventually, on coming up with a fix.
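
As a debugging aid, the self-referencing "next" pointer seen in the 
stream_list trace (id=19 pointing at itself) is the kind of corruption a 
cycle check can catch before a traversal spins forever. Below is a minimal 
sketch using Floyd's tortoise-and-hare technique. Node and has_cycle are 
hypothetical names for illustration and are not part of the Traffic Server 
code base.

```cpp
#include <cstddef>

// Hypothetical singly linked view of a stream list node (illustrative only).
struct Node {
  int id;
  Node *next = nullptr;
  explicit Node(int i) : id(i) {}
};

// Floyd's cycle detection: advance one pointer by one step and another by
// two steps. If the list ends (nullptr) there is no cycle; if the two
// pointers ever meet, some "next" pointer leads back into the list.
bool has_cycle(const Node *head) {
  const Node *slow = head;
  const Node *fast = head;
  while (fast != nullptr && fast->next != nullptr) {
    slow = slow->next;
    fast = fast->next->next;
    if (slow == fast) {
      return true;
    }
  }
  return false;
}
```

Building three nodes with ids 29, 27 and 19 and setting the last node's next 
to itself reproduces the shape in the trace, and has_cycle reports it in a 
bounded number of steps. Such a check could sit behind a debug assertion, 
since it costs an extra pass over the list.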



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
