[ 
https://issues.apache.org/jira/browse/TS-4838?focusedWorklogId=31730&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-31730
 ]

ASF GitHub Bot logged work on TS-4838:
--------------------------------------

                Author: ASF GitHub Bot
            Created on: 07/Nov/16 17:16
            Start Date: 07/Nov/16 17:16
    Worklog Time Spent: 10m 
      Work Description: GitHub user PSUdaemon opened a pull request:

    https://github.com/apache/trafficserver/pull/1206

    TS-4838: CONNECT requests get forgotten across threads.

    What happens here is that ProxyClientTransaction::adjust_thread
    reschedules the transaction onto a new thread at the start of
    HttpSM::do_http_server_open.
    
    Unfortunately, at this point the default handler is
    HttpSM::state_raw_http_server_open. When the transaction is
    rescheduled, the default handler runs, and receives the EVENT_INTERVAL
    that it so fortuitously logs an error for. We have never actually
    completed do_http_server_open, so we never make any more progress
    on this transaction.
    
    (cherry picked from commit 8fddd77c085d1a64f11de61bb42a50562cd23229)
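
The failure mode described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToySM`, `Event`, `retry_on_interval` and `run_scenario` are hypothetical names invented for illustration, not Traffic Server APIs. Retrying the open when the interval event arrives is shown as one plausible remedy, not necessarily what this pull request does.

```cpp
// Toy model of a state machine that is rescheduled before its "open" step
// has done any work. All names are hypothetical, not Traffic Server APIs.
#include <cassert>
#include <deque>
#include <string>
#include <vector>

enum class Event { Interval, NetOpen };

struct ToySM {
  using Handler = void (ToySM::*)(Event);

  Handler default_handler = nullptr;
  std::deque<Event> pending;      // stands in for the new thread's event queue
  std::vector<std::string> log;   // collected diagnostics
  bool on_right_thread = false;
  bool open_completed = false;
  bool retry_on_interval;         // false models the bug, true a possible remedy

  explicit ToySM(bool retry) : retry_on_interval(retry) {}

  // Models the open step: it may reschedule itself before doing any work.
  void do_server_open() {
    if (!on_right_thread) {
      on_right_thread = true;
      pending.push_back(Event::Interval);  // the reschedule delivers this later
      return;                              // nothing has been opened yet
    }
    open_completed = true;
    log.push_back("server open completed");
  }

  // Models the default handler that is installed when the reschedule happens.
  void state_raw_server_open(Event e) {
    if (e != Event::Interval) {
      return;  // other events are irrelevant to this sketch
    }
    if (retry_on_interval) {
      do_server_open();  // go back and do the work that was skipped
    } else {
      log.push_back("ERROR: unexpected EVENT_INTERVAL");  // transaction stalls
    }
  }

  void start() {
    default_handler = &ToySM::state_raw_server_open;
    do_server_open();
  }

  // Deliver every queued event to whatever handler is currently installed.
  void drain() {
    assert(default_handler != nullptr);
    while (!pending.empty()) {
      Event e = pending.front();
      pending.pop_front();
      (this->*default_handler)(e);
    }
  }
};

// Returns true if the modelled transaction finished opening its server side.
bool run_scenario(bool retry) {
  ToySM sm(retry);
  sm.start();
  sm.drain();
  return sm.open_completed;
}
```

In this model, `run_scenario(false)` leaves the open step unfinished and only records an error, which mirrors the stalled transaction described above, while `run_scenario(true)` completes it.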

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/PSUdaemon/trafficserver bp-1002

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/trafficserver/pull/1206.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1206
    
----
commit c5ab2e686ac0dad4ebe89573cdcc0b2d2a6359a4
Author: James Peach <jpe...@apache.org>
Date:   2016-09-09T22:29:05Z

    TS-4838: CONNECT requests get forgotten across threads.
    
    What happens here is that ProxyClientTransaction::adjust_thread
    reschedules the transaction onto a new thread at the start of
    HttpSM::do_http_server_open.
    
    Unfortunately, at this point the default handler is
    HttpSM::state_raw_http_server_open. When the transaction is
    rescheduled, the default handler runs, and receives the EVENT_INTERVAL
    that it so fortuitously logs an error for. We have never actually
    completed do_http_server_open, so we never make any more progress
    on this transaction.
    
    (cherry picked from commit 8fddd77c085d1a64f11de61bb42a50562cd23229)

----


Issue Time Tracking
-------------------

    Worklog Id:     (was: 31730)
    Time Spent: 1h  (was: 50m)

> After TS-3612 restructuring, very slow SSL sessions and 
> HttpSM::state_raw_http_server_open errors
> -------------------------------------------------------------------------------------------------
>
>                 Key: TS-4838
>                 URL: https://issues.apache.org/jira/browse/TS-4838
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: Core, SSL
>    Affects Versions: 6.2.0, 7.0.0
>         Environment: CentOS/RHEL 7.2, x86_64
>            Reporter: Dimitry Andric
>            Assignee: James Peach
>             Fix For: 7.0.0
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> We have been using TrafficServer 5.3.2 for quite some time now, for forward 
> proxying of a number of different HTML5 applications, one of the most 
> important ones being YouTube's TV interface, e.g. https://youtube.com/tv.  
> This is all hosted on CentOS 7.2 x86_64 machines.
> We recently upgraded to 6.2.0, and then started having problems with the 
> CONNECT requests for port 443 which are generated by the YouTube app.  It 
> seems like these connections are "stalled" somehow, sometimes for >10 
> seconds.  Meanwhile, {{diags.log}} is getting spammed with lots of the following:
> {noformat}
> [Sep  9 16:45:47.683] Server {0x2b3e50c0b700} ERROR: [HttpSM::state_raw_http_server_open] event: EVENT_INTERVAL state: 0 server_entry: (nil)
> {noformat}
> Requests that seem to stall are most likely all of the CONNECT kind, e.g.:
> {noformat}
> 1473432382.474 30405 127.0.0.1 TCP_MISS/200 4916 CONNECT ad.doubleclick.net:443/ - DIRECT/ad.doubleclick.net -
> 1473432382.481 30411 127.0.0.1 TCP_MISS/200 54024 CONNECT i9.ytimg.com:443/ - DIRECT/i9.ytimg.com -
> 1473432382.486 30417 127.0.0.1 TCP_MISS/200 5389 CONNECT pagead2.googlesyndication.com:443/ - DIRECT/pagead2.googlesyndication.com -
> 1473432390.451 42772 127.0.0.1 TCP_MISS/200 5198 CONNECT csi.gstatic.com:443/ - DIRECT/csi.gstatic.com -
> 1473432390.459 43833 127.0.0.1 TCP_MISS/200 11610 CONNECT www.youtube.com:443/ - DIRECT/www.youtube.com -
> 1473432390.483 38414 127.0.0.1 TCP_MISS/200 2870983 CONNECT r17---sn-5hnednl7.googlevideo.com:443/ - DIRECT/r17---sn-5hnednl7.googlevideo.com -
> {noformat}
> As part of figuring out how to diagnose this, I tried a downgrade to 
> TrafficServer 6.1.1, and this made all the stalling and problems disappear.  
> Afterwards, I did a {{git bisect}} on master, from the branch point of 6.1 to 
> the branch point of 6.2, and I ended up at [commit 
> af76977|https://git-dual.apache.org/repos/asf?p=trafficserver.git;a=commit;h=af76977adb9f3c0296a232688bbcb5a1421a6768]:
> {quote}
> Author: Susan Hinrichs <shinr...@draggingnagging.corp.ne1.yahoo.com>
> Date:   Wed Apr 13 19:57:39 2016 +0000
>     TS-3612: Restructure client session and transaction processing. This 
> closes #570.
> {quote}
> Unfortunately, this is quite a big refactoring commit, so it is not possible 
> to revert it individually to see whether it improves things.
> I read TS-3612 and #570, and I saw there were also a number of follow-up 
> commits to fix various problems with it, but this particular problem of 
> stalled SSL connections is still occurring with master as of today, 
> 2016-09-09.
> I realize that this report is still missing reproduction details, since it is 
> tricky to analyze what the YouTube app is doing, and simple {{curl https://}} 
> tests appear to go fast, and don't seem to trigger any stalling.  But YouTube 
> itself is pretty easy to try out, I think.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
