[ https://issues.apache.org/jira/browse/TS-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phil Sorber updated TS-4838:
----------------------------
    Backport to Version:   (was: 6.2.1)

> After TS-3612 restructuring, very slow SSL sessions and 
> HttpSM::state_raw_http_server_open errors
> -------------------------------------------------------------------------------------------------
>
>                 Key: TS-4838
>                 URL: https://issues.apache.org/jira/browse/TS-4838
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: Core, SSL
>    Affects Versions: 6.2.0, 7.0.0
>         Environment: CentOS/RHEL 7.2, x86_64
>            Reporter: Dimitry Andric
>            Assignee: James Peach
>             Fix For: 6.2.1, 7.0.0
>
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> We have been using TrafficServer 5.3.2 for quite some time now, for forward 
> proxying of a number of different HTML5 applications, one of the most 
> important ones being YouTube's TV interface, e.g. https://youtube.com/tv.  
> This is all hosted on CentOS 7.2 x86_64 machines.
> We recently upgraded to 6.2.0, and then started having problems with the 
> CONNECT requests for port 443 which are generated by the YouTube app.  It 
> seems like these connections are "stalled" somehow, sometimes for >10 
> seconds.  Meanwhile, {{diags.log}} is getting spammed with lots of the following:
> {noformat}
> [Sep  9 16:45:47.683] Server {0x2b3e50c0b700} ERROR: [HttpSM::state_raw_http_server_open] event: EVENT_INTERVAL state: 0 server_entry: (nil)
> {noformat}
> Requests that seem to stall are most likely all of the CONNECT kind, e.g.:
> {noformat}
> 1473432382.474 30405 127.0.0.1 TCP_MISS/200 4916 CONNECT ad.doubleclick.net:443/ - DIRECT/ad.doubleclick.net -
> 1473432382.481 30411 127.0.0.1 TCP_MISS/200 54024 CONNECT i9.ytimg.com:443/ - DIRECT/i9.ytimg.com -
> 1473432382.486 30417 127.0.0.1 TCP_MISS/200 5389 CONNECT pagead2.googlesyndication.com:443/ - DIRECT/pagead2.googlesyndication.com -
> 1473432390.451 42772 127.0.0.1 TCP_MISS/200 5198 CONNECT csi.gstatic.com:443/ - DIRECT/csi.gstatic.com -
> 1473432390.459 43833 127.0.0.1 TCP_MISS/200 11610 CONNECT www.youtube.com:443/ - DIRECT/www.youtube.com -
> 1473432390.483 38414 127.0.0.1 TCP_MISS/200 2870983 CONNECT r17---sn-5hnednl7.googlevideo.com:443/ - DIRECT/r17---sn-5hnednl7.googlevideo.com -
> {noformat}
> As part of figuring out how to diagnose this, I tried a downgrade to 
> TrafficServer 6.1.1, and this made all the stalling and problems disappear.  
> Afterwards, I did a {{git bisect}} on master, from the branch point of 6.1 to 
> the branch point of 6.2, and I ended up at [commit 
> af76977|https://git-dual.apache.org/repos/asf?p=trafficserver.git;a=commit;h=af76977adb9f3c0296a232688bbcb5a1421a6768]:
> {quote}
> Author: Susan Hinrichs <shinr...@draggingnagging.corp.ne1.yahoo.com>
> Date:   Wed Apr 13 19:57:39 2016 +0000
>     TS-3612: Restructure client session and transaction processing. This 
> closes #570.
> {quote}
> Unfortunately, this is quite a big refactoring commit, so it is not possible 
> to revert it on its own to see whether that improves things.
> I read TS-3612 and #570, and I saw there were also a number of follow-up 
> commits to fix various problems with it, but this particular problem of 
> stalled SSL connections is still occurring with master as of today, 
> 2016-09-09.
> I realize that this report is still missing reproduction details, since it is 
> tricky to analyze what the YouTube app is doing, and simple {{curl https://}} 
> tests appear to go fast, and don't seem to trigger any stalling.  But YouTube 
> itself is pretty easy to try out, I think.
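
For anyone trying to confirm the same symptom on their own proxy, the stalled CONNECT entries quoted above can be picked out of the access log with a small filter. The following is only a minimal sketch, assuming the log uses the Squid-style layout shown in the report (epoch timestamp, elapsed milliseconds, client IP, result/status, bytes, method, URL) with one entry per line. The function name `slow_connects` and the 10-second default threshold are hypothetical choices made for illustration, not part of Traffic Server.

```python
import re
from typing import Iterable, List, Tuple

# Assumed Squid-style layout, one entry per line:
#   <epoch.ms> <elapsed ms> <client> <result/status> <bytes> <method> <url> ...
LINE_RE = re.compile(
    r"^(?P<ts>\d+\.\d+)\s+(?P<elapsed_ms>\d+)\s+(?P<client>\S+)\s+"
    r"(?P<result>\S+)\s+(?P<size>\d+)\s+(?P<method>\S+)\s+(?P<url>\S+)"
)


def slow_connects(
    lines: Iterable[str], threshold_ms: int = 10_000
) -> List[Tuple[int, str]]:
    """Return (elapsed_ms, url) for CONNECT entries slower than threshold_ms.

    The result is sorted with the slowest request first. Lines that do not
    match the assumed layout are skipped silently.
    """
    hits = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None or m["method"] != "CONNECT":
            continue
        elapsed = int(m["elapsed_ms"])
        if elapsed > threshold_ms:
            hits.append((elapsed, m["url"]))
    return sorted(hits, reverse=True)


if __name__ == "__main__":
    # Two entries copied from the report above, joined onto single lines.
    sample = [
        "1473432382.474 30405 127.0.0.1 TCP_MISS/200 4916 CONNECT "
        "ad.doubleclick.net:443/ - DIRECT/ad.doubleclick.net -",
        "1473432390.459 43833 127.0.0.1 TCP_MISS/200 11610 CONNECT "
        "www.youtube.com:443/ - DIRECT/www.youtube.com -",
    ]
    for elapsed, url in slow_connects(sample):
        print(f"{elapsed / 1000:.1f}s  {url}")
```

To run it against a real log, pass an open file object instead of the sample list, since the function accepts any iterable of lines. Long-lived tunnels such as video streams will also show large elapsed times, so a high value alone does not prove a stall.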



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
