[
https://issues.apache.org/jira/browse/TS-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
James Peach resolved TS-4838.
-----------------------------
Resolution: Fixed
> After TS-3612 restructuring, very slow SSL sessions and
> HttpSM::state_raw_http_server_open errors
> -------------------------------------------------------------------------------------------------
>
> Key: TS-4838
> URL: https://issues.apache.org/jira/browse/TS-4838
> Project: Traffic Server
> Issue Type: Bug
> Components: Core, SSL
> Affects Versions: 6.2.0, 7.0.0
> Environment: CentOS/RHEL 7.2, x86_64
> Reporter: Dimitry Andric
> Assignee: James Peach
> Fix For: 7.0.0
>
> Time Spent: 50m
> Remaining Estimate: 0h
>
> We have been using TrafficServer 5.3.2 for quite some time now, for forward
> proxying of a number of different HTML5 applications, one of the most
> important ones being YouTube's TV interface, e.g. https://youtube.com/tv.
> This is all hosted on CentOS 7.2 x86_64 machines.
> We recently upgraded to 6.2.0, and then started having problems with the
> CONNECT requests for port 443 which are generated by the YouTube app. It
> seems like these connections are "stalled" somehow, sometimes for >10
> seconds. Meanwhile, {{diags.log}} is getting spammed lots of the following:
> {noformat}
> [Sep 9 16:45:47.683] Server {0x2b3e50c0b700} ERROR:
> [HttpSM::state_raw_http_server_open] event: EVENT_INTERVAL state: 0
> server_entry: (nil)
> {noformat}
> Requests that seem to stall are most likely all of the CONNECT kind, e.g.:
> {noformat}
> 1473432382.474 30405 127.0.0.1 TCP_MISS/200 4916 CONNECT
> ad.doubleclick.net:443/ - DIRECT/ad.doubleclick.net -
> 1473432382.481 30411 127.0.0.1 TCP_MISS/200 54024 CONNECT i9.ytimg.com:443/ -
> DIRECT/i9.ytimg.com -
> 1473432382.486 30417 127.0.0.1 TCP_MISS/200 5389 CONNECT
> pagead2.googlesyndication.com:443/ - DIRECT/pagead2.googlesyndication.com -
> 1473432390.451 42772 127.0.0.1 TCP_MISS/200 5198 CONNECT csi.gstatic.com:443/
> - DIRECT/csi.gstatic.com -
> 1473432390.459 43833 127.0.0.1 TCP_MISS/200 11610 CONNECT
> www.youtube.com:443/ - DIRECT/www.youtube.com -
> 1473432390.483 38414 127.0.0.1 TCP_MISS/200 2870983 CONNECT
> r17---sn-5hnednl7.googlevideo.com:443/ -
> DIRECT/r17---sn-5hnednl7.googlevideo.com -
> {noformat}
> As part of figuring out how to diagnose this, I tried a downgrade to
> TrafficServer 6.1.1, and this made all the stalling and problems disappear.
> Afterwards, I did a {{git bisect}} on master, from the branch point of 6.1 to
> the branch point of 6.2, and I ended up at [commit
> af76977|https://git-dual.apache.org/repos/asf?p=trafficserver.git;a=commit;h=af76977adb9f3c0296a232688bbcb5a1421a6768]:
> {quote}
> Author: Susan Hinrichs <[email protected]>
> Date: Wed Apr 13 19:57:39 2016 +0000
> TS-3612: Restructure client session and transaction processing. This
> closes #570.
> {quote}
> Unfortunately, this is a quite big refactoring commit, so it is not possible
> to revert it individually to see whether it improves things.
> I read TS-3612 and #570, and I saw there were also a number of follow-up
> commits to fix various problems with it, but this particular problem of
> stalled SSL connections is still occurring with master as of today,
> 2016-09-09.
> I realize that this report is still missing reproduction details, since it is
> tricky to analyze what the YouTube app is doing, and simple {{curl https://}}
> tests appear to go fast, and don't seem to trigger any stalling. But YouTube
> itself is pretty easy to try out, I think.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)