[
https://issues.apache.org/jira/browse/TS-4215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15173144#comment-15173144
]
taoyunxing commented on TS-4215:
--------------------------------
I turned on http diags, observed and found that http chokes may occur on all
the following places on the worst case:
cache lookup, send request , receive response, http tunnel, ...
-------------------------------------------------------------------------------------------------------------------------------------------------------
112552 [Feb 23 10:19:44.415] Server {0x2b1b75e1d700} DEBUG:
<HttpTransact.cc:1921 (DecideCacheLookup)> (http_trans) Next action
SM_ACTION_CACHE_LOOKUP; __null
112553 [Feb 23 10:19:44.418] Server {0x2b1b75e1d700} DEBUG: <HttpSM.cc:6850
(call_transact_and_set_next_state)> (http) [24146] State Transition:
SM_ACTION_API_POST_REMAP -> SM_ACTION_CACHE_LOOKUP
112554 [Feb 23 10:19:44.419] Server {0x2b1b75e1d700} DEBUG: <HttpSM.cc:4384
(do_cache_lookup_and_read)> (http_seq) [HttpSM::do_cache_lookup_and_read]
[24146] Issuing cache lookup for URL http://www.taobao.com/
116043 [Feb 23 10:19:47.402] Server {0x2b1b75e1d700} DEBUG:
<HttpCacheSM.cc:102 (state_cache_open_read)> (http_cache) [24146]
[&HttpCacheSM::state_cache_open_read, CACHE_EVENT_OPEN_READ_FAILED]
116044 [Feb 23 10:19:47.403] Server {0x2b1b75e1d700} DEBUG: <HttpSM.cc:2532
(main_handler)> (http) [24146] [HttpSM::main_handler,
CACHE_EVENT_OPEN_READ_FAILED]
116045 [Feb 23 10:19:47.403] Server {0x2b1b75e1d700} DEBUG: <HttpSM.cc:2464
(state_cache_open_read)> (http) [24146] [&HttpSM::state_cache_open_read,
CACHE_EVENT_OPEN_READ_FAILED]
116046 [Feb 23 10:19:47.403] Server {0x2b1b75e1d700} DEBUG: <HttpSM.cc:2499
(state_cache_open_read)> (http) [24146] cache_open_read -
CACHE_EVENT_OPEN_READ_FAILED
116047 [Feb 23 10:19:47.405] Server {0x2b1b75e1d700} DEBUG: <HttpSM.cc:2500
(state_cache_open_read)> (http) [state_cache_open_read] open read failed.
116048 [Feb 23 10:19:47.406] Server {0x2b1b75e1d700} DEBUG:
<HttpTransact.cc:2133 (HandleCacheOpenRead)> (http_trans)
[HttpTransact::HandleCacheOpenRead]
116049 [Feb 23 10:19:47.406] Server {0x2b1b75e1d700} DEBUG:
<HttpTransact.cc:2163 (HandleCacheOpenRead)> (http_trans) CacheOpenRead -- miss
116050 [Feb 23 10:19:47.407] Server {0x2b1b75e1d700} DEBUG:
<HttpTransact.cc:2170 (HandleCacheOpenRead)> (http_trans) Next action
SM_ACTION_DNS_LOOKUP; OSDNSLookup
116051 [Feb 23 10:19:47.407] Server {0x2b1b75e1d700} DEBUG: <HttpSM.cc:6850
(call_transact_and_set_next_state)> (http) [24146] State Transition:
SM_ACTION_CACHE_LOOKUP -> SM_ACTION_DNS_LOOKUP
116052 [Feb 23 10:19:47.408] Server {0x2b1b75e1d700} DEBUG: <HttpSM.cc:3986
(do_hostdb_lookup)> (http_seq) [HttpSM::do_hostdb_lookup] Doing DNS Lookup
116053 [Feb 23 10:19:47.409] Server {0x2b1b75e1d700} DEBUG:
<HttpTransact.cc:8246 (ink_cluster_time)> (http_trans) [ink_cluster_time]
local: 1456193987, highest_delta: 0, cluster: 1456193987
116077 [Feb 23 10:19:47.423] Server {0x2b1b75e1d700} DEBUG: <HttpSM.cc:4559
(do_http_server_open)> (http_seq) [HttpSM::do_http_server_open] Sending request
to server
116078 [Feb 23 10:19:47.424] Server {0x2b1b75e1d700} DEBUG: <HttpSM.cc:4775
(do_http_server_open)> (http) calling netProcessor.connect_re
116079 [Feb 23 10:19:47.427] Server {0x2b1b75e1d700} DEBUG: <HttpSM.cc:2532
(main_handler)> (http) [24146] [HttpSM::main_handler, NET_EVENT_OPEN]
116080 [Feb 23 10:19:47.427] Server {0x2b1b75e1d700} DEBUG: <HttpSM.cc:1606
(state_http_server_open)> (http_track) entered inside state_http_server_open
116081 [Feb 23 10:19:47.430] Server {0x2b1b75e1d700} DEBUG: <HttpSM.cc:1607
(state_http_server_open)> (http) [24146] [&HttpSM::state_http_server_open,
NET_EVENT_OPEN]
116082 [Feb 23 10:19:47.430] Server {0x2b1b75e1d700} DEBUG:
<HttpServerSession.cc:91 (new_connection)> (http_ss) [37085] session born,
netvc 0x2b1ded0e44c0
117130 [Feb 23 10:19:48.438] Server {0x2b1b75e1d700} DEBUG: <HttpSM.cc:2532
(main_handler)> (http) [24146] [HttpSM::main_handler, VC_EVENT_WRITE_COMPLETE]
117131 [Feb 23 10:19:48.438] Server {0x2b1b75e1d700} DEBUG: <HttpSM.cc:1872
(state_send_server_request_header)> (http) [24146]
[&HttpSM::state_send_server_request_header, VC_EVENT_WRITE_COMPLETE]
117408 [Feb 23 10:19:48.640] Server {0x2b1b75e1d700} DEBUG: <HttpSM.cc:2532
(main_handler)> (http) [24146] [HttpSM::main_handler, VC_EVENT_READ_READY]
117409 [Feb 23 10:19:48.642] Server {0x2b1b75e1d700} DEBUG: <HttpSM.cc:1703
(state_read_server_response_header)> (http) [24146]
[&HttpSM::state_read_server_response_header, VC_EVENT_READ_READY]
117410 [Feb 23 10:19:48.643] Server {0x2b1b75e1d700} DEBUG: <HttpSM.cc:1816
(state_read_server_response_header)> (http_seq) Done parsing server response
header
117411 [Feb 23 10:19:48.643] Server {0x2b1b75e1d700} DEBUG:
<HttpTunnel.cc:1669 (deallocate_redirect_postdata_buffers)> (http_redirect)
[HttpTunnel::deallocate_postdata_copy_buffers]
117412 [Feb 23 10:19:48.643] Server {0x2b1b75e1d700} DEBUG:
<HttpTransact.cc:3222 (HandleResponse)> (http_trans)
[HttpTransact::HandleResponse]
117413 [Feb 23 10:19:48.643] Server {0x2b1b75e1d700} DEBUG:
<HttpTransact.cc:3223 (HandleResponse)> (http_seq)
[HttpTransact::HandleResponse] Response received
117414 [Feb 23 10:19:48.643] Server {0x2b1b75e1d700} DEBUG:
<HttpTransact.cc:8246 (ink_cluster_time)> (http_trans) [ink_cluster_time]
local: 1456193988, highest_delta: 0, cluster: 1456193988
117415 [Feb 23 10:19:48.643] Server {0x2b1b75e1d700} DEBUG:
<HttpTransact.cc:3230 (HandleResponse)> (http_trans) [HandleResponse]
response_received_time: 1456193988
117451 [Feb 23 10:19:48.644] Server {0x2b1b75e1d700} DEBUG: <HttpSM.cc:5470
(perform_cache_write_action)> (http) [24146] perform_cache_write_action
CACHE_DO_NO_ACTION
117452 [Feb 23 10:19:48.644] Server {0x2b1b75e1d700} DEBUG: <HttpTunnel.cc:730
(tunnel_run)> (http_tunnel) tunnel_run started, p_arg is provided
117453 [Feb 23 10:19:48.644] Server {0x2b1b75e1d700} DEBUG:
<HttpClientSession.cc:222 (do_io_write)> (http_cs) tcp_init_cwnd_set 0
117454 [Feb 23 10:19:48.644] Server {0x2b1b75e1d700} DEBUG:
<HttpClientSession.cc:234 (set_tcp_init_cwnd)> (http_cs) desired TCP congestion
window is 0
117455 [Feb 23 10:19:48.644] Server {0x2b1b75e1d700} DEBUG: <HttpTunnel.cc:988
(producer_run)> (http_tunnel) [24146] [tunnel_run] producer already done
117456 [Feb 23 10:19:48.644] Server {0x2b1b75e1d700} DEBUG:
<HttpTunnel.cc:1106 (producer_handler)> (http_tunnel) [24146] producer_handler
[http server HTTP_TUNNEL_EVENT_PRECOMPLETE]
117457 [Feb 23 10:19:48.644] Server {0x2b1b75e1d700} DEBUG:
<HttpTunnel.cc:1146 (producer_handler)> (http_redirect)
[HttpTunnel::producer_handler] enable_redirection: [0 0 0] event: 2302
117458 [Feb 23 10:19:48.644] Server {0x2b1b75e1d700} DEBUG: <HttpSM.cc:2846
(tunnel_handler_server)> (http) [24146] [&HttpSM::tunnel_handler_server,
HTTP_TUNNEL_EVENT_PRECOMPLETE]
117459 [Feb 23 10:19:48.644] Server {0x2b1b75e1d700} DEBUG:
<HttpServerSession.cc:161 (release)> (http_ss) Releasing session,
private_session=0, sharing_match=1
117460 [Feb 23 10:19:48.644] Server {0x2b1b75e1d700} DEBUG:
<HttpSessionManager.cc:133 (releaseSession)> (http_ss) [37085] [release
session] session placed into shared pool
117630 [Feb 23 10:19:48.789] Server {0x2b1b75e1d700} DEBUG:
<HttpTunnel.cc:1306 (consumer_handler)> (http_tunnel) [24146] consumer_handler
[user agent VC_EVENT_WRITE_COMPLETE]
117631 [Feb 23 10:19:48.789] Server {0x2b1b75e1d700} DEBUG: <HttpSM.cc:3086
(tunnel_handler_ua)> (http) [24146] [&HttpSM::tunnel_handler_ua,
VC_EVENT_WRITE_COMPLETE]
117632 [Feb 23 10:19:48.789] Server {0x2b1b75e1d700} DEBUG:
<HttpClientSession.cc:478 (release)> (http_cs) [19752] session released by sm
[24146]
117633 [Feb 23 10:19:48.791] Server {0x2b1b75e1d700} DEBUG:
<HttpClientSession.cc:222 (do_io_write)> (http_cs) tcp_init_cwnd_set 1
117634 [Feb 23 10:19:48.792] Server {0x2b1b75e1d700} DEBUG:
<HttpClientSession.cc:507 (release)> (http_cs) [19752] initiating io for next
header
117635 [Feb 23 10:19:48.793] Server {0x2b1b75e1d700} DEBUG: <HttpSM.cc:2532
(main_handler)> (http) [24146] [HttpSM::main_handler, HTTP_TUNNEL_EVENT_DONE]
117636 [Feb 23 10:19:48.793] Server {0x2b1b75e1d700} DEBUG: <HttpSM.cc:2785
(tunnel_handler)> (http) [24146] [&HttpSM::tunnel_handler,
HTTP_TUNNEL_EVENT_DONE]
117637 [Feb 23 10:19:48.793] Server {0x2b1b75e1d700} DEBUG:
<HttpTunnel.cc:1669 (deallocate_redirect_postdata_buffers)> (http_redirect)
[HttpTunnel::deallocate_postdata_copy_buffers]
117638 [Feb 23 10:19:48.793] Server {0x2b1b75e1d700} DEBUG:
<HttpTunnel.cc:1669 (deallocate_redirect_postdata_buffers)> (http_redirect)
[HttpTunnel::deallocate_postdata_copy_buffers]
117639 [Feb 23 10:19:48.795] Server {0x2b1b75e1d700} DEBUG: <HttpSM.cc:6564
(kill_this)> (http_seq) [HttpSM::update_stats] Logging transaction
117640 [Feb 23 10:19:48.796] Server {0x2b1b75e1d700} DEBUG: <HttpSM.cc:6591
(kill_this)> (http) [24146] deallocating sm
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
I wonder what chokes the above process flow? I have no clue for this, just see
them happen over and over again
> ATS choke frequently in production host
> ---------------------------------------
>
> Key: TS-4215
> URL: https://issues.apache.org/jira/browse/TS-4215
> Project: Traffic Server
> Issue Type: Bug
> Components: Core
> Affects Versions: 5.3.1
> Reporter: taoyunxing
> Fix For: 6.1.2
>
>
> During this period, I am suffering from great pressures on my job, because
> ATS chokes frequently in production environment, the QoS is not guaranteed. I
> try many ats version, such as 5.3.x, 6.0.0, 6.1.1, etc, but all have the same
> one.
> HardWare:
> CentOS 6.3/6.6/6.7, CPU 8Cores, Memory 64GB, HardDisk: 300GB sys + 24TB data,
> ATS records.config:
> CONFIG proxy.config.cache.min_average_object_size INT 1048576
> CONFIG proxy.config.cache.ram_cache.algorithm INT 1
> CONFIG proxy.config.cache.ram_cache_cutoff INT 4194304
> CONFIG proxy.config.cache.ram_cache.size INT 32G
> Scenario:
> curl -vo /dev/null -x 127.0.0.1:8081 -H 'X-Debug:
> via,x-cache,x-cache-key,x-milestones' 'http://www.taobao.com/'
> * About to connect() to proxy 127.0.0.1 port 8081 (#0)
> * Trying 127.0.0.1... connected
> * Connected to 127.0.0.1 (127.0.0.1) port 8081 (#0)
> > GET http://www.taobao.com/ HTTP/1.1
> > User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.19.1
> > Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2
> > Host: www.taobao.com
> > Accept: */*
> > Proxy-Connection: Keep-Alive
> > X-Debug: via,x-cache,x-cache-key,x-milestones
> >
> % Total % Received % Xferd Average Speed Time Time Time
> Current
> Dload Upload Total Spent Left Speed
> 0 0 0 0 0 0 0 0 --:--:-- 0:00:15 --:--:--
> 0< HTTP/1.1 302 Found
> < Server: ats/1.5.0
> < Date: Thu, 18 Feb 2016 05:53:59 GMT
> < Content-Type: text/html
> < Content-Length: 258
> < Location: https://www.taobao.com/
> < Set-Cookie: thw=cn; Path=/; Domain=.taobao.com; Expires=Fri, 17-Feb-17
> 05:53:59 GMT;
> < Strict-Transport-Security: max-age=31536000
> < Age: 0
> < Proxy-Connection: keep-alive
> < Via: http/1.1 ltgj51 (ats [uScMsSf pSeN:t cCMi p sS])
> < X-Cache-Key: http://www.taobao.com/
> < X-Cache: miss
> < X-Milestones: DNS-LOOKUP-END=0.000064880, DNS-LOOKUP-BEGIN=0.000064880,
> CACHE-OPEN-WRITE-END=0.000084858, CACHE-OPEN-WRITE-BEGIN=0.000077224,
> CACHE-OPEN-READ-END=0.000064880, CACHE-OPEN-READ-BEGIN=0.000029354,
> SERVER-READ-HEADER-DONE=0.053633813, SERVER-FIRST-READ=0.053633813,
> SERVER-BEGIN-WRITE=0.000115707, SERVER-CONNECT-END=0.000115707,
> SERVER-CONNECT=0.000087664, SERVER-FIRST-CONNECT=0.000087664,
> UA-BEGIN-WRITE=0.053633813, UA-READ-HEADER-DONE=0.000029354,
> UA-BEGIN=0.000000000
> <
> { [data not shown]
> 129 258 129 258 0 0 17 0 0:00:15 0:00:15 --:--:--
> 4777* Connection #0 to host 127.0.0.1 left intact
> * Closing connection #0
> Problem:
> ats may choke for some seconds before the server response returned,ten times
> maybe occur one time,and I don't know why ?
> here, I use X-Debug plugin to show each milestone(in second) of current http
> request
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)