Re: ATS slow after ERR_CONNECT_FAIL
exit any private browsing On Tue, Mar 6, 2018 at 5:14 AM, Veiko Kukkwrote: > Hi again, > > This list seems really quiet. > Meanwhile, I've enabled slow log which brings in strange results that I > can't explain. Maybe somebody else can explain it. > > status: 200 unique id: redirection_tries: 0 bytes: 20 fd: 0 client state: > 0 server state: 9 ua_begin: 0.000 ua_first_read: 0.000 ua_read_header_done: > 0.000 cache_open_read_begin: 0.006 cache_open_read_end: 0.920 > dns_lookup_begin: 0.920 dns_lookup_end: 0.920 server_connect: 4.204 > server_connect_end: 4.204 server_first_read: 43.955 > server_read_header_done: 43.955 server_close: 43.957 ua_write: 43.957 > ua_close: 43.957 sm_finish: 43.957 plugin_active: 0.000 plugin_total: 0.000 > status: 200 unique id: redirection_tries: 0 bytes: 20 fd: 0 client state: > 0 server state: 9 ua_begin: 0.000 ua_first_read: 0.000 ua_read_header_done: > 0.000 cache_open_read_begin: 0.000 cache_open_read_end: 0.000 > dns_lookup_begin: 0.000 dns_lookup_end: 0.000 server_connect: 2.384 > server_connect_end: 2.384 server_first_read: 76.550 > server_read_header_done: 76.550 server_close: 76.553 ua_write: 76.553 > ua_close: 76.553 sm_finish: 76.553 plugin_active: 0.000 plugin_total: 0.000 > > 1) Why there is cache read when cache.config has url_regex=that_url_regex > action=never-cache If the url matches never cache action, why is it looked > up at all in the beginning? > 2) What's going on between dns_lookup_end and server_connect? Where is the > time spent? What timeout setting should apply to server_connect? Shouldn't > it be proxy.config.http.connect_attempts_timeout which is set to 2 > seconds. > 3) What timeout applies to time between server_connect_end > and server_first_read? Isn't it > proxy.config.http.transaction_no_activity_timeout_out > which on our case is set to 6 seconds. > > Current config timeouts: > > proxy.config.http.transaction_no_activity_timeout_out: 6 > proxy.config.http.connect_attempts_timeout: 2 > proxy.config.lm.pserver_timeout_secs: 1 > proxy.config.lm.pserver_timeout_msecs: 0 > proxy.config.cluster.peer_timeout: 30 > proxy.config.cluster.mc_poll_timeout: 5 > proxy.config.cluster.startup_timeout: 10 > proxy.config.process_manager.timeout: 5 > proxy.config.vmap.down_up_timeout: 10 > proxy.config.http.congestion_control.default.live_os_conn_timeout: 60 > proxy.config.http.congestion_control.default.dead_os_conn_timeout: 15 > proxy.config.http.parent_proxy.connect_attempts_timeout: 30 > proxy.config.http.keep_alive_no_activity_timeout_in: 120 > proxy.config.http.keep_alive_no_activity_timeout_out: 120 > proxy.config.websocket.no_activity_timeout: 600 > proxy.config.websocket.active_timeout: 3600 > proxy.config.http.transaction_no_activity_timeout_in: 30 > proxy.config.http.transaction_active_timeout_in: 900 > proxy.config.http.transaction_active_timeout_out: 0 > proxy.config.http.accept_no_activity_timeout: 120 > proxy.config.http.background_fill_active_timeout: 0 > proxy.config.http.post_connect_attempts_timeout: 1800 > proxy.config.socks.socks_timeout: 100 > proxy.config.socks.server_connect_timeout: 10 > proxy.config.socks.server_retry_timeout: 300 > proxy.config.net.poll_timeout: 10 > proxy.config.net.default_inactivity_timeout: 86400 > proxy.config.dns.lookup_timeout: 20 > proxy.config.hostdb.lookup_timeout: 30 > proxy.config.hostdb.timeout: 86400 > proxy.config.hostdb.fail.timeout: 0 > proxy.config.log.collation_host_timeout: 86390 > proxy.config.log.collation_client_timeout: 86400 > proxy.config.ssl.session_cache.timeout: 0 > proxy.config.ssl.handshake_timeout_in: 0 > proxy.config.ssl.ocsp.cache_timeout: 3600 > proxy.config.ssl.ocsp.request_timeout: 10 > proxy.config.http2.accept_no_activity_timeout: 120 > proxy.config.http2.no_activity_timeout_in: 120 > proxy.config.http2.active_timeout_in: 900 > > Best regards, > Veiko > > > 2018-03-02 17:18 GMT+02:00 Veiko Kukk : > >> Hi, >> >> ATS 7.1.2, CentOS 7.4. >> >> We have regular requests via ATS in reverse mode to get response time >> statistics and decide whether source is available. This request is excluded >> from caching, so it always results in TCP_MISS/200 in normal conditions >> with response time fluctuation depending on the distance of ATS server from >> source. >> >> Sometimes there are network errors that get logged >> as ERR_CONNECT_FAIL/502, after which there are again TCP_MISS/200 but with >> very high latency! >> This happens on all ATS servers we have deployed. >> >> For example, when normally before network error response times fluctuate >> around 150ms, then after they fluctuate around 1300 ms. After restarting >> ATS, response times are back to normal. >> >> Response times from log: >> 168 >> 12979 >> 2755 >> 117256 >> 56442 >> 40104 >> 1043 >> 30955 >> 30972 >> 18418 >> 26542 >> 4172 >> 8587 >> 59259 >> 4674 >> 37166 >> 16123 >> 67019 >> 41723 >> 3497 >> 6957 >> 18684 >> 17663 >> 14634 >> 20036 >> 1305 >> 14815 >> 3526 >> 10542 >>
Re: ATS slow after ERR_CONNECT_FAIL
Hi again, This list seems really quiet. Meanwhile, I've enabled slow log which brings in strange results that I can't explain. Maybe somebody else can explain it. status: 200 unique id: redirection_tries: 0 bytes: 20 fd: 0 client state: 0 server state: 9 ua_begin: 0.000 ua_first_read: 0.000 ua_read_header_done: 0.000 cache_open_read_begin: 0.006 cache_open_read_end: 0.920 dns_lookup_begin: 0.920 dns_lookup_end: 0.920 server_connect: 4.204 server_connect_end: 4.204 server_first_read: 43.955 server_read_header_done: 43.955 server_close: 43.957 ua_write: 43.957 ua_close: 43.957 sm_finish: 43.957 plugin_active: 0.000 plugin_total: 0.000 status: 200 unique id: redirection_tries: 0 bytes: 20 fd: 0 client state: 0 server state: 9 ua_begin: 0.000 ua_first_read: 0.000 ua_read_header_done: 0.000 cache_open_read_begin: 0.000 cache_open_read_end: 0.000 dns_lookup_begin: 0.000 dns_lookup_end: 0.000 server_connect: 2.384 server_connect_end: 2.384 server_first_read: 76.550 server_read_header_done: 76.550 server_close: 76.553 ua_write: 76.553 ua_close: 76.553 sm_finish: 76.553 plugin_active: 0.000 plugin_total: 0.000 1) Why there is cache read when cache.config has url_regex=that_url_regex action=never-cache If the url matches never cache action, why is it looked up at all in the beginning? 2) What's going on between dns_lookup_end and server_connect? Where is the time spent? What timeout setting should apply to server_connect? Shouldn't it be proxy.config.http.connect_attempts_timeout which is set to 2 seconds. 3) What timeout applies to time between server_connect_end and server_first_read? Isn't it proxy.config.http.transaction_no_activity_timeout_out which on our case is set to 6 seconds. Current config timeouts: proxy.config.http.transaction_no_activity_timeout_out: 6 proxy.config.http.connect_attempts_timeout: 2 proxy.config.lm.pserver_timeout_secs: 1 proxy.config.lm.pserver_timeout_msecs: 0 proxy.config.cluster.peer_timeout: 30 proxy.config.cluster.mc_poll_timeout: 5 proxy.config.cluster.startup_timeout: 10 proxy.config.process_manager.timeout: 5 proxy.config.vmap.down_up_timeout: 10 proxy.config.http.congestion_control.default.live_os_conn_timeout: 60 proxy.config.http.congestion_control.default.dead_os_conn_timeout: 15 proxy.config.http.parent_proxy.connect_attempts_timeout: 30 proxy.config.http.keep_alive_no_activity_timeout_in: 120 proxy.config.http.keep_alive_no_activity_timeout_out: 120 proxy.config.websocket.no_activity_timeout: 600 proxy.config.websocket.active_timeout: 3600 proxy.config.http.transaction_no_activity_timeout_in: 30 proxy.config.http.transaction_active_timeout_in: 900 proxy.config.http.transaction_active_timeout_out: 0 proxy.config.http.accept_no_activity_timeout: 120 proxy.config.http.background_fill_active_timeout: 0 proxy.config.http.post_connect_attempts_timeout: 1800 proxy.config.socks.socks_timeout: 100 proxy.config.socks.server_connect_timeout: 10 proxy.config.socks.server_retry_timeout: 300 proxy.config.net.poll_timeout: 10 proxy.config.net.default_inactivity_timeout: 86400 proxy.config.dns.lookup_timeout: 20 proxy.config.hostdb.lookup_timeout: 30 proxy.config.hostdb.timeout: 86400 proxy.config.hostdb.fail.timeout: 0 proxy.config.log.collation_host_timeout: 86390 proxy.config.log.collation_client_timeout: 86400 proxy.config.ssl.session_cache.timeout: 0 proxy.config.ssl.handshake_timeout_in: 0 proxy.config.ssl.ocsp.cache_timeout: 3600 proxy.config.ssl.ocsp.request_timeout: 10 proxy.config.http2.accept_no_activity_timeout: 120 proxy.config.http2.no_activity_timeout_in: 120 proxy.config.http2.active_timeout_in: 900 Best regards, Veiko 2018-03-02 17:18 GMT+02:00 Veiko Kukk: > Hi, > > ATS 7.1.2, CentOS 7.4. > > We have regular requests via ATS in reverse mode to get response time > statistics and decide whether source is available. This request is excluded > from caching, so it always results in TCP_MISS/200 in normal conditions > with response time fluctuation depending on the distance of ATS server from > source. > > Sometimes there are network errors that get logged > as ERR_CONNECT_FAIL/502, after which there are again TCP_MISS/200 but with > very high latency! > This happens on all ATS servers we have deployed. > > For example, when normally before network error response times fluctuate > around 150ms, then after they fluctuate around 1300 ms. After restarting > ATS, response times are back to normal. > > Response times from log: > 168 > 12979 > 2755 > 117256 > 56442 > 40104 > 1043 > 30955 > 30972 > 18418 > 26542 > 4172 > 8587 > 59259 > 4674 > 37166 > 16123 > 67019 > 41723 > 3497 > 6957 > 18684 > 17663 > 14634 > 20036 > 1305 > 14815 > 3526 > 10542 > 62519 > 22025 > 40556 > 36821 > 2342 > 2644 > 3695 > 7059 > 45581 > 1706 > 30947 > 16001 > 136383 > 1345 > 72770 > 1087 > 1311 > 1074 > 1080 > 1101 > 1033 > 1045 > 1057 > 1109 > 1078 > 1104 > 1059 > 1066 > 1062 > 1143 > 1074 > 1095 > 1207 > 1059 > 1067 > 1101 > 1110 > 1069 > 1394 > 1278 > 1323 > 1319 >
ATS slow after ERR_CONNECT_FAIL
Hi, ATS 7.1.2, CentOS 7.4. We have regular requests via ATS in reverse mode to get response time statistics and decide whether source is available. This request is excluded from caching, so it always results in TCP_MISS/200 in normal conditions with response time fluctuation depending on the distance of ATS server from source. Sometimes there are network errors that get logged as ERR_CONNECT_FAIL/502, after which there are again TCP_MISS/200 but with very high latency! This happens on all ATS servers we have deployed. For example, when normally before network error response times fluctuate around 150ms, then after they fluctuate around 1300 ms. After restarting ATS, response times are back to normal. Response times from log: 168 12979 2755 117256 56442 40104 1043 30955 30972 18418 26542 4172 8587 59259 4674 37166 16123 67019 41723 3497 6957 18684 17663 14634 20036 1305 14815 3526 10542 62519 22025 40556 36821 2342 2644 3695 7059 45581 1706 30947 16001 136383 1345 72770 1087 1311 1074 1080 1101 1033 1045 1057 1109 1078 1104 1059 1066 1062 1143 1074 1095 1207 1059 1067 1101 1110 1069 1394 1278 1323 1319 1315 1317 1317 1343 1270 1284 1362 1446 1363 1302 1321 1330 1317 1335 1346 1312 and after restart: 593 362 129 381 361 380 394 394 117 384 341 112 88 363 397 93 385 356 371 360 411 120 338 497 449 377 131 458 161 372 112 364 87 129 340 93 114 338 377 454 424 381 381 348 129 336 116 404 108 340 114 Best regards, Veiko