After reviewing and debugging the problem, I found that it is caused by a thundering herd on cache stale / cache miss (as Yongming said). That is why the cache misses happen suddenly but not always (though once they start, they last a long time). The cache misses make the origin servers busier, and every one of these missed requests tries to get the object cached once its fetch finishes. While the cache is being updated (which happens frequently), subsequent requests also miss and go to the origin, which makes the origin even busier and the situation worse.
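
To make the feedback loop described above concrete, here is a toy Python model of it. It is only a sketch under stated assumptions, not how ATS implements read-while-writer: the `Origin` class, the URL and the 50 ms fetch time are all made up for illustration. It compares 60 simultaneous requests for one uncached object (roughly one second of traffic for the hot URL in this thread) with and without collapsing the misses into a single origin fetch.

```python
import asyncio

ORIGIN_DELAY = 0.05  # hypothetical origin fetch time, in seconds


class Origin:
    """Stand-in for the origin server; counts how often it is hit."""

    def __init__(self):
        self.fetches = 0

    async def fetch(self, url):
        self.fetches += 1
        await asyncio.sleep(ORIGIN_DELAY)
        return f"body of {url}"


async def naive_get(cache, origin, url):
    # Every request that misses goes to the origin on its own.
    if url in cache:
        return cache[url]
    body = await origin.fetch(url)
    cache[url] = body
    return body


async def collapsed_get(cache, inflight, origin, url):
    # Only the first miss starts a fetch; later misses wait for it.
    if url in cache:
        return cache[url]
    if url not in inflight:
        inflight[url] = asyncio.create_task(origin.fetch(url))
    body = await inflight[url]
    cache[url] = body
    inflight.pop(url, None)
    return body


async def _run(n_requests, collapse):
    origin, cache, inflight = Origin(), {}, {}
    url = "http://origin.example/object"  # hypothetical URL
    if collapse:
        jobs = [collapsed_get(cache, inflight, origin, url)
                for _ in range(n_requests)]
    else:
        jobs = [naive_get(cache, origin, url) for _ in range(n_requests)]
    await asyncio.gather(*jobs)
    return origin.fetches


def origin_fetches(n_requests, collapse):
    """Number of origin fetches caused by n simultaneous requests."""
    return asyncio.run(_run(n_requests, collapse))


if __name__ == "__main__":
    print("without collapsing:", origin_fetches(60, collapse=False))
    print("with collapsing:   ", origin_fetches(60, collapse=True))
```

In this model the uncollapsed case sends every one of the simultaneous requests to the origin, while the collapsed case sends one. A real proxy also has to deal with retries, timeouts and locking, which this sketch ignores.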
That is why most requests are cache misses (more than 90%), but there is still a small chance of getting cache hits for several seconds (during the cache update interval). The busier the origin servers, the more cache misses.

One more question: even with read-while-writer and stale-while-revalidate enabled, some requests still cause a thundering herd (small chance). Am I missing something?

At 2016-10-12 09:59:17, "Esmq" <[email protected]> wrote:

Thanks for all the replies.

1) The purge method does not help.
2) I am now trying stale-while-revalidate and read-while-writer at the same time, and I found that this works fine for me.

traffic_line -s proxy.config.http.cache.max_open_read_retries -v 3
traffic_line -s proxy.config.http.cache.open_read_retry_time -v 50
traffic_line -s proxy.config.http.cache.max_open_write_retries -v 3
traffic_line -s proxy.config.http.cache.open_write_fail_action -v 2
traffic_line -s proxy.config.cache.max_doc_size -v 0
traffic_line -s proxy.config.cache.enable_read_while_writer -v 1
traffic_line -s proxy.config.cache.read_while_writer.max_retries -v 5
traffic_line -s proxy.config.cache.read_while_writer.delay -v 100
traffic_line -s proxy.config.cache.read_while_writer_retry.delay -v 50

At 2016-10-11 11:44:36, "Yongming Zhao" <[email protected]> wrote:

Please check whether stale caching works as expected; for example, what does proxy.node.http.cache_hit_stale_served_avg_10s report?

In your case, the very busy URL needs to be revalidated, and there are two ways to handle that:

1, serve the stale copy and revalidate it
2, revalidate (update) and hold all the new requests with read-while-writer

Stale-and-revalidate is my preference, as read-while-writer involves strict locking, which does not work so well with a very hot URL.

Hope that helps.
Thanks
- Yongming Zhao 赵永明

On Oct 11, 2016, at 6:05 AM, Leif Hedstrom <[email protected]> wrote:

On Oct 9, 2016, at 1:45 AM, Esmq <[email protected]> wrote:

The following is the traffic access log; it suddenly runs into cache misses (miss rate more than 90%).

Couple of questions:

1) Does it help if you purge the URL? (Instead of wiping the entire cache).
2) If you have a box where it reproduces, can you take it out of production, and run traffic_server with a diags tracer on “http|cache” or some such?

Fwiw, Miles and I experienced something similar to this once, where a URL got into a state where it couldn’t get into the cache until we purged it.

-- Leif

---------------------------------------------------------------------------
1475991415.931 49552 117.170.206.206 ERR_CLIENT_ABORT/200 248791 GET http://37g.update.example.com/test.server.txt - NONE/- text/plain "-" https=1
1475991416.069 209 112.8.22.28 TCP_HIT/200 477480 GET http://37g.update.example.com/test.server.txt - NONE/- text/plain "-" https=1
1475991416.151 430 117.136.77.2 TCP_HIT/200 477480 GET http://37g.update.example.com/test.server.txt - NONE/- text/plain "-" https=1
1475991416.154 900 117.136.84.246 TCP_HIT/200 477480 GET http://37g.update.example.com/test.server.txt - NONE/- text/plain "-" https=1
1475991416.215 790 223.104.227.138 TCP_HIT/200 477480 GET http://37g.update.example.com/test.server.txt - NONE/- text/plain "-" https=1
1475991416.255 503 223.68.189.43 TCP_HIT/200 477480 GET http://37g.update.example.com/test.server.txt - NONE/- text/plain "-" https=1
1475991416.283 8282 211.138.116.139 TCP_HIT/200 477480 GET http://37g.update.example.com/test.server.txt - NONE/- text/plain "-" https=1
1475991416.310 895 112.17.247.164 TCP_HIT/200 477480 GET http://37g.update.example.com/test.server.txt - NONE/- text/plain "-" https=1
1475991416.319 355 117.166.186.169 TCP_HIT/200 477480 GET http://37g.update.example.com/test.server.txt - NONE/- text/plain "-" https=1
1475991416.319 5246 183.198.61.37 TCP_HIT/200 477480 GET http://37g.update.example.com/test.server.txt - NONE/- text/plain "-" https=1
1475991416.391 690 117.136.79.174 TCP_HIT/200 477480 GET http://37g.update.example.com/test.server.txt - NONE/- text/plain "-" https=1
1475991416.417 719 117.136.8.67 TCP_HIT/200 477480 GET http://37g.update.example.com/test.server.txt - NONE/- text/plain "-" https=1
1475991416.422 820 223.104.177.116 TCP_HIT/200 477480 GET http://37g.update.example.com/test.server.txt - NONE/- text/plain "-" https=1
1475991416.439 1156 117.136.83.124 TCP_HIT/200 477480 GET http://37g.update.example.com/test.server.txt - NONE/- text/plain "-" https=1
1475991416.460 969 117.136.66.253 TCP_HIT/200 477480 GET http://37g.update.example.com/test.server.txt - NONE/- text/plain "-" https=1
1475991416.502 575 117.136.45.216 TCP_HIT/200 477480 GET http://37g.update.example.com/test.server.txt - NONE/- text/plain "-" https=1
1475991416.564 796 117.136.81.143 TCP_HIT/200 477480 GET http://37g.update.example.com/test.server.txt - NONE/- text/plain "-" https=1
1475991416.665 375 223.104.3.227 TCP_MISS/200 477472 GET http://37g.update.example.com/test.server.txt - DIRECT/37g.update.example.com text/plain "-" https=1
1475991416.712 760 117.136.63.169 TCP_HIT/200 477480 GET http://37g.update.example.com/test.server.txt - NONE/- text/plain "-" https=1
1475991416.794 575 111.19.32.253 TCP_MISS/200 477478 GET http://37g.update.example.com/test.server.txt - DIRECT/37g.update.example.com text/plain "-" https=1
1475991416.810 655 223.104.38.55 TCP_MISS/200 477478 GET http://37g.update.example.com/test.server.txt - DIRECT/37g.update.example.com text/plain "-" https=1
1475991416.878 1235 59.63.249.66 TCP_HIT/200 477480 GET http://37g.update.example.com/test.server.txt - NONE/- text/plain "-" https=1
1475991416.923 571 223.104.227.12 TCP_MISS/200 477478 GET http://37g.update.example.com/test.server.txt - DIRECT/37g.update.example.com text/plain "-" https=1
1475991416.946 608 59.63.249.65 TCP_MISS/200 477478 GET http://37g.update.example.com/test.server.txt - DIRECT/37g.update.example.com text/plain "-" https=1
1475991417.004 1710 223.104.23.66 TCP_HIT/200 477480 GET http://37g.update.example.com/test.server.txt - NONE/- text/plain "-" https=1
1475991417.058 10539 117.136.79.164 TCP_HIT/200 477480 GET http://37g.update.example.com/test.server.txt - NONE/- text/plain "-" https=1
1475991417.102 470 223.104.90.42 TCP_MISS/200 477472 GET http://37g.update.example.com/test.server.txt - DIRECT/37g.update.example.com text/plain "-" https=1
1475991417.122 24329 39.128.124.133 TCP_HIT/200 477480 GET http://37g.update.example.com/test.server.txt - NONE/- text/plain "-" https=1
1475991417.195 1454 112.17.245.99 TCP_HIT/200 477480 GET http://37g.update.example.com/test.server.txt - NONE/- text/plain "-" https=1
1475991417.199 317 120.199.125.16 TCP_MISS/200 477472 GET http://37g.update.example.com/test.server.txt - DIRECT/37g.update.example.com text/plain "-" https=1
1475991417.212 559 117.136.68.2 TCP_MISS/200 477478 GET http://37g.update.example.com/test.server.txt - DIRECT/37g.update.example.com text/plain "-" https=1
1475991417.259 475 183.207.217.146 TCP_MISS/200 477472 GET http://37g.update.example.com/test.server.txt - DIRECT/37g.update.example.com text/plain "-" https=1
1475991417.268 926 117.136.68.165 TCP_MISS/200 477472 GET http://37g.update.example.com/test.server.txt - DIRECT/37g.update.example.com text/plain "-" https=1
1475991417.278 1324 183.240.8.35 TCP_HIT/200 477480 GET http://37g.update.example.com/test.server.txt - NONE/- text/plain "-" https=1
1475991417.292 607 218.201.82.97 TCP_MISS/200 477472 GET http://37g.update.example.com/test.server.txt - DIRECT/37g.update.example.com text/plain "-" https=1
1475991417.336 619 223.68.189.49 TCP_MISS/200 477478 GET http://37g.update.example.com/test.server.txt - DIRECT/37g.update.example.com text/plain "-" https=1
1475991417.373 311 223.96.69.223 TCP_MISS/200 477472 GET http://37g.update.example.com/test.server.txt - DIRECT/37g.update.example.com text/plain "-" https=1
1475991417.377 126 117.136.81.106 TCP_MISS/200 477472 GET http://37g.update.example.com/test.server.txt - DIRECT/37g.update.example.com text/plain "-" https=1
1475991417.441 513 117.173.217.97 TCP_MISS/200 477478 GET http://37g.update.example.com/test.server.txt - DIRECT/37g.update.example.com text/plain "-" https=1
1475991417.499 44481 223.104.9.67 TCP_HIT/200 477480 GET http://37g.update.example.com/test.server.txt - NONE/- text/plain "-" https=1
1475991417.533 395 39.186.146.163 TCP_MISS/200 477478 GET http://37g.update.example.com/test.server.txt - DIRECT/37g.update.example.com text/plain "-" https=1
1475991417.546 381 111.151.100.190 TCP_MISS/200 477478 GET http://37g.update.example.com/test.server.txt - DIRECT/37g.update.example.com text/plain "-" https=1
1475991417.602 210 223.104.11.106 TCP_MISS/200 477472 GET http://37g.update.example.com/test.server.txt - DIRECT/37g.update.example.com text/plain "-" https=1
1475991417.616 640 223.104.18.124 TCP_MISS/200 477472 GET http://37g.update.example.com/test.server.txt - DIRECT/37g.update.example.com text/plain "-" https=1
1475991417.657 673 117.191.16.91 TCP_MISS/200 477478 GET http://37g.update.example.com/test.server.txt - DIRECT/37g.update.example.com text/plain "-" https=1
1475991417.672 510 112.17.247.33 TCP_MISS/200 477472 GET http://37g.update.example.com/test.server.txt - DIRECT/37g.update.example.com text/plain "-" https=1
1475991417.722 1150 223.104.91.96 TCP_MISS/200 477478 GET http://37g.update.example.com/test.server.txt - DIRECT/37g.update.example.com text/plain "-" https=1
1475991417.735 281 120.193.236.62 TCP_MISS/200 477472 GET http://37g.update.example.com/test.server.txt - DIRECT/37g.update.example.com text/plain "-" https=1
1475991417.827 960 117.136.97.9 TCP_MISS/200 477478 GET http://37g.update.example.com/test.server.txt - DIRECT/37g.update.example.com text/plain "-" https=1
1475991417.885 409 117.136.70.25 TCP_MISS/200 477478 GET http://37g.update.example.com/test.server.txt - DIRECT/37g.update.example.com text/plain "-" https=1
1475991417.923 829 117.136.81.157 TCP_MISS/200 477472 GET http://37g.update.example.com/test.server.txt - DIRECT/37g.update.example.com text/plain "-" https=1

At 2016-10-09 15:21:16, "Esmq" <[email protected]> wrote:

Hi all,

I have run into strange cache miss behavior.

I have 30 servers running ATS (v6.1.1). They have run well most of the time in the past, but 4 of these servers have suddenly started suffering from cache misses these days, and I found that only one particular request can't be cached (it was previously cached properly). The other servers are still running well.

The difference between these 4 servers and the others is that their load is three times higher (TPS around 500-800, bandwidth around 200 Mbit/s).

Info about the request that suddenly can't be cached: body size 500K, TPS for this request around 60.

The only way I can get the response cached is to restart ATS and clear the cache storage, but the problem recurs after some time.

proxy.config.cache.ram_cache.size 104857600
proxy.config.cache.target_fragment_size 16384
proxy.config.cache.max_doc_size 15728640
proxy.config.cache.min_average_object_size 4096
proxy.config.cache.hit_evacuate_size_limit 0
proxy.config.cache.force_sector_size 0
proxy.config.cache.alt_rewrite_max_size 4096
proxy.config.ssl.session_cache.size 102400

I don't know what the problem is; any help will be appreciated.
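
The hit-to-miss flip in the access log quoted in this thread is easier to see when the entries are counted per second. The following is a minimal Python sketch for that, assuming the whitespace-separated layout shown in the excerpt (timestamp first, cache result code in the fourth field); the function name `miss_rate_per_second` and the three sample lines (copied from the excerpt) are only for illustration.

```python
from collections import Counter, defaultdict


def miss_rate_per_second(lines):
    """Return {second: (hits, misses, miss_ratio)} for access log lines.

    Assumes field 1 is an epoch timestamp and field 4 looks like
    "TCP_HIT/200" or "TCP_MISS/200"; other result codes are skipped.
    """
    per_second = defaultdict(Counter)
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # not a log entry
        try:
            second = int(float(fields[0]))
        except ValueError:
            continue  # first field is not a timestamp
        result = fields[3].split("/")[0]
        if result == "TCP_HIT":
            per_second[second]["hit"] += 1
        elif result == "TCP_MISS":
            per_second[second]["miss"] += 1

    summary = {}
    for second, counts in sorted(per_second.items()):
        total = counts["hit"] + counts["miss"]
        summary[second] = (counts["hit"], counts["miss"],
                           counts["miss"] / total)
    return summary


# Three entries taken from the log excerpt in this thread.
sample = [
    '1475991416.069 209 112.8.22.28 TCP_HIT/200 477480 GET '
    'http://37g.update.example.com/test.server.txt - NONE/- '
    'text/plain "-" https=1',
    '1475991416.665 375 223.104.3.227 TCP_MISS/200 477472 GET '
    'http://37g.update.example.com/test.server.txt - '
    'DIRECT/37g.update.example.com text/plain "-" https=1',
    '1475991417.102 470 223.104.90.42 TCP_MISS/200 477472 GET '
    'http://37g.update.example.com/test.server.txt - '
    'DIRECT/37g.update.example.com text/plain "-" https=1',
]

if __name__ == "__main__":
    for second, (hits, misses, ratio) in miss_rate_per_second(sample).items():
        print(second, hits, misses, f"{ratio:.0%}")
```

To run it over a whole log, pass an open file object instead of `sample`, since iterating a file yields one line at a time.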
