This is great. Thanks. I must have missed these commands somehow. Does that mean that the cache hit ratio (*proxy.node.cache_hit_ratio_avg_10s*) refers to the RAM cache?
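(For reference: each of the counters Bryan lists below can be read individually with `traffic_line -r`, and an overall hit ratio can be derived from them. A minimal sketch — the counter values here are hypothetical sample numbers, not from a real run:)

```shell
# Reading a single raw counter requires a running traffic_server, e.g.:
#   traffic_line -r proxy.process.http.cache_hit_fresh
#   traffic_line -r proxy.process.cache.ram_cache.hits

# Hypothetical sample values standing in for the summed counters:
hits=9500      # sum of the proxy.process.http.cache_hit_* counters
misses=500     # sum of the proxy.process.http.cache_miss_* counters

# Overall hit ratio = hits / (hits + misses)
ratio=$(awk -v h="$hits" -v m="$misses" 'BEGIN { printf "%.2f", h / (h + m) }')
echo "overall cache hit ratio: $ratio"
```

For the sample values above this prints 0.95; substituting the real counter values read via `traffic_line -r` gives the total (RAM + disk) hit ratio.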
On Fri, Mar 26, 2010 at 1:33 PM, Bryan Call <[email protected]> wrote:

> There are a lot of stats around the cache. The one that means that it
> didn't have to contact the origin server at all is the first one:
>
> proxy.process.http.cache_hit_fresh
> proxy.process.http.cache_hit_revalidated
> proxy.process.http.cache_hit_ims
> proxy.process.http.cache_hit_stale_served
> proxy.process.http.cache_miss_cold
> proxy.process.http.cache_miss_changed
> proxy.process.http.cache_miss_not_cacheable
> proxy.process.http.cache_miss_client_no_cache
> proxy.process.http.cache_miss_ims
>
> If you are looking at the stat below, it is possible it is a cache hit,
> but just from disk and not RAM:
>
> proxy.process.cache.ram_cache.hits
>
> -Bryan
>
> On 03/26/2010 01:12 PM, Zachary Miller wrote:
>
>> Yeah, I just see cache writes on the initial run only, which leads me
>> to believe that the content is being cached, and later runs don't
>> produce hits (based on the hit rate from traffic_line).
>>
>> On Fri, Mar 26, 2010 at 12:09 PM, Bryan Call <[email protected]> wrote:
>>
>>> On 03/26/2010 11:33 AM, Zachary Miller wrote:
>>>
>>>> Hello,
>>>>
>>>> I am currently conducting a proof of concept to explore the value of
>>>> content caching for a system that automatically fetches a large
>>>> number of external web pages. I am mainly interested in using TS as
>>>> a forward proxy and then serving content locally for subsequent
>>>> duplicate queries to the sites. Currently I have the forward proxy
>>>> enabled, as well as the reverse proxy setting. Is it necessary or
>>>> advisable to have both of these enabled if my interest is mainly
>>>> external content?
>>>
>>> I run Traffic Server as both a reverse and forward proxy without a
>>> problem.
>>>
>>>> Also, I have been getting metrics using traffic_line and running
>>>> tests using the web UI and see some odd behavior.
>>>> For instance, whenever I run tests against BestBuy.com, on the
>>>> initial run (through about 3k pages) there are nearly the same
>>>> number of writes to the cache. On following runs (using the same
>>>> pages) no new writes are made to the cache, leading me to believe
>>>> that the pages already exist, but according to traffic_line, there
>>>> are no cache hits during the execution period. Other sites appear to
>>>> perform as expected, so I suspected that this was due to dynamic
>>>> content.
>>>>
>>>> However, when I access pages through a browser with a clear cache, I
>>>> see certain pages failing to be added to the cache, while what I
>>>> would consider collateral content is added. For instance, if I
>>>> access http://www.msn.com and inspect the cache through the web UI,
>>>> I find many cached items from the MSN domain, but nothing containing
>>>> the actual page content. Is this expected behavior and something
>>>> configurable, or am I missing some fundamental aspect of the cache?
>>>
>>> If the page has dynamic content and the origin server (Best Buy /
>>> MSN) sends headers saying not to cache the content, then it won't be
>>> cached by default. You can override this and cache the content; we do
>>> this for some of our crawling.
>>>
>>> I am not very familiar with the web UI and I use the command line
>>> tools. Are you seeing cache writes on Best Buy on the following runs,
>>> or just the first crawl?
>>>
>>>> Thanks a lot,
>>>>
>>>> Zachary Miller
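(A pointer on the override Bryan mentions above — "You can override this and cache the content": as best I can tell, the relevant records.config knobs are the ignore-no-cache settings. A sketch, worth verifying against your Traffic Server version:)

```
CONFIG proxy.config.http.cache.ignore_server_no_cache INT 1
CONFIG proxy.config.http.cache.ignore_client_no_cache INT 1
```

With these set, Traffic Server disregards the origin's and the client's no-cache directives when deciding cacheability; the config can be reloaded with `traffic_line -x`.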
