There are a lot of stats around the cache. The first one below is the one that means the request was served without contacting the origin server at all:
proxy.process.http.cache_hit_fresh
proxy.process.http.cache_hit_revalidated
proxy.process.http.cache_hit_ims
proxy.process.http.cache_hit_stale_served
proxy.process.http.cache_miss_cold
proxy.process.http.cache_miss_changed
proxy.process.http.cache_miss_not_cacheable
proxy.process.http.cache_miss_client_no_cache
proxy.process.http.cache_miss_ims
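
As a rough sketch (not part of the Traffic Server tooling), here is one way to work out what share of requests were served without going to the origin, once you have read the counters above. The stat names are the ones in the list; the `summarize` helper and the sample numbers are made up for illustration, and treating the sum of these nine counters as the total is an assumption.

```python
# Stat names are taken from the list above. The helper and the sample
# counter values are hypothetical, for illustration only.
PREFIX = "proxy.process.http."
FRESH = PREFIX + "cache_hit_fresh"  # served without contacting the origin

HIT_STATS = [PREFIX + s for s in (
    "cache_hit_fresh", "cache_hit_revalidated",
    "cache_hit_ims", "cache_hit_stale_served",
)]
MISS_STATS = [PREFIX + s for s in (
    "cache_miss_cold", "cache_miss_changed", "cache_miss_not_cacheable",
    "cache_miss_client_no_cache", "cache_miss_ims",
)]


def summarize(counters):
    """Return totals and the share of fresh hits among the listed counters.

    Assumption: the sum of the listed hit and miss counters is used as the
    total. Missing counters are treated as zero.
    """
    hits = sum(counters.get(name, 0) for name in HIT_STATS)
    misses = sum(counters.get(name, 0) for name in MISS_STATS)
    total = hits + misses
    fresh = counters.get(FRESH, 0)
    ratio = fresh / total if total else 0.0
    return {"hits": hits, "misses": misses, "total": total,
            "fresh_ratio": ratio}


if __name__ == "__main__":
    # Made-up numbers: a crawl where most pages were cold misses.
    sample = {
        PREFIX + "cache_hit_fresh": 30,
        PREFIX + "cache_miss_cold": 60,
        PREFIX + "cache_miss_not_cacheable": 10,
    }
    print(summarize(sample))
```

If fresh_ratio stays at zero on a repeat crawl while cache_miss_not_cacheable keeps climbing, that would point at the origin's headers rather than at the cache itself.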

If you are looking at the stat below, note that it only counts hits served from RAM. A request can still be a cache hit that was served from disk, and it won't show up here:
proxy.process.cache.ram_cache.hits

-Bryan

On 03/26/2010 01:12 PM, Zachary Miller wrote:
Yeah, I see cache writes on the initial run only, which leads me to believe that the content is being cached, but later runs don't produce hits (based on the hit rate reported by traffic_line).

On Fri, Mar 26, 2010 at 12:09 PM, Bryan Call <[email protected]> wrote:

    On 03/26/2010 11:33 AM, Zachary Miller wrote:

        Hello,

        I am currently conducting a proof-of-concept to explore the
        value of content caching for a system that automatically
        fetches a large number of external web pages.  I am mainly
        interested in using TS as a forward proxy and then serving
        content locally for subsequent duplicate queries to the sites.
         Currently I have the forward proxy enabled, as well as the
        reverse proxy setting.  Is it necessary or advisable to have
        both of these enabled if my interest is mainly external content?


    I run traffic server as both a reverse and forward proxy without a
    problem.



        Also, I have been getting metrics using traffic_line and
        running tests using the web UI and see some odd behavior.  For
        instance, whenever I run tests against BestBuy.com, on the
        initial run (through about 3k pages) there are nearly the same
        number of writes to the cache.  On following runs (using the
        same pages) no new writes are made to the cache, leading me to
        believe that the pages already exist, but according to
        traffic_line, there are no cache hits during the execution
        period.  Other sites appear to perform as expected, so I
        expected that this was due to dynamic content.

        However, when I access pages through a browser with a clear
        cache, I see certain pages failing to be added to the cache,
        while what I would consider collateral content is added.  For
        instance, if I access http://www.msn.com and inspect the cache
        through the web UI, I find many cached items from the MSN
        domain, but nothing containing the actual page content.  Is
        this expected behavior and something configurable, or am I
        missing some fundamental aspect of the cache?


    If the page has dynamic content and the origin server (Best Buy /
    MSN) sends headers telling caches not to store the content, then it
    won't be cached by default.  You can override this and cache the
    content anyway; we do this for some of our crawling.

    I am not very familiar with the Web UI; I use the command line
    tools.  Are you seeing cache writes on Best Buy on the following
    runs or just the first crawl?



        Thanks a lot,

        Zachary Miller




