This is great.  Thanks.  I must have missed these stats somehow.  Does
that mean that the cache hit rate (*proxy.node.cache_hit_ratio_avg_10s*) refers
to the RAM cache?
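
For reference, here is roughly how I am reading it from the command line (just a
sketch of what I am doing; the stat names are the ones from this thread):

    # node-level average hit ratio over the last 10 seconds
    traffic_line -r proxy.node.cache_hit_ratio_avg_10s
    # RAM-cache hit counter, for comparison
    traffic_line -r proxy.process.cache.ram_cache.hits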

On Fri, Mar 26, 2010 at 1:33 PM, Bryan Call <[email protected]> wrote:

>  There are a lot of stats around the cache.  The one that means it didn't
> have to contact the origin server at all is the first one:
> proxy.process.http.cache_hit_fresh
> proxy.process.http.cache_hit_revalidated
> proxy.process.http.cache_hit_ims
> proxy.process.http.cache_hit_stale_served
> proxy.process.http.cache_miss_cold
> proxy.process.http.cache_miss_changed
> proxy.process.http.cache_miss_not_cacheable
> proxy.process.http.cache_miss_client_no_cache
> proxy.process.http.cache_miss_ims
>
> If you are looking at the stat below, it is possible a request is still a cache
> hit, but served from disk and not RAM:
> proxy.process.cache.ram_cache.hits
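>
> As a rough sketch (assuming traffic_line is on your path; the -r flag reads a
> single variable), you can pull these counters directly:
>
>     # hits served entirely from the cache, no contact with the origin
>     traffic_line -r proxy.process.http.cache_hit_fresh
>     # hits that needed a revalidation round trip to the origin
>     traffic_line -r proxy.process.http.cache_hit_revalidated
>     # misses because the object was not in the cache at all
>     traffic_line -r proxy.process.http.cache_miss_cold
>
> Summing the cache_hit_* counters against the cache_miss_* counters gives an
> overall hit ratio, independent of whether a hit came from RAM or disk.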
>
> -Bryan
>
>
> On 03/26/2010 01:12 PM, Zachary Miller wrote:
>
> Yeah, I see cache writes on the initial run only, which leads me to
> believe that the content is being cached, but later runs don't produce hits
> (based on the hit rate from traffic_line).
>
> On Fri, Mar 26, 2010 at 12:09 PM, Bryan Call <[email protected]> wrote:
>
>> On 03/26/2010 11:33 AM, Zachary Miller wrote:
>>
>>> Hello,
>>>
>>> I am currently conducting a proof-of-concept to explore the value of
>>> content caching for a system that automatically fetches a large number of
>>> external web pages.  I am mainly interested in using TS as a forward proxy
>>> and then serving content locally for subsequent duplicate queries to the
>>> sites.  Currently I have the forward proxy enabled, as well as the reverse
>>> proxy setting.  Is it necessary or advisable to have both of these enabled
>>> if my interest is mainly external content?
>>>
>>
>>  I run Traffic Server as both a reverse and forward proxy without a
>> problem.
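>>
>> As a sketch, these are the records.config settings I would check for that
>> combination (variable names from memory, so verify them against your
>> records.config before relying on them):
>>
>>     # allow requests for arbitrary origins (forward proxy behavior)
>>     CONFIG proxy.config.url_remap.remap_required INT 0
>>     # keep reverse proxying enabled as well
>>     CONFIG proxy.config.reverse_proxy.enabled INT 1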
>>
>>
>>
>>> Also, I have been getting metrics using traffic_line and running tests
>>> through the web UI, and I see some odd behavior.  For instance, whenever I run
>>> tests against BestBuy.com, the initial run (through about 3k pages) produces
>>> nearly as many writes to the cache as pages fetched.  On following runs (using
>>> the same pages) no new writes are made to the cache, leading me to believe
>>> that the pages already exist in the cache, but according to traffic_line, there
>>> are no cache hits during the execution period.  Other sites appear to perform
>>> as expected, so I suspected that this was due to dynamic content.
>>>
>>> However, when I access pages through a browser with a cleared cache, I see
>>> certain pages failing to be added to the cache, while what I would consider
>>> collateral content is added.  For instance, if I access
>>> http://www.msn.com and inspect the cache through the web UI, I find many
>>> cached items from the MSN domain, but nothing containing the actual page
>>> content.  Is this expected behavior and something configurable, or am I
>>> missing some fundamental aspect of the cache?
>>>
>>
>>  If the page has dynamic content and the origin server (Best Buy / MSN)
>> sends headers saying not to cache the content, then it won't be cached by default.
>>  You can override this and cache the content anyway; we do this for some of our
>> crawling.
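>>
>> A rough sketch of that override, assuming the usual records.config knobs
>> (double-check the names against your install before relying on them):
>>
>>     # ignore no-cache directives sent by the origin server
>>     CONFIG proxy.config.http.cache.ignore_server_no_cache INT 1
>>     # optionally ignore client-side no-cache requests as well
>>     CONFIG proxy.config.http.cache.ignore_client_no_cache INT 1
>>
>> After editing records.config, apply the change with traffic_line -x.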
>>
>> I am not very familiar with the web UI; I use the command-line tools.
>>  Are you seeing cache writes for Best Buy on the following runs, or just on the
>> first crawl?
>>
>>
>>
>>> Thanks a lot,
>>>
>>> Zachary Miller
>>>
>>
>>
>>
>
>
