Hello, I am running a proof-of-concept to explore the value of content caching for a system that automatically fetches a large number of external web pages. My main interest is using Traffic Server as a forward proxy and then serving content locally for subsequent duplicate requests to those sites. Currently I have the forward proxy enabled, as well as the reverse proxy setting. Is it necessary or advisable to have both enabled if my interest is mainly external content?
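In case it matters, my records.config currently has roughly the following (copied from memory, so the exact values may be off):

    # act as a forward proxy: proxy requests even when no remap rule matches
    CONFIG proxy.config.url_remap.remap_required INT 0
    # reverse proxy is also enabled -- this is the setting I am asking about
    CONFIG proxy.config.reverse_proxy.enabled INT 1
    # HTTP object caching is on
    CONFIG proxy.config.http.cache.http INT 1

I apply changes with traffic_line -x rather than restarting.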
Also, I have been gathering metrics with traffic_line and running tests through the web UI, and I am seeing some odd behavior (the exact commands I am running are listed below).

For instance, whenever I run tests against BestBuy.com, on the initial run (about 3k pages) the number of cache writes is nearly equal to the number of pages fetched. On subsequent runs over the same pages, no new writes are made to the cache, which leads me to believe the pages already exist in the cache; yet according to traffic_line, there are no cache hits during the run. Other sites behave as expected, so I suspected this was due to dynamic content.

However, when I access pages through a browser with a cleared cache, I see certain pages failing to be added to the cache, while what I would consider collateral content is added. For instance, if I access http://www.msn.com and inspect the cache through the web UI, I find many cached items from the MSN domain, but nothing containing the actual page content.

Is this expected behavior and something configurable, or am I missing some fundamental aspect of the cache?
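For reference, the commands I am using to collect the counters look like this (metric names taken from the stats documentation as best I recall, so they may need correcting):

    # zero all statistics before a test run
    traffic_line -z
    # ... run the test, then read the hit/miss/write counters ...
    traffic_line -r proxy.process.http.cache_hit_fresh
    traffic_line -r proxy.process.http.cache_miss_cold
    traffic_line -r proxy.process.cache.write.success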
Thanks a lot,
Zachary Miller