Re: The cache deny QUERY change... partial rollback?
On Mon, 2008-12-01 at 15:34 +0100, Henrik Nordstrom wrote:

> After analyzing a large cache with a significantly declining hit ratio over the last months, I have come to the conclusion that the removal of cache deny QUERY can have a very negative impact on hit ratio. This is due to a number of flash video sites (youtube, google, various porno sites etc) which include per-view unique query parameters in the URL and respond with a cacheable response.
>
> Because of this I suggest that we add back the cache deny rule in the recommended config, but leave the refresh_pattern change as-is.

Taking all the responses on this thread into account, it seems that what you are proposing should be done as a temporary solution to the problem. Does this affect both Squid versions?

Thank you,
Alex.
The cache deny QUERY change... partial rollback?
After analyzing a large cache with a significantly declining hit ratio over the last months, I have come to the conclusion that the removal of cache deny QUERY can have a very negative impact on hit ratio. This is due to a number of flash video sites (youtube, google, various porno sites etc) which include per-view unique query parameters in the URL and respond with a cacheable response.

Because of this I suggest that we add back the cache deny rule in the recommended config, but leave the refresh_pattern change as-is.

People running reverse proxies or combating these cache-busting sites using store rewrites know how to change the cache rules, while many users running general proxy servers are quite negatively impacted by these sites if caching of query URLs is allowed.

Regards
Henrik
Re: The cache deny QUERY change... partial rollback?
2008/12/1 Henrik Nordstrom [EMAIL PROTECTED]:

> After analyzing a large cache with a significantly declining hit ratio over the last months, I have come to the conclusion that the removal of cache deny QUERY can have a very negative impact on hit ratio. This is due to a number of flash video sites (youtube, google, various porno sites etc) which include per-view unique query parameters in the URL and respond with a cacheable response.
>
> Because of this I suggest that we add back the cache deny rule in the recommended config, but leave the refresh_pattern change as-is.
>
> People running reverse proxies or combating these cache-busting sites using store rewrites know how to change the cache rules, while many users running general proxy servers are quite negatively impacted by these sites if caching of query URLs is allowed.

Hm, that's kind of interesting actually. What's it displacing from the cache? Is the drop in hit ratio due to the removal of other cacheable large objects, or other cacheable small objects? Is it -just- flash video that's exhibiting this behaviour?

Are you able to put up some examples and statistics?

I really think the right thing to do here is look at what various sites are doing and try to open a dialogue with them. Chances are they don't really know exactly how to (ab)use HTTP to get the semantics they want whilst retaining control over their content.

Adrian
Re: The cache deny QUERY change... partial rollback?
On Mon, 2008-12-01 at 09:40 -0500, Adrian Chadd wrote:

> Hm, that's kind of interesting actually. What's it displacing from the cache? Is the drop in hit ratio due to the removal of other cacheable large objects, or other cacheable small objects? Is it -just- flash video that's exhibiting this behaviour?

The studied cache is using LRU, and these flash videos effectively reduce the cache size by filling the cache with large objects that are never referenced again.

> Are you able to put up some examples and statistics?

I'll try.

> I really think the right thing to do here is look at what various sites are doing and try to open a dialogue with them. Chances are they don't really know exactly how to (ab)use HTTP to get the semantics they want whilst retaining control over their content.

Probably true. Based on the URL styles there seem to be only two or three of these authentication/session schemes.

Regards
Henrik
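The LRU effect described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not a model of Squid itself: the cache size, object sizes, request mix, the host names video.example and static.example, and the is_query() test (a rough stand-in for the QUERY ACL, assumed here to match URLs containing "cgi-bin" or "?") are all hypothetical values chosen for illustration.

```python
import random
from collections import OrderedDict


class LRUCache:
    """Byte-bounded LRU cache keyed by the full URL (sizes in MB)."""

    def __init__(self, capacity_mb):
        self.capacity_mb = capacity_mb
        self.used_mb = 0
        self.entries = OrderedDict()  # url -> size_mb, least recently used first

    def lookup(self, url):
        """Return True on a hit and mark the entry as most recently used."""
        if url in self.entries:
            self.entries.move_to_end(url)
            return True
        return False

    def store(self, url, size_mb):
        """Insert an object, evicting least recently used entries to make room."""
        if size_mb > self.capacity_mb:
            return
        while self.used_mb + size_mb > self.capacity_mb:
            _, evicted_size = self.entries.popitem(last=False)
            self.used_mb -= evicted_size
        self.entries[url] = size_mb
        self.used_mb += size_mb


def is_query(url):
    """Hypothetical stand-in for the QUERY ACL: dynamic-looking URLs."""
    return "?" in url or "cgi-bin" in url


def simulate(deny_query, requests=20000, seed=1):
    """Return the request hit ratio for a synthetic workload."""
    rng = random.Random(seed)
    cache = LRUCache(capacity_mb=1000)
    hits = 0
    for i in range(requests):
        if rng.random() < 0.1:
            # One-off 20 MB video: the session token makes every URL unique.
            url, size_mb = f"http://video.example/get?id=7&session={i}", 20
        else:
            # Reusable 1 MB object drawn from a fixed set of 2000 URLs.
            url, size_mb = f"http://static.example/obj{rng.randrange(2000)}", 1
        if cache.lookup(url):
            hits += 1
        elif not (deny_query and is_query(url)):
            cache.store(url, size_mb)
    return hits / requests


if __name__ == "__main__":
    print("hit ratio, query URLs cached:", simulate(deny_query=False))
    print("hit ratio, query URLs denied:", simulate(deny_query=True))
```

In this toy workload the video objects can never produce a hit, because each URL is requested exactly once, so every video that is stored only pushes reusable objects out of the LRU list. How large the difference is on a real cache depends entirely on the traffic mix.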
Re: The cache deny QUERY change... partial rollback?
Henrik Nordstrom wrote:

> After analyzing a large cache with a significantly declining hit ratio over the last months, I have come to the conclusion that the removal of cache deny QUERY can have a very negative impact on hit ratio. This is due to a number of flash video sites (youtube, google, various porno sites etc) which include per-view unique query parameters in the URL and respond with a cacheable response.
>
> Because of this I suggest that we add back the cache deny rule in the recommended config, but leave the refresh_pattern change as-is.
>
> People running reverse proxies or combating these cache-busting sites using store rewrites know how to change the cache rules, while many users running general proxy servers are quite negatively impacted by these sites if caching of query URLs is allowed.

Having a single recommended config seems dubious: I for one never run squid as a forward proxy, for instance. We should probably split apart the default / recommended forward and reverse configurations (which are just starting points, right?) and document how to tell which one to start with.

Tres.
--
Tres Seaver [EMAIL PROTECTED]
Palladion Software "Excellence by Design" http://palladion.com
Re: The cache deny QUERY change... partial rollback?
On Mon, 2008-12-01 at 11:00 -0500, Tres Seaver wrote:

> Having a single recommended config seems dubious: I for one never run squid as a forward proxy, for instance. We should probably split apart the default / recommended forward and reverse configurations (which are just starting points, right?) and document how to tell which one to start with.

The example/default configuration shipped with Squid is that of a normal proxy. Reverse proxies do need some changes to that config; it's unavoidable.

Also, if your site is sane then you use query parameters in a sane manner and this whole discussion is irrelevant.

Regards
Henrik
Re: The cache deny QUERY change... partial rollback?
> On Mon, 2008-12-01 at 09:40 -0500, Adrian Chadd wrote:
>
>> Hm, that's kind of interesting actually. What's it displacing from the cache? Is the drop in hit ratio due to the removal of other cacheable large objects, or other cacheable small objects? Is it -just- flash video that's exhibiting this behaviour?
>
> The studied cache is using LRU, and these flash videos effectively reduce the cache size by filling the cache with large objects that are never referenced again.
>
>> Are you able to put up some examples and statistics?
>
> I'll try.
>
>> I really think the right thing to do here is look at what various sites are doing and try to open a dialogue with them. Chances are they don't really know exactly how to (ab)use HTTP to get the semantics they want whilst retaining control over their content.
>
> Probably true. Based on the URL styles there seem to be only two or three of these authentication/session schemes.
>
> Regards
> Henrik

A global blockade is a little harsh when it's only a few offenders. If we can locate a pattern to match just these sites while any dialog is going on, I'd be happy to support a reversal for just them. That would keep most of the main bandwidth gains from doing it in the first place.

Amos
Re: The cache deny QUERY change... partial rollback?
On Tue, 2008-12-02 at 12:35 +1300, Amos Jeffries wrote:

> A global blockade is a little harsh when it's only a few offenders. If we can locate a pattern to match just these sites while any dialog is going on, I'd be happy to support a reversal for just them. That would keep most of the main bandwidth gains from doing it in the first place.

In the analyzed cache there were no identified query objects larger than 10 MB without session identifiers in the query parameters. These objects came from a wide range of sites, with some being more prominent than others. The majority were flash videos, but not all: there were also software downloads and some other data.

Among the flash video sites, there were about 3 different styles in how the query parameters were encoded, suggesting that there are about as many providers of the software used, or that the styles may be related to CDNs (not sure, as it's impossible to tell from the URL alone).

Regards
Henrik
Re: The cache deny QUERY change... partial rollback?
Hmm. Given that heap GDSF out-performs LRU in the common case, and there's a crashing bug in LRU at the moment anyway, maybe the best thing to do is to change the default replacement policy -- and always compile in the heap algorithms?

On 02/12/2008, at 2:05 AM, Henrik Nordstrom wrote:

> On Mon, 2008-12-01 at 09:40 -0500, Adrian Chadd wrote:
>
>> Hm, that's kind of interesting actually. What's it displacing from the cache? Is the drop in hit ratio due to the removal of other cacheable large objects, or other cacheable small objects? Is it -just- flash video that's exhibiting this behaviour?
>
> The studied cache is using LRU, and these flash videos effectively reduce the cache size by filling the cache with large objects that are never referenced again.
>
>> Are you able to put up some examples and statistics?
>
> I'll try.
>
>> I really think the right thing to do here is look at what various sites are doing and try to open a dialogue with them. Chances are they don't really know exactly how to (ab)use HTTP to get the semantics they want whilst retaining control over their content.
>
> Probably true. Based on the URL styles there seem to be only two or three of these authentication/session schemes.
>
> Regards
> Henrik

--
Mark Nottingham [EMAIL PROTECTED]
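To show why a size-aware policy might cope better with large one-off objects, here is a minimal sketch of the Greedy-Dual Size Frequency idea, where each object gets priority = clock + hits / size and the lowest priority is evicted first. It is a simplification with unit cost per object and is not Squid's heap GDSF code; the class name, the sizes, and the example URLs are hypothetical.

```python
import heapq
import itertools


class GDSFCache:
    """Simplified Greedy-Dual Size Frequency cache (unit cost, sizes in MB)."""

    def __init__(self, capacity_mb):
        self.capacity_mb = capacity_mb
        self.used_mb = 0
        self.clock = 0.0                   # inflation value, raised on each eviction
        self.entries = {}                  # url -> [size_mb, hits, priority]
        self.heap = []                     # (priority, tiebreak, url); may hold stale items
        self.tiebreak = itertools.count()  # keeps heap tuples comparable

    def _push(self, url):
        """Recompute the priority of url and queue it for eviction ordering."""
        entry = self.entries[url]
        entry[2] = self.clock + entry[1] / entry[0]
        heapq.heappush(self.heap, (entry[2], next(self.tiebreak), url))

    def lookup(self, url):
        """Return True on a hit; a hit raises the object's priority."""
        entry = self.entries.get(url)
        if entry is None:
            return False
        entry[1] += 1
        self._push(url)
        return True

    def _evict_one(self):
        """Remove the live entry with the lowest priority and return its URL."""
        while True:
            priority, _, url = heapq.heappop(self.heap)
            entry = self.entries.get(url)
            if entry is not None and entry[2] == priority:  # skip stale heap items
                self.clock = priority
                self.used_mb -= entry[0]
                del self.entries[url]
                return url

    def store(self, url, size_mb):
        """Insert an object and return the list of URLs evicted to make room."""
        evicted = []
        if size_mb > self.capacity_mb:
            return evicted
        while self.used_mb + size_mb > self.capacity_mb:
            evicted.append(self._evict_one())
        self.entries[url] = [size_mb, 1, 0.0]
        self.used_mb += size_mb
        self._push(url)
        return evicted


if __name__ == "__main__":
    cache = GDSFCache(capacity_mb=30)
    cache.store("http://static.example/logo.png", 1)
    cache.lookup("http://static.example/logo.png")
    cache.store("http://video.example/get?session=abc", 20)
    # The large, once-referenced video has the lowest priority and goes first.
    print(cache.store("http://static.example/big.iso", 15))
```

A plain LRU list would evict the small, older object first in this sequence, since it only looks at recency. Whether that translates into a better hit ratio on a given cache still needs the kind of statistics asked for earlier in the thread.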