I have been digging around the documentation and wiki to see whether this has 
been done before; it seems not, so it might just be a bad idea...

I'm working on a site that has a large number of dynamic pages. Googlebot is 
going to town spidering everything in sight, and we need to get it under 
control in the short term while we address the underlying performance issues.

The content on these pages needs to be served to humans with a short cache 
time, but for Googlebot we wouldn't mind caching much more aggressively.

So my thought was to manage the cache such that if anyone other than Googlebot 
requests a page, we process it normally with a reasonable TTL and update the 
cache. But if Googlebot requests a page (determined by the User-Agent string), 
we try to serve it from the cache if it's available, even if it's stale, and 
otherwise fetch from the backend and update as normal.
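
What I had in mind is essentially Varnish's grace mode keyed off the 
User-Agent. A rough, untested sketch (Varnish 3.x syntax; the 24h and 15s 
windows are just placeholder values I picked for illustration):

sub vcl_recv {
    if (req.http.User-Agent ~ "Googlebot") {
        # Googlebot may be served objects well past their TTL.
        set req.grace = 24h;
    } else {
        # Everyone else only tolerates a few seconds of staleness.
        set req.grace = 15s;
    }
}

sub vcl_fetch {
    # Keep expired objects around long enough to cover the longest
    # per-request grace window set above.
    set beresp.grace = 24h;
}

As I understand it, vcl_fetch controls how long expired objects are kept in 
the cache, and vcl_recv decides per request how stale an object may be and 
still be served.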

Aside from this maybe being a bad idea, I'm not sure how efficiently it could 
be implemented in Varnish. The reason for trying to handle all of this in 
Varnish is that we can't easily make changes to the underlying CMS.

Is this a good or bad idea? And at what point in the Varnish pipeline would it 
be most efficient to handle this?



-- 
David M Turner <[email protected]>
Collaborative Business Services Ltd
_______________________________________________
varnish-misc mailing list
[email protected]
http://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc
