On 07/03/2011 14:58, Lane, Richard wrote:

I am looking into supporting Google’s “First Click Free for Web Search”. I need to allow Googlebot to index the full content of my sites while keeping the registration wall in place for everyone else. Google suggests detecting its crawlers with a reverse DNS lookup of the requester's IP.

Google's description: http://www.google.com/support/webmasters/bin/answer.py?answer=80553

Has anyone done DNS lookups via VCL to verify access to content or to cache content?

I believe this /could/ be done using an inline C function in the VCL, but it's not something I have any experience with.

What you could do is detect the Googlebot user-agent in Varnish, then rewrite the request so that a backend script receives the client IP and the original URL, for example:
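/* Varnish 2.0.6 pseudocode, goes in vcl_recv - may need updating for newer versions */
if (req.http.User-Agent ~ "Googlebot") {
    /* keep the original URL in a request header for the backend script */
    set req.http.X-Varnish-OriginalUrl = req.url;
    /* originalurl is not URL-encoded here, so the script may prefer the header */
    set req.url = "/googlecheck?ip=" client.ip "&originalurl=" req.url;
    lookup;
}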
The googlecheck script then does the rDNS lookup and, if it matches, returns the contents of the requested URL.
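
As a rough idea of what the check inside such a script might look like, here is a minimal Python sketch. It assumes the usual two-step verification: reverse-resolve the IP, check that the hostname ends in googlebot.com or google.com, then forward-resolve that hostname and confirm it maps back to the same IP. The function name, the suffix list and the injectable resolver arguments are my own choices for illustration, not anything Varnish or Google provides, and the forward lookup used here only handles IPv4.

```python
import socket

# Assumed list of hostname suffixes accepted as Google crawlers.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_verified_googlebot(ip,
                          reverse=socket.gethostbyaddr,
                          forward=socket.gethostbyname_ex):
    """Return True if ip reverse-resolves to a Google hostname
    that forward-resolves back to the same ip (IPv4 only)."""
    try:
        # gethostbyaddr returns (hostname, aliases, addresses)
        hostname = reverse(ip)[0]
    except (socket.herror, socket.gaierror):
        return False

    # The leading dot in each suffix rejects names like "evilgooglebot.com".
    if not hostname.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
        return False

    try:
        # gethostbyname_ex returns (hostname, aliases, addresses)
        addresses = forward(hostname)[2]
    except (socket.herror, socket.gaierror):
        return False

    # The forward lookup must include the IP we started with.
    return ip in addresses
```

The script would call is_verified_googlebot() with the ip parameter from the rewritten URL and only serve the full content when it returns True. Since each call makes two DNS lookups, caching the result per IP is probably worthwhile.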

Richard Chiswell
http://www.mangahigh.com
(Speaking personally yadda yadda)