On 07/03/2011 14:58, Lane, Richard wrote:
I am looking into supporting Google’s “First Click Free for Web
Search”. I need to allow the GoogleBots to index the full content of
my sites but still maintain the Registration wall for everyone else.
Google suggests that you detect their Googlebots by a reverse DNS lookup
of the requester's IP.
Google Desc:
http://www.google.com/support/webmasters/bin/answer.py?answer=80553
Has anyone done DNS lookups via VCL to verify access to content or to
cache content?
I believe this /could/ be done using a C function, but it's not
something I've had experience of before.
What you could do is detect the Google user-agent in Varnish, and then
pass the client IP and the original URL to a backend script, for example:
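/* Varnish 2.0.6 pseudo code (goes in vcl_recv) - may need updating */
/* "~" is a regex match; the real user-agent string is longer than "Googlebot" */
if (req.http.user-agent ~ "Googlebot") {
    /* keep the original URL in a header for the backend script */
    set req.http.x-varnish-originalurl = req.url;
    set req.url = "/googlecheck?ip=" client.ip "&originalurl=" req.url;
    lookup;
}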
The googlecheck script then does the actual rDNS lookup and, if it
matches, returns the contents of the requested URL.
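
For the check inside the googlecheck script, here is a rough Python sketch
of a forward-confirmed reverse DNS lookup. This is only an illustration
under some assumptions: the function name and the two injectable lookup
parameters are made up for the example, the accepted host suffixes
(googlebot.com and google.com) should be checked against Google's current
documentation, and the default forward lookup only handles IPv4.

```python
import socket

# Assumption: host suffixes Google documents for its crawlers; verify before use.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_verified_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Return True if ip reverse-resolves to a Google host that resolves back to ip.

    reverse_lookup and forward_lookup are hypothetical hooks so the DNS calls
    can be replaced (for testing or caching); by default the socket module is used.
    """
    if reverse_lookup is None:
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        # gethostbyname_ex returns (hostname, aliases, ipv4_addresses)
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]

    # Step 1: reverse lookup (PTR record) of the client IP.
    try:
        host = reverse_lookup(ip)
    except OSError:
        return False

    # Step 2: the host name must sit under one of the expected domains.
    if not host.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
        return False

    # Step 3: forward lookup must map back to the same IP, otherwise
    # anyone controlling their own PTR record could pretend to be Google.
    try:
        addresses = forward_lookup(host)
    except OSError:
        return False
    return ip in addresses
```

The script would call is_verified_googlebot() with the ip parameter passed
from Varnish and only serve the full content when it returns True. DNS
lookups are slow, so caching the result per IP is probably worthwhile.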
Richard Chiswell
http://www.mangahigh.com
(Speaking personally yadda yadda)