I believe that Apache ProxyPass _will_ send an X-Forwarded-For header
for you. But you're right that the "forwarded for" IP address will in
your case be an internal-only IP that doesn't mean anything to Google,
if it's there at all. But who knows what Google's 'traffic defender'
routines do; maybe they would realize, in the presence of that
X-Forwarded-For, not to limit you, even though the forwarded-for IP is
meaningless.
Who knows, and Google probably won't say (because they don't want to
give any extra info to people maliciously trying to get around it).
Do please keep us updated on whether this new solution works and prevents
the traffic-limiting defense that you were getting before. If it does, then
the question would be why, but X-Forwarded-For (which I _think_
ProxyPass will send) may indeed be the answer.
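For what it's worth, the usual convention is that a reverse proxy appends
the address of the connecting client to whatever X-Forwarded-For value is
already on the request. A minimal Python sketch of that convention follows.
It is not Apache's code, and the function name append_forwarded_for is made
up for illustration.

```python
def append_forwarded_for(headers, client_ip):
    """Return a copy of headers with client_ip appended to X-Forwarded-For.

    Hypothetical helper illustrating the convention only.
    """
    out = dict(headers)
    # Header names are case-insensitive, so accept any spelling already present.
    existing_key = next((k for k in out if k.lower() == "x-forwarded-for"), None)
    if existing_key is None:
        out["X-Forwarded-For"] = client_ip
    else:
        # Earlier hops stay on the left; the newest client goes on the right.
        out[existing_key] = out[existing_key] + ", " + client_ip
    return out
```

With a client behind a NAT, the value appended here would be whatever
address the proxy itself sees, which may well be a 10.x.x.x address.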
Jonathan
Boheemen, Peter van wrote:
I don't think I do anything sophisticated like X-Forwarded-For. I just have a
ProxyPass directive in the Apache configuration telling it to reverse proxy a
directory to Google:
ProxyPass /googlebooks http://books.google.com/books
But what if Google did something with an X-Forwarded-For header? It cannot see
where the actual user is located. Behind a NAT, 10.0.0.0 addresses are usually
used. In fact it does not matter which IP addresses are used behind the NAT.
Since they are not exposed to the outside world, it is only relevant that they
are unique within the network behind the NAT.
Anyway, since we only hit Google Books from the server when a user asks for
display of a full record, I hardly expect that to trip Google's triggers.
I suspect that the few thousand PCs within the university campus hitting
Google cause the problem, which Google Books in particular reacts to. (I can
still search Google when Google Books rejects access from my IP address.)
I'll keep you informed.
Peter
Drs. P.J.C. van Boheemen
Hoofd Applicatieontwikkeling en beheer - Bibliotheek Wageningen UR
Head of Application Development and Management - Wageningen University and
Research Library
tel. +31 317 48 25 17
http://library.wur.nl
________________________________
From: Code for Libraries on behalf of Jonathan Rochkind
Sent: Tue 18-3-2008 18:48
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Restricted access fo free covers from Google :)
Nice. X-Forwarded-For would also allow Google to deliver availability
information suitable for the actual location of the end-user, if their
software chooses to pay attention to it. Not knowing that location is the
objection to server-side API requests voiced to me by a Google person. (By
proxying everything through the server, you are essentially doing what I
wanted to do in the first place but Google told me they would not allow.
Ironic if you have more luck with that than with the actual client-side
AJAXy requests that Google said they required!)
Thanks for alerting us to X-Forwarded-For, that's a good idea.
Jonathan
Joe Hourcle wrote:
On Tue, 18 Mar 2008, Jonathan Rochkind wrote:
Wait, now ALL of your clients' calls are coming from one single IP?
Surely that will trigger Google's detectors, if the NAT did. Keep us
updated though.
I don't know what Peter's exact implementation is, but they might relax
the limits when they see an 'X-Forwarded-For' header, or something else to
suggest it's coming through a proxy. It used to be pretty common when
writing rate limiting code to use X-Forwarded-For in place of REMOTE_ADDR
so you didn't accidentally ban groups behind proxies. (Of course, I don't
know if the X-Forwarded-For value is something that's not routable (in
10/8), or the NAT IP, so it might still look like 1 IP address behind a
proxy.)
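A rough Python sketch of that kind of rate-limiting logic, under stated
assumptions: the function name rate_limit_key is made up for illustration,
and nothing here is known to match what Google actually does. It counts
requests against the first X-Forwarded-For entry when that entry is a
publicly routable address, and otherwise falls back to the connecting
address (REMOTE_ADDR in CGI terms), which covers the 10/8 case.

```python
import ipaddress


def rate_limit_key(remote_addr, forwarded_for=None):
    """Pick the address to count requests against (hypothetical helper)."""
    if forwarded_for:
        # By convention the left-most entry is the original client.
        candidate = forwarded_for.split(",")[0].strip()
        try:
            ip = ipaddress.ip_address(candidate)
        except ValueError:
            # Not an IP address at all, so ignore the header.
            return remote_addr
        if ip.is_global:
            return str(ip)
    # No header, or a private address such as 10.x.x.x that says nothing
    # about who the client is: fall back to the connecting address.
    return remote_addr
```

For example, rate_limit_key("203.0.113.9", "10.1.2.3") falls back to
"203.0.113.9", so everyone behind that NAT still looks like one address.
Note that the header is supplied by the requesting side and can be forged,
so a real limiter would have reason not to trust it blindly.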
Also, by using a caching proxy (if the responses are cacheable), the total
number of requests going to Google might be reduced.
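To illustrate the caching idea, here is a minimal Python sketch of a
time-limited cache sitting in front of whatever function fetches from the
remote service. The class name TTLCache, the fetch callback, and the
expiry time are all assumptions for illustration; whether the responses
may be cached at all is a separate question.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested
        self.store = {}     # key -> (expiry time, cached value)

    def get_or_fetch(self, key, fetch):
        """Return the cached value for key, calling fetch(key) on a miss."""
        now = self.clock()
        hit = self.store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]
        # Miss or expired entry: this is the only path that goes upstream.
        value = fetch(key)
        self.store[key] = (now + self.ttl, value)
        return value
```

Repeated displays of the same record within the expiry window would then
cost one upstream request instead of one per display.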
I would assume they'd need to have some consideration for proxies, as I
remember the days when AOL's proxy servers channeled all requests through
fewer than a dozen unique IP addresses. (or at least, those were the only
ones hitting my servers)
-Joe
--
Jonathan Rochkind
Digital Services Software Engineer
The Sheridan Libraries
Johns Hopkins University
410.516.8886
rochkind (at) jhu.edu