I see posts about this issue going back years, so sorry if I'm beating a
dead horse, but I haven't been able to find any resolution.

We have a paid app on App Engine that we've been using to serve a commercial
web app for 3 years. We have one application that serves different content
for different clients. Each client has a reverse proxy set up on their web
server to fetch the content from our custom domain on App Engine. We use the
reverse proxy simply to mask our domain behind the clients' domains. There
is no caching, and the reverse proxy is Apache2 with its out-of-the-box
configuration.

On March 26, after 2 years of happily serving content to a particular
client's reverse proxy, Google for some reason decided that this server was
violating its Terms of Service and started denying content to that client's
reverse proxy, redirecting users to the www.google.com/sorry/misc page with
the message: "Our systems have detected unusual traffic from your computer
network." This of course made our application totally unusable. We sent
requests to Google for more information and heard nothing. The next day App
Engine decided that particular server was OK again and resumed serving our
content to it.
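
A minimal sketch of how a monitoring script could recognize this redirect,
assuming the block shows up as an HTTP 3xx response whose Location header
points at www.google.com/sorry/. The function name is_sorry_redirect and the
set of status codes are illustrative assumptions, not anything documented by
Google.

```python
from urllib.parse import urlparse

# Assumption: the block arrives as an ordinary HTTP redirect.
REDIRECT_CODES = (301, 302, 303, 307, 308)


def is_sorry_redirect(status, location):
    """Return True if a response looks like a redirect to Google's /sorry/ page.

    status   -- the HTTP status code the proxy received from the backend
    location -- the value of the Location header, or None if absent
    """
    if status not in REDIRECT_CODES or not location:
        return False
    parsed = urlparse(location)
    host = (parsed.hostname or "").lower()
    # Only the host and path prefix are checked; the query string is ignored.
    return host in ("google.com", "www.google.com") and parsed.path.startswith("/sorry/")
```

Running a check like this against each backend response (or against the
proxy's logs) would at least give a timestamp for when the block starts and
stops, which is the kind of detail we were missing.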

Our app is very low volume, averaging about 0.05 requests/second. There were
no traffic spikes that day. There were no configuration changes to the 
reverse proxy or any of our infrastructure.

The primary information I can find on the issue is here: 
http://support.google.com/websearch/bin/answer.py?hl=en&answer=86640&rd=1.

That page suggests that the client's server was doing one of these things:

   - Sending automated queries
   - Using software that sends queries to Google to determine how a website
     or webpage ranks on Google for various queries
   - 'Meta searching' Google
   - Performing 'offline' searches on Google

I could find no evidence of any requests being sent to Google search. There
were open requests to one of Google's nameservers, presumably to look up
our app's IP from its Google Apps custom domain name. Surely that isn't a
violation of the Terms of Service. We found no malware on the machine. So at
this point we have no idea why Google stopped serving the content to that
particular server, or why it resumed service. Additionally, all our other
clients' reverse proxies continued to work fine. There was even another
reverse proxy successfully fetching the same content that Google was
denying to the blocked proxy.

Searching through previous posts, the best information I can gather is that
maybe our proxy's headers are malformed and Google doesn't like them. But
why would Google randomly complain after 2 years of happily serving content
to this same proxy with the same headers?

Previous posts described this problem as a landmine, where stepping in the 
wrong place can trigger it. Seems more like a surprise missile attack to me 
because we were simply walking the same path we'd walked every day for 2 
years when everything blew up.

Obviously this is totally unacceptable. We can't very well offer a 
commercial service to clients with the caveat that it might blow up at any 
time, and we have no idea when or why.

I also don't understand the connection between Google Search's Terms of
Service and my paid App Engine app. Why does Google deny service to my paid
application when it thinks some machine is violating its search policies?
Even if that machine were violating its search policies, if I want to serve
content to the violating machine from my totally un-Google-search-related
web app, I should be able to. Granted, a DoS situation could be a valid
reason for denying service to my App Engine app, but violating Google's
search policies is totally unrelated to my App Engine app, and I should be
able to serve content from my paid application to whomever I want.

Can Google or anyone here on the forum shed some light on why this might
have happened and what I can do to prevent it? Will turning on PageSpeed
make a difference, since presumably content would be served by edge caches
and requests wouldn't hit the App Engine instance all the time?

This issue has been around for years and clearly is still a huge problem. 
It would be great to get some transparency.

Thanks for any help,
Peter
