Hey Taengoo,

It looks like you may have stumbled on a valid Feature Request in the 
making. The docs explain that content-encoding: gzip responses are served 
based on a combination of the User-Agent and Accept-Encoding headers 
<https://cloud.google.com/appengine/kb/general#compression>; however, it 
appears that the Twitterbot UA string doesn't pass the test. 

Attached is a .tar.gz containing an example app you can deploy, and a 
script you can use, to test this behaviour on App Engine. If you change the 
application id in app.yaml inside the app/ directory, you can deploy the 
app. At that point, you'll want to run:

./curl-uas.sh 1.testheaders.APPID.appspot.com

where APPID is your actual app id. 

This script runs through the user-agents in user-agents.txt, which contains 
the most statistically popular UA strings on the web at the moment, along 
with several test values. You'll notice that your observations are 
replicated for Twitterbot-style UA strings, while the special User-Agent 
"gzip", as explained in the docs, can force compression.

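For anyone reading without the attachment, here is a rough sketch of the kind 
of loop such a script could run. This is not the attached curl-uas.sh; the 
helper names is_gzipped and check_ua, the plain http:// scheme, and the 
one-UA-per-line layout of user-agents.txt are all assumptions made for 
illustration.

```shell
#!/bin/sh
# Hypothetical sketch, NOT the attached curl-uas.sh.

# Succeeds if the response headers on stdin include "Content-Encoding: gzip".
is_gzipped() {
    tr -d '\r' | grep -qi '^content-encoding:[[:space:]]*gzip'
}

# Request one host with one User-Agent, always sending Accept-Encoding: gzip,
# and report whether the response came back gzipped.
check_ua() {
    host=$1
    ua=$2
    # -D - dumps the response headers to stdout; the body is discarded.
    if curl -s -o /dev/null -D - -A "$ua" -H 'Accept-Encoding: gzip' \
            "http://$host/" | is_gzipped; then
        printf 'gzip      %s\n' "$ua"
    else
        printf 'identity  %s\n' "$ua"
    fi
}

# Only hit the network when a host is passed on the command line.
# Assumes user-agents.txt holds one UA string per line.
if [ "$#" -ge 1 ]; then
    while IFS= read -r ua; do
        if [ -n "$ua" ]; then
            check_ua "$1" "$ua"
        fi
    done < user-agents.txt
fi
```

With a layout like this, putting a line containing just gzip into 
user-agents.txt would exercise the special User-Agent mentioned above next to 
the Twitterbot-style strings.
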
I think you should open a Feature Request thread in the public issue tracker 
<http://code.google.com/p/googleappengine/issues/list> to either have the 
Twitterbot UA included in the list of those which can accept gzip if they 
request it via Accept-Encoding, or to simply have the Accept-Encoding 
header be respected.

If possible, you could modify your Twitterbot to use UA "gzip", in order to 
simply get it working today.

Best wishes,

Nick

On Monday, June 29, 2015 at 6:27:04 AM UTC-4, Taengoo Taengstagram wrote:
>
> I've noticed that when Twitterbot crawls my app on GAE, the response does 
> not appear to be gzipped (as seen from the response size in bytes in the 
> GAE logs). I've also tested this with other apps deployed on 
> *.appspot.com, for example https://ga-dev-tools.appspot.com/.
>
> To illustrate, I'm using a test user agent "Twitterbot/9.0", although the 
> actual Twitter user agent is "Twitterbot/1.0".
>
> # Test case 1: With a generic Mozilla useragent Mozilla/9.0 + gzip 
> headers, response returned is gzipped
> $ curl 'https://ga-dev-tools.appspot.com/' -H 'Accept-Encoding: gzip, 
> deflate, sdch' --compressed -A 'Mozilla/9.0' -i
>
> HTTP/1.1 200 OK
> Content-Type: text/html; charset=utf-8
> Cache-Control: no-cache
> Content-Encoding: gzip
> Vary: Accept-Encoding
> Date: Mon, 29 Jun 2015 10:11:35 GMT
> Server: Google Frontend
> Alternate-Protocol: 443:quic,p=1
> Transfer-Encoding: chunked
>
> # Test case 2: With a Twitterbot useragent Twitterbot/9.0 + gzip headers, 
> response returned is not gzipped
> $ curl 'https://ga-dev-tools.appspot.com/' -H 'Accept-Encoding: gzip, 
> deflate, sdch' --compressed -A 'Twitterbot/9.0' -i
>
> HTTP/1.1 200 OK
> Content-Type: text/html; charset=utf-8
> Cache-Control: no-cache
> Date: Mon, 29 Jun 2015 10:12:06 GMT
> Server: Google Frontend
> Content-Length: 7956
> Alternate-Protocol: 443:quic,p=1
>
> # Test case 3: With a generic Mozilla useragent Mozilla/9.0 + no gzip 
> headers, response returned is not gzipped
> $ curl 'https://ga-dev-tools.appspot.com/' -A 'Mozilla/9.0' -i
>
> HTTP/1.1 200 OK
> Content-Type: text/html; charset=utf-8
> Cache-Control: no-cache
> Date: Mon, 29 Jun 2015 10:13:17 GMT
> Server: Google Frontend
> Content-Length: 7956
> Alternate-Protocol: 443:quic,p=1
>
>
> You will notice that GAE is returning identical responses for test #2 
> (Twitterbot) and #3 (uncompressed request). This is unexpected and rather 
> puzzling. Any idea why?
>
>
>


Attachment: test-ua-content-encoding.tar.gz
Description: Binary data
