Hey Taengoo,

Glad to hear that. I've processed the issue and will shortly update that 
thread with a tracking number for the feature request, so that the thread 
can be updated as progress is made. 

I also appreciate that it's not always possible to set User-Agent: gzip, so 
point taken there. I look forward to seeing where this goes, since as you 
say, serving compressed content is one of the most important performance 
optimizations one can implement.
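
For anyone comparing responses by hand, here is a minimal sketch in Python of 
the check being done by eye in the test cases quoted below: given the text 
printed by `curl -i`, report whether the response carried Content-Encoding: 
gzip. It is not part of the attached script, and the names parse_headers and 
is_gzipped are illustrative only. The two sample header blocks are copied from 
test cases 1 and 2 of the original report.

```python
def parse_headers(raw):
    """Return the header fields of a `curl -i` response as a dict with lowercase keys."""
    # Headers end at the first blank line; HTTP uses CRLF, so normalise first.
    head = raw.replace("\r\n", "\n").split("\n\n", 1)[0]
    headers = {}
    for line in head.split("\n")[1:]:  # skip the status line
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return headers


def is_gzipped(raw):
    """True if the response declares gzip among its content encodings."""
    encoding = parse_headers(raw).get("content-encoding", "")
    return "gzip" in [token.strip().lower() for token in encoding.split(",")]


# Header block from test case 1 (Mozilla/9.0, Accept-Encoding sent).
MOZILLA = """HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
Cache-Control: no-cache
Content-Encoding: gzip
Vary: Accept-Encoding
Date: Mon, 29 Jun 2015 10:11:35 GMT
Server: Google Frontend
Alternate-Protocol: 443:quic,p=1
Transfer-Encoding: chunked

"""

# Header block from test case 2 (Twitterbot/9.0, Accept-Encoding sent).
TWITTERBOT = """HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
Cache-Control: no-cache
Date: Mon, 29 Jun 2015 10:12:06 GMT
Server: Google Frontend
Content-Length: 7956
Alternate-Protocol: 443:quic,p=1

"""

if __name__ == "__main__":
    print("Mozilla/9.0 gzipped:", is_gzipped(MOZILLA))
    print("Twitterbot/9.0 gzipped:", is_gzipped(TWITTERBOT))
```

Piping the output of each curl command into a check like this makes it easier 
to compare many User-Agent strings at once than reading the headers manually.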

Best wishes,

Nick

On Tuesday, June 30, 2015 at 12:06:14 AM UTC-4, Taengoo Taengstagram wrote:
>
> I've logged it as issue #12104 
> https://code.google.com/p/googleappengine/issues/detail?id=12104
>
> Thanks for pointing out the presence of a whitelist. This explains why 
> I've seen uncompressed responses in the logs for possibly lesser-known 
> mobile user agents, such as custom embedded webviews. This is unfortunate, 
> since it is precisely these mobile devices that stand to gain the most 
> from compressed content.
>
> Also to note, application/site owners are rarely in a position to request 
> that crawlers/users modify their user agent string to comply with such a 
> specific requirement for GAE.
>
>
> On Tuesday, June 30, 2015 at 3:27:59 AM UTC+8, Nick (Cloud Platform 
> Support) wrote:
>>
>> Hey Taengoo,
>>
>> It seems as though you may have stumbled on a valid Feature Request in 
>> the making. The docs explain that Content-Encoding: gzip responses are 
>> served based on a combination of the User-Agent and Accept-Encoding 
>> headers <https://cloud.google.com/appengine/kb/general#compression>. 
>> However, it appears that the Twitterbot UA string doesn't pass the test. 
>>
>> Attached is a .tar.gz containing an example app you can deploy, and a 
>> script you can use, to test this behaviour on App Engine. If you change the 
>> application id in app.yaml inside the app/ directory, you can deploy the 
>> app. At that point, you'll want to run:
>>
>> ./curl-uas.sh 1.testheaders.APPID.appspot.com
>>
>> Where your APPID will be an actual app id. 
>>
>> This script runs through the user-agents in user-agents.txt, which 
>> contains the most statistically popular UA strings on the web at the moment, 
>> along with several test values. You'll notice that your observations are 
>> replicated for Twitterbot-style UA strings, while the special User-Agent 
>> "gzip", as explained in the docs, can force compression.
>>
>> I think you should open a Feature Request thread in the public issue 
>> tracker <http://code.google.com/p/googleappengine/issues/list> to either 
>> have the Twitterbot UA included in the list of those which can accept gzip 
>> if they request it via Accept-Encoding, or to simply have the 
>> Accept-Encoding header be respected.
>>
>> If possible, you could modify your Twitterbot to use UA "gzip", in order 
>> to simply get it working today.
>>
>> Best wishes,
>>
>> Nick
>>
>> On Monday, June 29, 2015 at 6:27:04 AM UTC-4, Taengoo Taengstagram wrote:
>>>
>>> I've noticed that when Twitterbot crawls my app on GAE, the response does 
>>> not appear to be gzipped (as seen from the response size in bytes in the 
>>> GAE logs). I've also tested this with other apps deployed on 
>>> *.appspot.com, for example https://ga-dev-tools.appspot.com/.
>>>
>>> To illustrate, I'm using a test user agent "Twitterbot/9.0", although 
>>> the actual Twitter user agent is "Twitterbot/1.0".
>>>
>>> # Test case 1: With a generic Mozilla useragent Mozilla/9.0 + gzip 
>>> headers, response returned is gzipped
>>> $ curl 'https://ga-dev-tools.appspot.com/' -H 'Accept-Encoding: gzip, 
>>> deflate, sdch' --compressed -A 'Mozilla/9.0' -i
>>>
>>> HTTP/1.1 200 OK
>>> Content-Type: text/html; charset=utf-8
>>> Cache-Control: no-cache
>>> Content-Encoding: gzip
>>> Vary: Accept-Encoding
>>> Date: Mon, 29 Jun 2015 10:11:35 GMT
>>> Server: Google Frontend
>>> Alternate-Protocol: 443:quic,p=1
>>> Transfer-Encoding: chunked
>>>
>>> # Test case 2: With a Twitterbot useragent Twitterbot/9.0 + gzip 
>>> headers, response returned is not gzipped
>>> $ curl 'https://ga-dev-tools.appspot.com/' -H 'Accept-Encoding: gzip, 
>>> deflate, sdch' --compressed -A 'Twitterbot/9.0' -i
>>>
>>> HTTP/1.1 200 OK
>>> Content-Type: text/html; charset=utf-8
>>> Cache-Control: no-cache
>>> Date: Mon, 29 Jun 2015 10:12:06 GMT
>>> Server: Google Frontend
>>> Content-Length: 7956
>>> Alternate-Protocol: 443:quic,p=1
>>>
>>> # Test case 3: With a generic Mozilla useragent Mozilla/9.0 + no gzip 
>>> headers, response returned is not gzipped
>>> $ curl 'https://ga-dev-tools.appspot.com/' -A 'Mozilla/9.0' -i
>>>
>>> HTTP/1.1 200 OK
>>> Content-Type: text/html; charset=utf-8
>>> Cache-Control: no-cache
>>> Date: Mon, 29 Jun 2015 10:13:17 GMT
>>> Server: Google Frontend
>>> Content-Length: 7956
>>> Alternate-Protocol: 443:quic,p=1
>>>
>>>
>>> You will notice that GAE is returning identical responses for test #2 
>>> (Twitterbot) and #3 (uncompressed request). This is unexpected and rather 
>>> puzzling. Any idea why?
>>>
>>>
>>>
