I'm trying to load a remote gzip file in the App Engine Python environment,
but the downloaded data is already decompressed by the time I read it. As a
result, it gets truncated (or I get a ResponseTooLargeError).

Here is my code.

================================================================================
from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app
import logging

class Test(webapp.RequestHandler):
    def get(self):
        import urllib2, gzip, StringIO

        # Fetch the remote .gz file and log how many bytes came back.
        req = urllib2.Request('http://atsushi-takayama.com/siteinfo/wedataAutoPagerize.json.gz')
        req.add_header('Accept', 'application/x-gzip')
        response = urllib2.urlopen(req)
        raw_data = response.read()
        logging.debug('data length :' + str(len(raw_data)))

        # Decompress the body in memory; this is where the IOError is raised.
        stream = StringIO.StringIO(raw_data)
        decompressor = gzip.GzipFile(fileobj=stream)
        json = decompressor.read()
        logging.debug('json length :' + str(len(json)))

application = webapp.WSGIApplication([(r'/siteinfo/test', Test)], debug=False)

def main():
    run_wsgi_app(application)

if __name__ == "__main__":
    main()

================================================================================

This is the error I get.

================================================================================
DEBUG:root:Could not import "icglue": Disallowed C-extension or built-in module
WARNING:root:Stripped prohibited headers from URLFetch request: ['Host']
DEBUG:root:Making HTTP request: host = atsushi-takayama.com, url = http://atsushi-takayama.com/siteinfo/wedataAutoPagerize.json.gz, payload = None, headers = {'Accept-Encoding': 'gzip', 'Connection': 'close', 'Accept': 'application/x-gzip', 'User-Agent': 'Python-urllib/2.5 AppEngine-Google; (+http://code.google.com/appengine)', 'Host': 'atsushi-takayama.com', 'Referer': 'http://localhost/'}
DEBUG:root:data length :1062763
ERROR:root:Not a gzipped file
Traceback (most recent call last):
  File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/ext/webapp/__init__.py", line 507, in __call__
    handler.get(*groups)
  File "/Users/atsushi/Dropbox/Programming/gae/Atsushuu/siteinfo.py", line 279, in get
    json = decompressor.read()
  File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/gzip.py", line 220, in read
    self._read(readsize)
  File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/gzip.py", line 263, in _read
    self._read_gzip_header()
  File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/gzip.py", line 164, in _read_gzip_header
    raise IOError, 'Not a gzipped file'
IOError: Not a gzipped file
================================================================================

The data length, 1062763, is the size of the uncompressed file, so the body
has already been decompressed by the time I read it.
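
To double-check that the body really arrives already decompressed, a guard
like the one below could go in front of the GzipFile call. This is only a
sketch: looks_gzipped and gunzip_if_needed are names made up for
illustration, and the only fact it relies on is that every gzip stream starts
with the two bytes 0x1f 0x8b.

```python
import gzip
import struct

try:
    from cStringIO import StringIO as BytesIO  # Python 2.x
except ImportError:
    from io import BytesIO                     # newer Pythons

# Every gzip stream starts with the two bytes 0x1f 0x8b (RFC 1952).
GZIP_MAGIC = struct.pack('BB', 0x1f, 0x8b)

def looks_gzipped(data):
    """Return True if data begins with the gzip magic bytes (hypothetical helper)."""
    return data[:2] == GZIP_MAGIC

def gunzip_if_needed(data):
    """Decompress data only if it is still gzipped; otherwise return it unchanged."""
    if not looks_gzipped(data):
        return data
    return gzip.GzipFile(fileobj=BytesIO(data)).read()
```

In the handler, json = gunzip_if_needed(raw_data) would stand in for the
GzipFile lines. It would avoid the IOError, but it does nothing about the
body being cut off.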

In the remote environment, I get this.

================================================================================
D 08-11 02:03PM 40.316 data length :1048576
E 08-11 02:03PM 40.320 Not a gzipped file
Traceback (most recent call last):
  File "/base/python_lib/versions/1/google/appengine/ext/webapp/__init__.py", line 507, in __call__
    handler.get(*groups)
  File "/base/data/home/apps/atsushuu/1.335550907989224563/siteinfo.py", line 279, in get
    json = decompressor.read()
  File "/base/python_dist/lib/python2.5/gzip.py", line 220, in read
    self._read(readsize)
  File "/base/python_dist/lib/python2.5/gzip.py", line 263, in _read
    self._read_gzip_header()
  File "/base/python_dist/lib/python2.5/gzip.py", line 164, in _read_gzip_header
    raise IOError, 'Not a gzipped file'
IOError: Not a gzipped file
================================================================================

It makes no difference whether I call
req.add_header('Accept', 'application/x-gzip'), call
req.add_header('Accept-encoding', 'gzip'), or set no extra headers at all.
(As the log above shows, the request already carries
'Accept-Encoding': 'gzip'.)

Notice that the data length is now 1048576 (= 1 MB). If I use urlfetch
instead of urllib2, the local error message is the same, but the
remote server raises ResponseTooLargeError. This is exactly what
happens when I try to download an uncompressed file.
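
To spell out the numbers from the two logs, here is the arithmetic as a small
sketch. The constant and function names are made up for illustration, and
reading 1 MB as the point where the remote response gets cut off is my
interpretation of the logs, not something I have confirmed.

```python
ONE_MB = 1024 * 1024       # 1048576 bytes

LOCAL_LENGTH = 1062763     # local log: full, already decompressed body
REMOTE_LENGTH = 1048576    # remote log: length of the body I actually received

def bytes_lost(full_length, received_length):
    """Number of bytes missing from a truncated response (hypothetical helper)."""
    return full_length - received_length

def cut_at_limit(received_length, limit=ONE_MB):
    """True if the received length is exactly the assumed 1 MB cut-off."""
    return received_length == limit
```

With these values, cut_at_limit(REMOTE_LENGTH) is true and
bytes_lost(LOCAL_LENGTH, REMOTE_LENGTH) is 14187, so only the last 14187
bytes of the uncompressed file go missing remotely.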

Does anyone know how to download a gzip file and decompress it in the App
Engine environment?

I'm using App Engine SDK 1.2.4, Mac OS X 10.5.7, and Python 2.5.1.
