On Apr 14, 6:55 pm, Graham Dumpleton <graham.dumple...@gmail.com>
wrote:
> On Apr 15, 7:49 am, Alex Loddengaard <a...@cloudera.com> wrote:
>
> > I've found several messages on this list discussing ways to send large files
> > in a HttpResponse. One can use FileWrapper, or one can use a generator and
> > yield chunks of the large file. What about the case when the large file is
> > generated at HTTP request time? In this case, it would be annoying to have
> > the user wait for the page to generate the large file and then stream the
> > file. Instead we would want a way to start the HTTP response (so that the
> > user gets the download dialogue), generate the large file, and then stream
> > the file. Let's take the following example:
>
> > def create_tarball():
> >     path = create_some_big_tarball()
> >     chunk = None
> >     fh = open(path, 'rb')
> >     while True:
> >         chunk = fh.read(1024 * 128)
> >         if chunk == '':
> >             break
> >         yield chunk
> >
> > def sample_view(request):
> >     response = HttpResponse(create_tarball(),
> >                             mimetype='application/x-compressed')
> >     response['Content-Disposition'] = "attachment;filename=mytarball.tar.gz"
> >     return response
>
> > The above example nearly accomplishes what we want, but it doesn't start the
> > HTTP response before the tarball is created, hence making the user wait a
> > long time before the download dialogue box shows up. Let's try something
> > like this (notice the addition of a noop yield):
>
> > def create_tarball():
> >     yield '' # noop to send the HTTP headers
> >     path = create_some_big_tarball()
> >     chunk = None
> >     fh = open(path, 'rb')
> >     while True:
> >         chunk = fh.read(1024 * 128)
> >         if chunk == '':
> >             break
> >         yield chunk
> >
> > def sample_view(request):
> >     response = HttpResponse(create_tarball(),
> >                             mimetype='application/x-compressed')
> >     response['Content-Disposition'] = "attachment;filename=mytarball.tar.gz"
> >     return response
>
> > The issue with the above example is that the "yield ''" seems to be
> > ignored. HTTP headers are not sent before the tarball is created.
> > Similarly, "yield ' '" and "yield None" don't work, because they corrupt the
> > tarball (HttpResponse calls str() on the iterable items given to the
> > HttpResponse constructor). As a temporary solution, we're writing an empty
> > gzip file in the first yield. Our large tarball is gzipped, and since gzip
> > files can be concatenated to one another, our hack seems to be working.
> > In the above example, replace the first "yield ''" with:
>
> > noop = StringIO.StringIO()
> > empty = gzip.GzipFile(mode='w', fileobj=noop)
> > empty.write("")
> > empty.close()
> > yield noop.getvalue()
>
> > I'm wondering if there is a better way to accomplish this? I don't quite
> > understand why HTTP responses are written to stdout. Possibly orthogonal to
> > that, it seems like, in theory, yielding an empty value in the generator
> > should work, because a flush is called after the HTTP headers are written.
> > Any ideas, either on how to solve this problem with the Django API, or on
> > why Django doesn't send HTTP headers on a "yield ''"?
>
> From memory, file wrappers at the Django level, in order to work across
> the different hosting mechanisms supported, only allow a file name to be
> supplied. At the WSGI level the file wrapper actually takes a file-like
> object. If you were doing this in raw WSGI, you could run your
> tarball creation as a separately exec'd pipeline and, rather than
> create a file in the file system, have tar output to the pipeline,
> i.e., use '-' instead of a filename. The file object resulting from the
> pipeline could then be used as input to the WSGI file wrapper object.
>
> So, if this operation isn't somehow bound into needing Django itself,
> and this is important to you, maybe you should create a separate
> little WSGI application just for this purpose.
>
> Actually, even if bound into needing Django you may still be able to
> do it. Using mod_wsgi, you could even delegate the special WSGI
> application to run in the same process as Django and mount it at a URL
> which appears within the Django application. Because you are
> sidestepping Django dispatch, though, you couldn't have it protected by
> Django-based form authentication.
>
> Graham
Hi,
First, the FileWrapper class in django.core.servers.basehttp
accepts file-like objects, i.e., ones that have a read method. That
leads me to suggest that your solution may be to write your
own FileWrapper class that gets the file on the first iteration.
Here's a modified, untested version of FileWrapper:

class BigTarFileWrapper(object):
    """Wrapper to convert file-like objects to iterables"""

    def __init__(self, tar_args, blksize=8192):
        self.filelike = None
        self.tar_args = tar_args
        self.blksize = blksize

    def __getitem__(self, key):
        if not self.filelike:
            self.filelike = get_some_big_tarball(self.tar_args)
        data = self.filelike.read(self.blksize)
        if data:
            return data
        raise IndexError

    def __iter__(self):
        return self

    def next(self):
        if not self.filelike:
            self.filelike = get_some_big_tarball(self.tar_args)
        data = self.filelike.read(self.blksize)
        if data:
            return data
        raise StopIteration

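To see the deferral on its own, outside Django, here is a minimal
Python 3 sketch of the same idea. LazyFileIterator and fake_tarball are
hypothetical names made up for this illustration, and fake_tarball
stands in for get_some_big_tarball. It only shows that the file is not
created until the first chunk is requested; whether the headers reach
the client before that point depends on the hosting setup.

```python
import io


class LazyFileIterator:
    """Iterate over a file-like object that is only created on first use."""

    def __init__(self, open_file, blksize=8192):
        # open_file: zero-argument callable returning a binary file-like object
        self.open_file = open_file
        self.blksize = blksize
        self.filelike = None

    def __iter__(self):
        return self

    def __next__(self):
        if self.filelike is None:
            # Deferred until the first chunk is requested.
            self.filelike = self.open_file()
        data = self.filelike.read(self.blksize)
        if data:
            return data
        raise StopIteration

    def close(self):
        # WSGI servers call close() on the response iterable when it has one.
        if self.filelike is not None:
            self.filelike.close()


calls = []


def fake_tarball():
    # Hypothetical stand-in for get_some_big_tarball().
    calls.append("built")
    return io.BytesIO(b"x" * 20000)


wrapper = LazyFileIterator(fake_tarball)
built_before = list(calls)  # snapshot taken before any iteration
chunks = list(wrapper)
wrapper.close()
```

Passing a callable instead of tar_args keeps the sketch independent of
how the tarball is actually produced.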
Then your response becomes something like this:

def sample_view(request, args):
    tar_iterator = BigTarFileWrapper(args)
    response = HttpResponse(tar_iterator,
                            mimetype='application/x-compressed')
    response['Content-Disposition'] = "attachment;filename=mytarball.tar.gz"
    return response

This was inspired by the snippet that you may have seen [1], and my
experience in needing to return files from an external storage system
using my own iterator class.
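On the empty-gzip workaround quoted above: as far as I understand the
WSGI spec, servers hold the headers back until the first non-empty
string is yielded, which would explain why yield '' has no visible
effect. A quick way to convince yourself that an empty gzip member in
front of the real data is harmless is the Python 3 check below. The
helper name empty_gzip_member and the payload are invented for the
example, and it uses plain gzip data rather than a real tarball.

```python
import gzip
import io


def empty_gzip_member():
    """Return the bytes of a valid gzip stream that carries no payload."""
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb") as gz:
        gz.write(b"")
    return buf.getvalue()


payload = b"pretend this is tar data " * 1000  # made-up sample data
prefix = empty_gzip_member()

# Two gzip members back to back: the empty one, then the real one.
stream = prefix + gzip.compress(payload)

# gzip.decompress reads every member and joins the results.
restored = gzip.decompress(stream)
```

The prefix is non-empty on the wire, so it can serve as the first
chunk, yet it contributes nothing to the decompressed result.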
--Rick
[1] http://www.djangosnippets.org/snippets/365/