Re: [modwsgi] "Streaming" content

Deron Meranda Fri, 26 Mar 2010 10:28:57 -0700

On Fri, Mar 26, 2010 at 12:18 PM, Joonas Lehtolahti <[email protected]> wrote:
> Hey, I just tried an experiment about sending content from WSGI
> application that needs to wait for some time.
>
> Basically for starting the response the status is "200 OK" and headers
> [("Content-Type", "text/plain; encoding=utf-8")]
>
> For the actual data I returned the calling of this function that
> yields 7 kB of data every 10 seconds:


...

> Now what I would have wanted the behavior be is that the first 7 kB
> block of "At 0 s" would be sent immediately to the client, then after
> 10 seconds the second block would be sent, and so on. What actually
> happens is any web browser waits for 100 seconds before there is *any
> response* from the server, then gets the whole document.
>
> I tried setting Content-Length explicitly in the headers, but that did
> not have any effect as the actual output to client was encoded with
> gzip and the real Content-Length header represented the compressed
> size.

What is performing the compression?  mod_gzip?  or is it a WSGI
application (middleware)?

You may also consider using mod_deflate rather than mod_gzip.
Deflate maps better as a transfer-encoding (gzip is really designed
for "files", not "streams"); and also you have more control over
things like buffer and window sizes.


> Since PEP-0333 states that "WSGI servers, gateways, and
> middleware must not delay the transmission of any block; they must
> either fully transmit the block to the client, or guarantee that they
> will continue transmission even while the application is producing its
> next block." I trust that mod_wsgi itself conforms to this promise, so
> my suspect turns to the gzip procedure causing the output to be
> buffered instead of being sent to the client immediately when
> available.

Try removing the compression. Are there still any delays?
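
For that test, something along these lines could serve as a stand-in application. This is only a sketch and not the code from the original message (which is elided above): the names slow_blocks, BLOCK_SIZE, BLOCKS and DELAY are made up here, and the block count of 10 is an assumption inferred from the 100-second total.

```python
import time

BLOCK_SIZE = 7 * 1024  # roughly the 7 kB blocks described above
BLOCKS = 10            # assumed: 10 blocks at 10 s intervals
DELAY = 10             # seconds to wait between blocks


def slow_blocks(blocks=BLOCKS, delay=DELAY, sleep=time.sleep):
    """Yield one padded block, then wait, `blocks` times in total."""
    for i in range(blocks):
        line = ("At %d s\n" % (i * delay)).encode("utf-8")
        # Repeat the line and trim so each block is exactly BLOCK_SIZE bytes.
        yield (line * (BLOCK_SIZE // len(line) + 1))[:BLOCK_SIZE]
        if i < blocks - 1:
            sleep(delay)


def application(environ, start_response):
    # Note: the media-type parameter is spelled "charset", not "encoding".
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    # Returning a generator lets the server send each block as it is yielded.
    return slow_blocks()
```

If the blocks arrive at the client 10 seconds apart with compression switched off, the WSGI side is behaving and the buffering is happening further down the output chain.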

Also note that gzip is really just a small wrapper around
deflate (RFC 1951) ... and that algorithm itself divides input
into variable-sized blocks which may or may not be related
to any I/O blocks (here your "blocks" are implicitly delimited by
timing gaps, not by any explicit demarcation).

Also deflate is a bit stream, not a byte stream.  So its own block
boundaries may occur in the middle of a byte, which also could
cause a delay in the output of the last partial-byte of any block.
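
The buffering is easy to see with Python's zlib module. The sketch below uses made-up test data, and zlib.compressobj() emits the zlib wrapper rather than the gzip one, but the deflate stream inside behaves the same way. It feeds one block to a compressor and checks what a receiver could decode before and after an explicit sync flush.

```python
import zlib

block = b"At 0 s\n" * 1000  # about 7 kB of repetitive text (made-up data)

compressor = zlib.compressobj()
decompressor = zlib.decompressobj()

# Without a flush, deflate holds the input in its internal buffers; for a
# small block like this, typically little more than the stream header is
# emitted at this point.
pending = compressor.compress(block)
seen_before_flush = decompressor.decompress(pending)

# Z_SYNC_FLUSH ends the current deflate block and pads the output to a
# byte boundary, so everything fed in so far becomes decodable.
flushed = compressor.flush(zlib.Z_SYNC_FLUSH)
seen_after_flush = seen_before_flush + decompressor.decompress(flushed)

print(len(pending), len(seen_before_flush), len(seen_after_flush))
```

A compression filter that never issues such a flush between your blocks will hold data back until its own buffers fill or the response ends, no matter how promptly the application yields.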

Basically, WSGI can only make promises of behaviour through
the WSGI chain.  Once you're into Apache core things can
get re-blocked.

And also, whether or not WSGI makes allowances, any sort
of transfer encoding (compression, encryption, re-encoding, etc.)
will almost certainly re-block the data stream and potentially
introduce timing delays if it is a block stream rather than byte
stream.


> Now naturally the question is how I could force mod_wsgi/Apache to
> send any output to the client immediately after available from the
> WSGI application?

First, simply try removing the compression filter and see if that
works.  That is almost certainly your main issue.

However, blocked-flushing of dynamic output like this is really what
chunked transfer encoding and the HTTP 100 Continue responses are
for ... as a means of streaming and explicit blocking.

I'm not sure how well mod_wsgi supports that though.
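
For reference, this is roughly what chunked framing looks like on the wire, as defined by HTTP/1.1. The helper names encode_chunk and encode_stream are made up for illustration. A WSGI application should not apply this framing itself (PEP 333 disallows hop-by-hop headers such as Transfer-Encoding from the application); it is the server's job, and the sketch only shows how each block gets an explicit boundary.

```python
LAST_CHUNK = b"0\r\n\r\n"  # a zero-length chunk terminates the body


def encode_chunk(data):
    """Frame one block as a chunk: hex length, CRLF, data, CRLF."""
    return ("%x\r\n" % len(data)).encode("ascii") + data + b"\r\n"


def encode_stream(blocks):
    """Yield each non-empty block as a chunk, then the terminating chunk."""
    for block in blocks:
        # Skip empty blocks: a zero-length chunk would end the response.
        if block:
            yield encode_chunk(block)
    yield LAST_CHUNK
```

Because every chunk carries its own length, the client can process a block as soon as it arrives, without a Content-Length for the whole response.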


If you're using mod_gzip, have you checked the mod_gzip_dechunk
configuration option?  You probably want that set to "no".
-- 
Deron Meranda
