Hi Jim,

This is the Transfer-Encoding: chunked mechanism I was writing about:

http://tools.ietf.org/html/rfc2616#section-3.6.1
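The short version: the body arrives as a series of size-prefixed chunks,
and a zero-size chunk marks the end. A made-up request for illustration
(the chunk sizes are hex byte counts):

    POST /upload HTTP/1.1
    Host: example.com
    Transfer-Encoding: chunked
    Content-Type: application/json

    b
    {"a": 1, "b
    a
    ": [2, 3]}
    0

The receiver strips the size lines and reassembles the 21-byte body
{"a": 1, "b": [2, 3]}. The total length is only known once the final
zero-size chunk arrives, which is why there's no Content-Length header.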
On Wed, Jul 3, 2013 at 11:34 AM, Jim Schueler <jschue...@eloquency.com> wrote:

> I played around with chunking recently in the context of media streaming:
> the client is only requesting a "chunk" of data. "Chunking" is how media
> players perform a "seek". It was originally implemented for FTP transfers,
> e.g., to transfer a large file in (say 10K) chunks. In the case that you
> describe below, if no Content-Length is specified, that indicates "send
> the remainder".
>
> From what I know, a "chunk" request header is used this way to specify the
> server response. It does not reflect anything about the data included in
> the body of the request. So first, I would ask if you're confused about
> this request information.
>
> Hypothetically, some browsers might try to upload large files in small
> chunks, and the "chunk" header might reflect a push transfer. I don't know
> if "chunk" is ever used for this purpose. But it would require the
> following characteristics:
>
>  1. The browser would need to inquire up front whether the server is
>     capable of this type of request.
>  2. Each chunk of data will arrive in a separate and independent HTTP
>     request, not necessarily in the order they were sent.
>  3. Two or more requests may be handled simultaneously by separate
>     processes that can't write into a single destination.
>  4. Somehow the server needs to request a resend if a chunk is missing.
>     Solving this problem requires an imaginative use of HTTP.
>
> Sounds messy, but it might be appropriate for 100M+ sized uploads. This
> *may* reflect your situation. Can you please confirm?
>
> For a single process, the incoming content length is unnecessary. Buffered
> I/O automatically knows when transmission is complete. The read() argument
> is the buffer size, not the content length. Whether you spool the buffer
> to disk or simply enlarge the buffer should be determined by your hardware
> capabilities. This is standard I/O behavior that has nothing to do with
> HTTP chunking. Without a "Content-Length" header, after looping your
> read() operation, determine the length of the aggregate data and pass
> that to Catalyst.
>
> But if you're confident that the complete request spans several smaller
> (chunked) HTTP requests, you'll need to address all the problems I've
> described above, plus the problem of re-assembling the whole thing for
> Catalyst. I don't know anything about Plack; maybe it can perform all
> this required magic.
>
> Otherwise, if the whole purpose of the Plack temporary file is to pass a
> file handle, you can pass a buffer as a file handle. It used to be
> IO::String, but now that functionality is built into the core.
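Right -- since Perl 5.8 a plain open() can take a scalar reference in
place of a filename, so an in-memory buffer can be handed to anything
that expects a file handle:

    # core Perl (5.8+): read a scalar as if it were a file
    my $body = '{"hello": "world"}';
    open my $fh, '<', \$body or die "open failed: $!";
    my $first_line = <$fh>;    # reads from $body

That avoids the temp file when the data already fits in memory.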
> By your last paragraph, I'm really lost. Since you're already passing the
> request as a file handle, I'm guessing that Catalyst creates the temporary
> file for the *response* body. Can you please clarify? Also, what do you
> mean by "de-chunking"? Is that the same thing as re-assembling?
>
> Wish I could give a better answer. Let me know if this helps.
>
>  -Jim
>
> On Tue, 2 Jul 2013, Bill Moseley wrote:
>
>> For requests that are chunked (Transfer-Encoding: chunked and no
>> Content-Length header), calling $r->read returns unchunked data from
>> the socket. That's indeed handy. Is it mod_perl doing that un-chunking,
>> or is it Apache?
>>
>> But it leads to some questions.
>>
>> First, if $r->read returns unchunked data, then why is there a
>> Transfer-Encoding header saying that the content is chunked? Shouldn't
>> that header be removed? Otherwise, how does one know whether the
>> content is chunked or not?
>>
>> Second, if there's no Content-Length header, then how does one know how
>> much data to read using $r->read?
>>
>> One answer is: until $r->read returns zero bytes, of course. But is
>> that guaranteed to always be the case, even for, say, pipelined
>> requests?
>>
>> My guess is yes, because whatever is de-chunking the request knows to
>> stop after reading the last chunk, trailer, and empty line. Can anyone
>> elaborate on how Apache/mod_perl is doing this?
>>
>> Perhaps I'm approaching this incorrectly, but this is all a bit untidy.
>>
>> I'm using Catalyst, and Catalyst needs a Content-Length. So I have a
>> Plack Middleware component that creates a temporary file, writing the
>> buffer from $r->read( my $buffer, 64 * 1024 ) until that returns zero
>> bytes. I pass this file handle on to Catalyst.
>>
>> Then, for some content types, Catalyst (via HTTP::Body) writes the body
>> to another temp file. I don't know how Apache/mod_perl does its
>> de-chunking, but I can call $r->read with a huge buffer length and
>> Apache returns that much. So maybe Apache is buffering to disk, too.
>>
>> In other words, for each tiny chunked JSON POST or PUT I'm creating two
>> (or three?) temp files, which doesn't seem ideal.
>>
>> --
>> Bill Moseley
>> mose...@hank.org

-- 
Bill Moseley
mose...@hank.org
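P.S. For concreteness, the middleware I described is roughly this shape --
a simplified, untested sketch (the package name is made up, error handling
is omitted, and I'm assuming the handler maps psgi.input reads onto
$r->read):

    package Plack::Middleware::BufferChunked;    # hypothetical name
    use strict;
    use warnings;
    use parent 'Plack::Middleware';
    use File::Temp ();

    sub call {
        my ( $self, $env ) = @_;

        # Only spool when the length is unknown, i.e. a chunked request.
        if ( !defined $env->{CONTENT_LENGTH}
            && ( $env->{HTTP_TRANSFER_ENCODING} || '' ) =~ /chunked/i )
        {
            my $input  = $env->{'psgi.input'};    # already de-chunked
            my $fh     = File::Temp->new;
            my $length = 0;

            # Read until zero bytes -- the de-chunking layer stops at
            # the terminating zero-size chunk.
            while ( ( my $read = $input->read( my $buffer, 64 * 1024 ) ) > 0 )
            {
                print {$fh} $buffer;
                $length += $read;
            }
            seek $fh, 0, 0;

            # Hand Catalyst a spooled body with a known length.
            $env->{'psgi.input'}   = $fh;
            $env->{CONTENT_LENGTH} = $length;
            delete $env->{HTTP_TRANSFER_ENCODING};
        }

        $self->app->($env);
    }

    1;

Whether deleting the Transfer-Encoding header there is the right thing to
do is part of what I'm asking.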