Hi Jim,

This is the Transfer-Encoding: chunked mechanism I was writing about:

http://tools.ietf.org/html/rfc2616#section-3.6.1
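The short version: the body arrives as a series of size-prefixed chunks,
and a zero-size chunk marks the end. A made-up request for illustration
(the chunk sizes are hex byte counts):

    POST /upload HTTP/1.1
    Host: example.com
    Transfer-Encoding: chunked
    Content-Type: application/json

    b
    {"a": 1, "b
    a
    ": [2, 3]}
    0

The receiver strips the size lines and reassembles the 21-byte body
{"a": 1, "b": [2, 3]}. The total length is only known once the final
zero-size chunk arrives, which is why there's no Content-Length header.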
On Wed, Jul 3, 2013 at 11:34 AM, Jim Schueler <jschue...@eloquency.com> wrote:

> I played around with chunking recently in the context of media streaming:
> the client is only requesting a "chunk" of data. "Chunking" is how media
> players perform a "seek". It was originally implemented for FTP transfers,
> e.g., to transfer a large file in (say 10K) chunks. In the case that you
> describe below, if no Content-Length is specified, that indicates "send
> the remainder".
>
> From what I know, a "chunk" request header is used this way to specify the
> server response. It does not reflect anything about the data included in
> the body of the request. So first, I would ask if you're confused about
> this request information.
>
> Hypothetically, some browsers might try to upload large files in small
> chunks, and the "chunk" header might reflect a push transfer. I don't know
> if "chunk" is ever used for this purpose. But it would require the
> following characteristics:
>
>  1. The browser would need to inquire up front whether the server is
>     capable of this type of request.
>  2. Each chunk of data will arrive in a separate and independent HTTP
>     request, not necessarily in the order they were sent.
>  3. Two or more requests may be handled simultaneously by separate
>     processes that can't write into a single destination.
>  4. Somehow the server needs to request a resend if a chunk is missing.
>     Solving this problem requires an imaginative use of HTTP.
>
> Sounds messy, but it might be appropriate for 100M+ sized uploads. This
> *may* reflect your situation. Can you please confirm?
>
> For a single process, the incoming content length is unnecessary. Buffered
> I/O automatically knows when transmission is complete. The read() argument
> is the buffer size, not the content length. Whether you spool the buffer
> to disk or simply enlarge the buffer should be determined by your hardware
> capabilities. This is standard I/O behavior that has nothing to do with
> HTTP chunking. Without a "Content-Length" header, after looping your
> read() operation, determine the length of the aggregate data and pass
> that to Catalyst.
>
> But if you're confident that the complete request spans several smaller
> (chunked) HTTP requests, you'll need to address all the problems I've
> described above, plus the problem of re-assembling the whole thing for
> Catalyst. I don't know anything about Plack; maybe it can perform all
> this required magic.
>
> Otherwise, if the whole purpose of the Plack temporary file is to pass a
> file handle, you can pass a buffer as a file handle. It used to be
> IO::String, but now that functionality is built into the core.
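Right -- since Perl 5.8 a plain open() can take a scalar reference in
place of a filename, so an in-memory buffer can be handed to anything
that expects a file handle:

    # core Perl (5.8+): read a scalar as if it were a file
    my $body = '{"hello": "world"}';
    open my $fh, '<', \$body or die "open failed: $!";
    my $first_line = <$fh>;    # reads from $body

That avoids the temp file when the data already fits in memory.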
> By your last paragraph, I'm really lost. Since you're already passing the
> request as a file handle, I'm guessing that Catalyst creates the temporary
> file for the *response* body. Can you please clarify? Also, what do you
> mean by "de-chunking"? Is that the same thing as re-assembling?
>
> Wish I could give a better answer. Let me know if this helps.
>
>  -Jim
>
> On Tue, 2 Jul 2013, Bill Moseley wrote:
>
>> For requests that are chunked (Transfer-Encoding: chunked and no
>> Content-Length header), calling $r->read returns unchunked data from
>> the socket. That's indeed handy. Is it mod_perl doing that un-chunking,
>> or is it Apache?
>>
>> But it leads to some questions.
>>
>> First, if $r->read returns unchunked data, then why is there a
>> Transfer-Encoding header saying that the content is chunked? Shouldn't
>> that header be removed? Otherwise, how does one know whether the
>> content is chunked or not?
>>
>> Second, if there's no Content-Length header, then how does one know how
>> much data to read using $r->read?
>>
>> One answer is: until $r->read returns zero bytes, of course. But is
>> that guaranteed to always be the case, even for, say, pipelined
>> requests?
>>
>> My guess is yes, because whatever is de-chunking the request knows to
>> stop after reading the last chunk, trailer, and empty line. Can anyone
>> elaborate on how Apache/mod_perl is doing this?
>>
>> Perhaps I'm approaching this incorrectly, but this is all a bit untidy.
>>
>> I'm using Catalyst, and Catalyst needs a Content-Length. So I have a
>> Plack Middleware component that creates a temporary file, writing the
>> buffer from $r->read( my $buffer, 64 * 1024 ) until that returns zero
>> bytes. I pass this file handle on to Catalyst.
>>
>> Then, for some content types, Catalyst (via HTTP::Body) writes the body
>> to another temp file. I don't know how Apache/mod_perl does its
>> de-chunking, but I can call $r->read with a huge buffer length and
>> Apache returns that much. So maybe Apache is buffering to disk, too.
>>
>> In other words, for each tiny chunked JSON POST or PUT I'm creating two
>> (or three?) temp files, which doesn't seem ideal.
>>
>> --
>> Bill Moseley
>> mose...@hank.org

-- 
Bill Moseley
mose...@hank.org
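P.S. For concreteness, the middleware I described is roughly this shape --
a simplified, untested sketch (the package name is made up, error handling
is omitted, and I'm assuming the handler maps psgi.input reads onto
$r->read):

    package Plack::Middleware::BufferChunked;    # hypothetical name
    use strict;
    use warnings;
    use parent 'Plack::Middleware';
    use File::Temp ();

    sub call {
        my ( $self, $env ) = @_;

        # Only spool when the length is unknown, i.e. a chunked request.
        if ( !defined $env->{CONTENT_LENGTH}
            && ( $env->{HTTP_TRANSFER_ENCODING} || '' ) =~ /chunked/i )
        {
            my $input  = $env->{'psgi.input'};    # already de-chunked
            my $fh     = File::Temp->new;
            my $length = 0;

            # Read until zero bytes -- the de-chunking layer stops at
            # the terminating zero-size chunk.
            while ( ( my $read = $input->read( my $buffer, 64 * 1024 ) ) > 0 )
            {
                print {$fh} $buffer;
                $length += $read;
            }
            seek $fh, 0, 0;

            # Hand Catalyst a spooled body with a known length.
            $env->{'psgi.input'}   = $fh;
            $env->{CONTENT_LENGTH} = $length;
            delete $env->{HTTP_TRANSFER_ENCODING};
        }

        $self->app->($env);
    }

    1;

Whether deleting the Transfer-Encoding header there is the right thing to
do is part of what I'm asking.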