André, thanks for the response:

On Thu, Jul 4, 2013 at 4:06 AM, André Warnier <a...@ice-sa.com> wrote:

> Bill Moseley wrote:
>
>> First, if $r->read reads unchunked data then why is there a
>> Transfer-Encoding header saying that the content is chunked? Shouldn't
>> that header be removed?

Looking at the RFC again the answer appears to be yes. Look at the last
line in this decoding example in
http://tools.ietf.org/html/rfc2616#section-19.4.6

   A process for decoding the "chunked" transfer-coding (section 3.6
   <http://tools.ietf.org/html/rfc2616#section-3.6>) can be represented
   in pseudo-code as:

       length := 0
       read chunk-size, chunk-extension (if any) and CRLF
       while (chunk-size > 0) {
          read chunk-data and CRLF
          append chunk-data to entity-body
          length := length + chunk-size
          read chunk-size and CRLF
       }
       read entity-header
       while (entity-header not empty) {
          append entity-header to existing header fields
          read entity-header
       }
       Content-Length := length
       Remove "chunked" from Transfer-Encoding

Apache/mod_perl is doing the first part (decoding the body) but not the
last two steps that update the headers.

There's more on Content-Length and Transfer-Encoding here:

http://tools.ietf.org/html/rfc2616#section-4.4

>> How does one know if the content is chunked or not, otherwise?
>
> The real question is : does one need to know ?

Perhaps. That's an interesting question. Applications probably don't
need to care. They should receive the body -- so for mod_perl that means
reading data using $r->read until there's no more to read. By that
reasoning the app should never need to look at the Transfer-Encoding
header, or the Content-Length header for that matter.

It's a bit less clear if you think about Plack. It sits between web
servers and applications. What should, say, a Plack Middleware component
see in the body if the headers say Transfer-Encoding: chunked? The
decoding probably should happen in the server
<https://github.com/plack/Plack/issues/404#issuecomment-18124054>, but
the headers would then need to indicate that, by removing the
Transfer-Encoding header and adding the Content-Length.

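To make the RFC pseudo-code above concrete, here is a rough sketch of it, including the two header steps at the end. It's written in Python rather than Perl purely as a neutral illustration; it is not mod_perl or Plack code. The function name decode_chunked and the plain dict used for headers are made up for this example, and it assumes header names are spelled exactly as shown.

```python
import io


def decode_chunked(stream, headers):
    """Decode a chunked body following the RFC 2616 section 19.4.6 pseudo-code.

    `stream` is a binary file-like object positioned at the first chunk-size
    line. `headers` is a plain dict of header name -> value (a hypothetical
    shape for this sketch, not any real server's API).
    Returns (body_bytes, updated_headers).
    """
    body = bytearray()

    def read_chunk_size():
        line = stream.readline()
        if not line.endswith(b"\r\n"):
            raise ValueError("truncated chunk-size line")
        # Ignore any chunk-extension after ';' and parse the hex size.
        return int(line.split(b";", 1)[0].strip().decode("ascii"), 16)

    size = read_chunk_size()
    while size > 0:
        data = stream.read(size)
        # Each chunk-data is followed by CRLF.
        if len(data) != size or stream.read(2) != b"\r\n":
            raise ValueError("truncated or malformed chunk")
        body += data
        size = read_chunk_size()

    # Trailer: append entity-headers until an empty line.
    updated = dict(headers)
    while True:
        line = stream.readline()
        if line in (b"\r\n", b""):
            break
        name, _, value = line.decode("latin-1").partition(":")
        updated[name.strip()] = value.strip()

    # The last two steps of the pseudo-code: fix up the headers so they
    # describe the decoded body.
    updated["Content-Length"] = str(len(body))
    codings = [c.strip() for c in updated.get("Transfer-Encoding", "").split(",")]
    remaining = [c for c in codings if c and c.lower() != "chunked"]
    if remaining:
        updated["Transfer-Encoding"] = ", ".join(remaining)
    else:
        updated.pop("Transfer-Encoding", None)
    return bytes(body), updated


if __name__ == "__main__":
    raw = b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"
    body, headers = decode_chunked(io.BytesIO(raw), {"Transfer-Encoding": "chunked"})
    print(body, headers)
```

The point is only the final block: whoever decodes the body is also the one in a position to set Content-Length and drop "chunked", so that anything downstream sees headers that match what it will actually read.
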
>> Perhaps I'm approaching this incorrectly, but this is all a bit untidy.
>>
>> I'm using Catalyst and Catalyst needs a Content-Length.
>
> I would posit then that Catalyst is wrong (or not compatible with HTTP 1.1
> in that respect).

But Catalyst is a web application (framework), and by your point above it
should not care about the encoding and should just read the input stream
by calling ->read(). Really, if you think about Plack, Catalyst should
never make exceptions based on $ENV{MOD_PERL}.

So, the separation of concerns between the web server and the app is not
very clean.

>> So, I have a Plack
>> Middleware component that creates a temporary file writing the buffer from
>> $r->read( my $buffer, 64 * 1024 ) until that returns zero bytes. I pass
>> this file handle onto Catalyst.
>
> So what you wrote then is a patch to Catalyst.

No, the Middleware component should be usable for any application. And
likewise, for any web server. That's the point of Plack.

Obviously, there are differences between web servers, and maybe we need
code that understands, when running under mod_perl, that the
Transfer-Encoding: chunked header should be ignored. But if that code
must live in Catalyst then that's really breaking the separation that
Plack provides.

I think the sane thing here is for Apache/mod_perl not to provide a
header saying the body is chunked when it isn't. Otherwise, code (Plack,
web apps) that receives a set of headers and a handle to read from
doesn't really have any choice but to believe what it is told.

--
Bill Moseley
mose...@hank.org