Earlier today I posted an article on my blog following up on some discussions of WSGI; one criticism presented was of language in PEP 333 regarding gzipping of responses by WSGI applications. Ian posted a comment which stated that the criticism was not correct, but I'm at a loss to figure out what *is* correct, so I'll bring up the question here.

In a parenthetical at the end of the section entitled "Handling the Content-Length Header", PEP 333 states:

> Note: applications and middleware must not apply any kind of
> Transfer-Encoding to their output, such as chunking or gzipping; as
> "hop-by-hop" operations, these encodings are the province of the
> actual web server/gateway. See Other HTTP Features below, for more
> details.

In the section "Other HTTP Features", PEP 333 states, in part:

> However, because WSGI servers and applications do not communicate
> via HTTP, what RFC 2616 calls "hop-by-hop" headers do not apply to
> WSGI internal communications. WSGI applications must not generate
> any "hop-by-hop" headers [4], attempt to use HTTP features that
> would require them to generate such headers, or rely on the content
> of any incoming "hop-by-hop" headers in the environ dictionary.

My criticism of this is that this is at best ambiguous, and quite possibly openly misleading to readers of the PEP.

The ambiguity here is that "gzip" is a valid value for the Transfer-Encoding header in HTTP (RFC 2616, Sections 3.6 and 14.41), but is also a valid value for the Content-Encoding header (RFC 2616, Sections 3.5 and 14.11).

Web frameworks and libraries (in many languages, not just Python) which support gzipping of responses all seem to opt for the latter method. Additionally, Apache's mod_deflate -- which so far as I know is overwhelmingly the most common mechanism for enabling gzipping at the server level -- also opts for this method, and uses the Content-Encoding header.

Given this, gzipping of responses seems to be rather universally associated, in the minds of web developers, with the Content-Encoding header, which is not a "hop-by-hop" header (RFC 2616, Section 13.5.1).

As such, the immediate (and misleading) impression given to readers of PEP 333 will likely be one of:

1. PEP 333 forbids applications using Content-Encoding to signal gzipped response bodies (since it mentions gzipping as something applications specifically must not do), or

2. PEP 333 is ambiguous or contradictory on account of mentioning Transfer-Encoding and "hop-by-hop" headers in a context in which no-one uses Transfer-Encoding or a "hop-by-hop" header, or

3. This text in PEP 333 is based upon a misunderstanding of this feature of HTTP or of its use in the real world.

None of these seem particularly good, and this is why I took that section of the spec to task (albeit in a much briefer and more cursory fashion, since this message is already starting to run a bit long).

If I'm misreading or misunderstanding either PEP 333 or RFC 2616, I'd appreciate it if someone would explain where I've gone astray. But as it stands, I believe the text of PEP 333 quoted above is problematic and likely to lead to confusion, and (if I'm not misreading or misunderstanding it) should probably be revised to address these concerns.
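
To make the distinction concrete, here is a minimal sketch of the kind of application I have in mind: one that gzips its own body and signals that with Content-Encoding rather than Transfer-Encoding. The names `gzipped_app`, `call_app`, and `PAYLOAD` are made up for illustration and do not come from PEP 333; the only library pieces used are the standard `gzip` module and `wsgiref.util`. Whether PEP 333 actually permits this is exactly the question above, so treat it as an illustration of the reading, not a statement of what the spec allows.

```python
import gzip
from wsgiref.util import is_hop_by_hop, setup_testing_defaults

# Hypothetical response body, used only for this illustration.
PAYLOAD = b"Hello, world!\n" * 100


def gzipped_app(environ, start_response):
    """Hypothetical WSGI app that compresses its own body end-to-end."""
    accepts_gzip = "gzip" in environ.get("HTTP_ACCEPT_ENCODING", "")
    body = gzip.compress(PAYLOAD) if accepts_gzip else PAYLOAD
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    if accepts_gzip:
        # Content-Encoding describes the entity itself, so it is an
        # end-to-end header, not a hop-by-hop one (RFC 2616, 13.5.1).
        headers.append(("Content-Encoding", "gzip"))
        headers.append(("Vary", "Accept-Encoding"))
    start_response("200 OK", headers)
    return [body]


def call_app(app, accept_encoding="gzip"):
    """Invoke a WSGI app in-process and return (status, headers, body)."""
    environ = {}
    setup_testing_defaults(environ)
    environ["HTTP_ACCEPT_ENCODING"] = accept_encoding
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body


if __name__ == "__main__":
    status, headers, body = call_app(gzipped_app)
    print(status, sorted(headers))
    # Compare how wsgiref classifies the two headers in question.
    for name in ("Content-Encoding", "Transfer-Encoding"):
        print(name, "hop-by-hop?", is_hop_by_hop(name))
```

None of the headers this application emits is classified as hop-by-hop by `wsgiref.util.is_hop_by_hop`, whereas Transfer-Encoding is, which is the gap between the PEP's wording and common practice.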