One thing I did want to ask - is it still worth squashing everything down to the same case? Daphne already clears out headers with _ in them to avoid that CVE about it, and header case is never semantic, or so I thought?
Andrew

On Fri, Mar 11, 2016 at 9:56 AM, Andrew Godwin <and...@aeracode.org> wrote:

>
> On Fri, Mar 11, 2016 at 2:28 AM, Cory Benfield <c...@lukasa.co.uk> wrote:
>
>>
>> On 10 Mar 2016, at 23:56, Andrew Godwin <and...@aeracode.org> wrote:
>>
>> I would indeed want to require servers to always fold headers together
>> into a comma-separated list, as that's what the RFC says, and it then means
>> applications only have to deal with one kind of multi-header!
>>
>>
>> Wellllll….kinda?
>>
>> The RFC says that multiple headers are *semantically equivalent* to the
>> joined form, but does not in any sense require that it be done. (The
>> normative language in RFC 7230 is MAY.)
>>
>> I had this discussion recently with Brian Smith: while there is only one
>> correct way to fold/unfold headers, anywhere on the spectrum between
>> completely folded and completely unfolded is a perfectly valid
>> representation of the HTTP header block. This means that there’s no *rules*
>> about how a server is supposed to do it, at least from the IETF. ASGI is of
>> course totally allowed to add its own rules, and requiring that they be
>> folded is not terrible.
>>
>> FWIW, in my experience, I’ve found that “list of tuples” is really the
>> most likely to be correct way to represent a header block, because it
>> provides some assurances to the user that the header block has not been
>> aggressively transformed from how it was sent on the wire. While the
>> *rules* are that the folded representation is supposed to be semantically
>> equivalent to the unfolded representation, there is nonetheless some
>> information implicit in those headers being separate.
>>
>> My intuition when writing this kind of thing is to pass applications
>> (like Django) the most meaningful representation I can, and then allow the
>> application to make its own decisions about what meaning they’re willing to
>> lose. That’s why I’d advocate for “list of two-tuples of bytestrings” as
>> the representation. However, I don’t think there’s anything *wrong* with
>> forcing the headers to be joined by the server where possible: it’s just
>> not how I’d do it. ;)
>>
>> Set-cookie is the annoying thing here, though. That's why it's dict
>> inbound and list of tuples outbound right now, and I just don't know if I
>> want to make the inbound one a list of tuples too, given I do definitely
>> want to force servers to concat headers together (unless I find any
>> examples of that screwing things up)
>>
>>
>> You could make the inbound one a list of tuples but still require that
>> the servers concat headers. The rule then would be that it needs to be
>> possible for an application to say `dict(headers)` without any loss of
>> meaning.
>>
>
> Yes, I think this is a good argument - my worry has always been that the
> "no multiples" is more of a soft rule that some clients might break or some
> apps might rely on the ordering/multiplicity of things, so preserving it is
> _probably_ helpful (and as you say, it lets the header names go back to
> bytestrings).
>
> I'll modify the spec and then update Daphne and Channels to match; I can
> leave Channels parsing both types for a bit, at least.
>
> Collin's point about http2's handling of headers is on point, too - if the
> new spec is deliberately thinned down to that point but no further, it's
> probably wise to follow them since they know much more about it than I do.
>
> Andrew
>
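
For reference, the rule proposed in the quoted thread (inbound headers as a list of two-tuples of bytestrings, with repeated headers joined so that `dict(headers)` loses nothing) might look roughly like the sketch below. This is only an illustration: `fold_headers` is a made-up name, not anything in Daphne or Channels, and lowercasing the names so that repeats can be matched is an assumption rather than something the thread has settled.

```python
def fold_headers(raw_headers):
    """Fold repeated headers into one comma-separated value per name.

    raw_headers: iterable of (name, value) bytestring pairs in wire order.
    Returns a list of (name, value) pairs in which every name is unique.
    Hypothetical helper for illustration only.
    """
    folded = {}  # plain dicts preserve insertion order on Python 3.7+
    for name, value in raw_headers:
        # Assumption: names are matched case-insensitively and emitted lowercased.
        key = name.lower()
        if key in folded:
            # Later occurrences are appended in the order they arrived.
            folded[key] = folded[key] + b"," + value
        else:
            folded[key] = value
    return list(folded.items())


wire = [
    (b"Accept", b"text/html"),
    (b"Host", b"example.com"),
    (b"accept", b"application/json"),
]
headers = fold_headers(wire)

# Every name now appears once, so converting to a dict drops nothing.
assert len(dict(headers)) == len(headers)
```

The sketch only covers the inbound direction; outbound Set-Cookie cannot be joined this way, which is why the thread keeps a list of tuples there.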
_______________________________________________
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: https://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com