Re: [Web-SIG] WSGI 2: Decoding the Request-URI

Robert Brewer Mon, 17 Aug 2009 07:41:13 -0700

I wrote:
> Applications do produce URIs (and IRIs, etc. that need to be
> converted into URIs) and do transfer them in media types like
> HTML, which define how to encode a.href and form.action values
> before %-encoding them [4]. But these are not the only vectors
> by which clients obtain or generate Request-URIs.
> ...
> As someone (Alan Kennedy?) noted at PyCon, static resources may
> depend upon a filename encoding defined by the OS which is
> different from that of the rest of the URIs generated/understood
> by even the most coherent application.
> ...
> "In practical terms, character-by-character comparisons should be
> done codepoint-by-codepoint after conversion to a common character
> encoding." In other words, the URI spec seems to imply that the
> two URIs "/a%c3%bf" and "/a%ff" may be equivalent, if the former
> is u"/a\u00FF" encoded in UTF-8 and the latter is u"/a\u00FF"
> encoded in ISO-8859-1. Note that WSGI 1.0 cannot speak about
> this, since all environ values must be byte strings. IMO WSGI
> 2 should do better in this regard.
> ...
> For the three reasons above, I don't think we can assume that the
> application will always receive equivalent URIs encoded in a
> single, foreseen encoding.


Did I say 3 reasons? I meant 4: Accept-Charset.
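
To make the "/a%c3%bf" versus "/a%ff" case from the quoted text concrete, here is a minimal sketch. It assumes Python 3 and the standard library's urllib.parse.unquote_to_bytes; the helper name decode_path is hypothetical and is not part of WSGI or any proposal in this thread.

```python
from urllib.parse import unquote_to_bytes


def decode_path(raw_path, encoding):
    """Undo the %-encoding, then decode the bytes with the given charset.

    Hypothetical helper for illustration only.
    """
    return unquote_to_bytes(raw_path).decode(encoding)


# The same text, u"/a\u00FF", reaches the server as two different
# Request-URIs depending on which encoding the client used.
from_utf8 = decode_path("/a%c3%bf", "utf-8")
from_latin1 = decode_path("/a%ff", "iso-8859-1")

# Compared codepoint-by-codepoint after conversion, they are equal.
assert from_utf8 == from_latin1 == "/a\u00ff"

# Guessing the wrong encoding either fails outright...
try:
    decode_path("/a%ff", "utf-8")
    wrong_guess_failed = False
except UnicodeDecodeError:
    wrong_guess_failed = True

# ...or silently yields different text (two characters instead of one).
mojibake = decode_path("/a%c3%bf", "iso-8859-1")
assert mojibake == "/a\u00c3\u00bf"
```

The comparison only succeeds because each URI is decoded with the encoding it was actually produced in, and that is exactly the information the server cannot assume it has.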


Robert Brewer
[email protected]

