I wrote:
> Applications do produce URI's (and IRI's, etc. that need to be
> converted into URI's) and do transfer them in media types like
> HTML, which define how to encode a.href's and form.action's
> before %-encoding them [4]. But these are not the only vectors
> by which clients obtain or generate Request-URI's.
> ...
> As someone (Alan Kennedy?) noted at PyCon, static resources may
> depend upon a filename encoding defined by the OS which is
> different than that of the rest of the URI's generated/understood
> by even the most coherent application.
> ...
> "In practical terms, character-by-character comparisons should be
> done codepoint-by-codepoint after conversion to a common character
> encoding." In other words, the URI spec seems to imply that the
> two URI's "/a%c3%bf" and "/a%ff" may be equivalent, if the former
> is u"/a\u00FF" encoded in UTF-8 and the latter is u"/a\u00FF"
> encoded in ISO-8859-1. Note that WSGI 1.0 cannot speak about
> this, since all environ values must be byte strings. IMO WSGI
> 2 should do better in this regard.
> ...
> For the three reasons above, I don't think we can assume that the
> application will always receive equivalent URI's encoded in a
> single, foreseen encoding.
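
To make the equivalence point concrete, here is a minimal sketch in Python 3 (an assumption on my part; the u"..." literals quoted above are Python 2 syntax). The helper name decode_path is hypothetical and only for illustration. It percent-decodes each path to bytes and then decodes those bytes with the encoding the sender is presumed to have used:

```python
from urllib.parse import unquote_to_bytes


def decode_path(path, encoding):
    """Percent-decode a URI path to bytes, then decode the bytes to text.

    `encoding` is whatever the caller believes the sender used; nothing
    in the URI itself says which encoding that was.
    """
    return unquote_to_bytes(path).decode(encoding)


# The two URIs from the quoted text, each read with its own encoding.
utf8_form = decode_path("/a%c3%bf", "utf-8")
latin1_form = decode_path("/a%ff", "iso-8859-1")

# Compared codepoint-by-codepoint, the two paths are the same text.
same_text = utf8_form == latin1_form

# Compared as raw bytes (all WSGI 1.0 gives us), they differ.
raw_differs = unquote_to_bytes("/a%c3%bf") != unquote_to_bytes("/a%ff")
```

The catch, of course, is that the server has to guess the encoding argument for each request, which is exactly the problem.
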
Did I say 3 reasons? I meant 4: Accept-Charset.

Robert Brewer
fuman...@aminus.org