Re: [Web-SIG] Request for Comments on upcoming WSGI Changes

And Clover Tue, 22 Sep 2009 09:07:26 -0700

Graham wrote:

> Armin is fast asleep now, so my shift.


Heh. It's a multiple-man job keeping up with this monster thread!

> The URLs don't break.

Not in themselves. Just the language of the PEP implies that to fix them up would contravene the spec:


>> The application MUST use [the encoding guess for PATH_INFO] to decode
>> the ``'QUERY_STRING'`` as well.

This isn't appropriate even as a SHOULD: the guessed encoding for PATH_INFO is very likely to be wrong, in particular for cases where the path was purely ASCII.

The application (or a library/framework acting on its behalf) should be allowed to decode QUERY_STRING using whatever encoding it is expecting. Disallowing the use of anything other than utf-8 (and iso-8859-1 in a very unreliable way) makes it impossible to have queries in any other encoding at all and still comply with the spec, which is undesirable.
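
As a rough illustration of the point above (a minimal sketch, not anything the PEP prescribes): the query bytes below are assumed to come from a form on a page served as Shift-JIS, and `decode_query` is a hypothetical helper name. It uses the Python 3 standard library's `urllib.parse.parse_qs`, which accepts an `encoding` argument.

```python
from urllib.parse import parse_qs, quote

# Assumed example input: a query value percent-encoded from Shift-JIS bytes.
raw_query = "q=" + quote("日本語".encode("shift_jis"))


def decode_query(query_string, encoding):
    """Parse a query string using the encoding the application expects.

    errors="strict" makes a wrong encoding fail loudly where possible
    instead of silently producing replacement characters.
    """
    return parse_qs(query_string, encoding=encoding, errors="strict")


# The application knows its own forms are Shift-JIS, so it says so.
params = decode_query(raw_query, "shift_jis")
print(params)
```

Decoding the same string as utf-8 raises `UnicodeDecodeError` here, which is the kind of request a utf-8-only rule would make impossible to serve.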

If this sentence is removed, and `wsgi.uri_encoding` is guaranteed to be one of:


  a. definitive and reliable, or
  b. missing/None

I'm pretty much happy. What I don't want is that half the future-WSGI servers/gateways decide they have to provide *some* value for `wsgi.uri_encoding` even if they're not quite sure if it's the right one. Then we're back to square one.

> if it is known that an application or some subset of
> URLs will always be receiving a request as non UTF-8, then it should
> employ code in those cases to always transcode it to the required
> encoding.

Yep, agreed. I think the PEP should clarify that; at the moment it is saying that a transcode is something you should only do for the iso-8859-1 case, but if you actually followed that advice you'd get highly inconsistent results. Perhaps we're at cross-purposes as to what exactly constitutes 'middleware'...
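
A minimal sketch of such a transcode, under some stated assumptions: `wsgi.uri_encoding` is the key from the draft under discussion and may be missing or None, the fallback of iso-8859-1 when it is absent is an assumption made for this example, and `path_info_as` is a hypothetical helper name.

```python
def path_info_as(environ, expected):
    """Return PATH_INFO re-decoded with the encoding the application expects.

    The server's decode is undone with whatever encoding it reports
    (assumed iso-8859-1 when the key is missing or None), then the
    original bytes are decoded again with `expected`.
    """
    server_encoding = environ.get("wsgi.uri_encoding") or "iso-8859-1"
    if server_encoding.lower() == expected.lower():
        return environ["PATH_INFO"]
    raw = environ["PATH_INFO"].encode(server_encoding)
    return raw.decode(expected)


# Simulate a server that decoded UTF-8 request bytes as iso-8859-1.
environ = {
    "PATH_INFO": "/café".encode("utf-8").decode("iso-8859-1"),
    "wsgi.uri_encoding": "iso-8859-1",
}
print(path_info_as(environ, "utf-8"))
```

The round trip only recovers the original bytes if the reported encoding is the one the server really used, which is why a guessed value in `wsgi.uri_encoding` would be worse than none.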

> The other fallback is that a specific WSGI server could elect to
> provide an option to not use 'UTF-8' as the first choice for decoding

I really, *really* hope this does not happen. That just brings us more deployment heartaches.

> Whether surrogateescape gives a better solution I have no idea at this
> point

Yeah... I'm highly suspicious of surrogateescape in a web context and personally my code will be deliberately filtering all such characters out. I can see it being a possible way to smuggle unwanted sequences (such as overlongs) through filters, potentially causing endless security problems. But we'll see...
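
To make the concern concrete, here is a minimal sketch of that kind of filter in Python 3. The helper names are hypothetical. It relies on the fact that the `surrogateescape` error handler maps each undecodable byte 0x80..0xFF to a lone surrogate in the range U+DC80..U+DCFF, so those code points mark bytes that never decoded properly.

```python
def has_escaped_bytes(text):
    """True if text contains lone surrogates produced by surrogateescape."""
    return any("\udc80" <= ch <= "\udcff" for ch in text)


def strip_escaped_bytes(text):
    """Drop every surrogateescape-produced character from text."""
    return "".join(ch for ch in text if not "\udc80" <= ch <= "\udcff")


# The overlong two-byte encoding of "/" is invalid UTF-8, but
# surrogateescape lets it through as two lone surrogates.
overlong_slash = b"\xc0\xaf"
smuggled = overlong_slash.decode("utf-8", "surrogateescape")

print(has_escaped_bytes(smuggled))
print(repr(strip_escaped_bytes("a" + smuggled + "b")))
```

Re-encoding `smuggled` with the same error handler gives back the original overlong bytes unchanged, which is how such a sequence could pass a string-level check and reach a later component that interprets the raw bytes.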


--
And Clover
mailto:a...@doxdesk.com
http://www.doxdesk.com/

