Re: [Python-Dev] Python-3.0, unicode, and os.environ

Glenn Linderman Sun, 07 Dec 2008 20:52:18 -0800

On approximately 12/7/2008 8:13 PM, came the following characters from the keyboard of Stephen J. Turnbull:

Glenn Linderman writes:
> But if you are interested in checking for security issues, shouldn't you
> _first_ decode into some canonical form,
Yes.  That's all that is being asked for: that Python do strict
decoding to a canonical form by default.  That's a lot to ask, as it
turns out, but that is what we (the minority of strict Unicode
adherents, that is) want.

I have no problem with having strict validation available. But doesn't validation take significantly longer than decoding? So I think it should be logically decoupled... do validation when/where it is needed for security reasons, and allow internal [de]coding to be faster.

I'm mostly indifferent about which should be the default... maybe there shouldn't be a default! Use the "vUTF-8" decoder for strict validation, and the "fUTF-8" decoder for the faster, non-validating version. Or something like that. With appropriate documentation. Of course, "UTF-8" already exists... as "fUTF-8", so for compatibility, I guess it shouldn't change... but it could be deprecated.
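
A minimal sketch of the two-decoder idea in Python 3. The function names `decode_vutf8` and `decode_futf8` are hypothetical and only mirror the "vUTF-8" / "fUTF-8" labels above; no such codecs exist in the standard library. The built-in UTF-8 decoder in current Python 3 always checks for malformed input, and the error handler only decides whether a problem raises or is replaced, so this illustrates the strict/lenient split and says nothing about speed.

```python
def decode_vutf8(data: bytes) -> str:
    """Hypothetical "vUTF-8": raise UnicodeDecodeError on any malformed input."""
    return data.decode("utf-8", errors="strict")


def decode_futf8(data: bytes) -> str:
    """Hypothetical "fUTF-8": never raise; malformed bytes become U+FFFD."""
    return data.decode("utf-8", errors="replace")


# b"\xc0\xaf" is an overlong (and therefore invalid) encoding of "/".
suspicious = b"a\xc0\xafb"

try:
    decode_vutf8(suspicious)
except UnicodeDecodeError as exc:
    print("strict decoder rejected input:", exc.reason)

# The lenient decoder returns a string instead of raising.
print(repr(decode_futf8(suspicious)))
```

Callers that need validation for security reasons would pick the strict function explicitly; everything else could use the lenient one.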

You didn't address the issue that if the decoding to a canonical form is done first, many of the insecurities just go away, so why throw errors?
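
The "decode first, then check" order can be sketched as follows. This is one possible reading, with assumptions: "canonical form" is taken to mean a lenient UTF-8 decode followed by NFKC normalization, and `canonicalize` and `is_plain_name` are hypothetical helpers invented for this illustration. The security check here is a toy (is the value a single file name with no path separator?), not a complete defence.

```python
import unicodedata


def canonicalize(raw: bytes) -> str:
    """Decode without raising, then normalize (assumed canonical form: NFKC)."""
    text = raw.decode("utf-8", errors="replace")
    return unicodedata.normalize("NFKC", text)


def is_plain_name(raw: bytes) -> bool:
    """Toy check, applied only AFTER canonicalization: no separator, not '.' or '..'."""
    text = canonicalize(raw)
    return "/" not in text and text not in (".", "..")


# An ordinary file name passes.
print(is_plain_name(b"report.txt"))

# Fullwidth dots and solidus normalize to "../etc", so the check catches them.
print(is_plain_name("\uff0e\uff0e\uff0fetc".encode("utf-8")))

# The overlong "/" bytes never turn into a real "/" after decoding.
print("/" in canonicalize(b"..\xc0\xafetc"))
```

Because the check runs on the canonical text rather than on the raw bytes, disguised forms of the same character are either folded into the form the check already looks for or replaced by U+FFFD, and no exception is needed along the way.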



--
Glenn -- http://nevcal.com/
===========================
A protocol is complete when there is nothing left to remove.
-- Stuart Cheshire, Apple Computer, regarding Zero Configuration Networking
