On Sun, Dec 7, 2008 at 9:45 PM, Glenn Linderman <[EMAIL PROTECTED]> wrote:
> On approximately 12/7/2008 8:13 PM, came the following characters from the
> keyboard of Stephen J. Turnbull:
>>
>> Glenn Linderman writes:
>>
>>  > But if you are interested in checking for security issues, shouldn't
>>  > you _first_ decode into some canonical form,
>>
>> Yes.  That's all that is being asked for: that Python do strict
>> decoding to a canonical form by default.  That's a lot to ask, as it
>> turns out, but that is what we (the minority of strict Unicode
>> adherents, that is) want.
>
>
> I have no problem with having strict validation available.  But doesn't
> validation take significantly longer than decoding?  So I think it should be
> logically decoupled... do validation when/where it is needed for security
> reasons, and allow internal [de]coding to be faster.

I'd like to see benchmarks of such a claim.

> I'm mostly indifferent about which should be the default... maybe there
> shouldn't be a default!  Use the "vUTF-8" decoder for strict validation, and
> the "fUTF-8" decoder for the faster, non-validating version.  Or something
> like that.  With appropriate documentation.  Of course, "UTF-8" already
> exists... as "fUTF-8", so for compatibility, I guess it shouldn't change...
> but it could be deprecated.
>
>
> You didn't address the issue that if the decoding to a canonical form is
> done first, many of the insecurities just go away, so why throw errors?

Unicode is intended to allow interaction between various bits of
software.  It may be that a library checked the data while it was still
UTF-8 bytes, then passed it to Python.  It would be nice if the library
validated too, but a major advantage of UTF-8 is that older libraries
(or protocols!) intended for ASCII need only be 8-bit clean to be
repurposed for UTF-8.  Their security checks continue to work, so long
as nobody downstream introduces problems with a non-validating decoder.

--
Adam Olsen, aka Rhamphoryncus
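
To make the last point concrete, here is a minimal sketch of how a byte-level check written for ASCII can be defeated by a non-validating decoder further down the line. The names byte_level_check, lenient_decode and strict_decode are hypothetical and exist only for this illustration; lenient_decode is a deliberately naive toy (1- and 2-byte sequences only), not any real library's decoder. The strict path uses bytes.decode("utf-8") as it behaves in current CPython 3, which rejects overlong forms.

```python
def byte_level_check(data: bytes) -> bool:
    """ASCII-era security check: accept input only if it has no '/' byte."""
    return b"/" not in data


def lenient_decode(data: bytes) -> str:
    """Hypothetical non-validating decoder (toy: 1- and 2-byte sequences).

    It does the bit arithmetic but never rejects overlong encodings.
    """
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            # Plain ASCII byte.
            out.append(chr(b))
            i += 1
        elif 0xC0 <= b <= 0xDF and i + 1 < len(data):
            # 2-byte sequence, combined without checking the result is >= 0x80.
            out.append(chr(((b & 0x1F) << 6) | (data[i + 1] & 0x3F)))
            i += 2
        else:
            raise ValueError("sequence not handled by this sketch")
    return "".join(out)


def strict_decode(data: bytes) -> str:
    """Validating decode: raises UnicodeDecodeError on ill-formed UTF-8."""
    return data.decode("utf-8", errors="strict")


# 0xC0 0xAF is an overlong (ill-formed) encoding of '/' (U+002F).
payload = b"..\xc0\xafetc"

passed_check = byte_level_check(payload)   # the upstream check sees no '/'
lenient_result = lenient_decode(payload)   # a '/' appears after decoding

try:
    strict_decode(payload)
    strict_rejected = False
except UnicodeDecodeError:
    strict_rejected = True

print(passed_check, repr(lenient_result), strict_rejected)
```

In this sketch the upstream check passes, the lenient decoder then produces a string containing '/', and the strict decoder refuses the input outright, so the upstream check stays meaningful only in the strict case.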