On Sun, Dec 7, 2008 at 9:45 PM, Glenn Linderman <[EMAIL PROTECTED]> wrote:
> On approximately 12/7/2008 8:13 PM, came the following characters from the
> keyboard of Stephen J. Turnbull:
>>
>> Glenn Linderman writes:
>>
>>  > But if you are interested in checking for security issues, shouldn't
>>  > you _first_ decode into some canonical form,
>>
>> Yes.  That's all that is being asked for: that Python do strict
>> decoding to a canonical form by default.  That's a lot to ask, as it
>> turns out, but that is what we (the minority of strict Unicode
>> adherents, that is) want.
>
>
> I have no problem with having strict validation available.  But doesn't
> validation take significantly longer than decoding?  So I think it should be
> logically decoupled... do validation when/where it is needed for security
> reasons, and allow internal [de]coding to be faster.

I'd like to see benchmarks of such a claim.


> I'm mostly indifferent about which should be the default... maybe there
> shouldn't be a default!  Use the "vUTF-8" decoder for strict validation, and
> the "fUTF-8" decoder for the faster, non-validating version.  Or something
> like that.  With appropriate documentation.  Of course, "UTF-8" already
> exists... as "fUTF-8", so for compatibility, I guess it shouldn't change...
> but it could be deprecated.
>
>
> You didn't address the issue that if the decoding to a canonical form is
> done first, many of the insecurities just go away, so why throw errors?

Unicode is intended to allow interaction between various pieces of
software.  A library may have checked the data while it was still
UTF-8, then passed it to Python.  It would be nice if that library
validated too, but a major advantage of UTF-8 is that older libraries
(or protocols!) intended for ASCII need only be 8-bit clean to be
repurposed for UTF-8.  Their security checks continue to work, so
long as nobody downstream introduces problems with a non-validating
decoder.
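
To make the failure mode concrete, here is a rough sketch of how an
overlong sequence can slip past a byte-level check written for ASCII.
The names naive_decode and ascii_era_check are hypothetical and exist
only for this illustration; naive_decode stands in for a non-validating
decoder and is not any real codec.  The last part assumes a current
Python 3, where the built-in "utf-8" codec is strict.

```python
def naive_decode(data: bytes) -> str:
    """Hypothetical non-validating decoder (illustration only).

    Handles 1- and 2-byte sequences and accepts overlong forms,
    which a conforming UTF-8 decoder must reject.
    """
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            # Plain ASCII byte.
            out.append(chr(b))
            i += 1
        elif 0xC0 <= b < 0xE0 and i + 1 < len(data):
            # 2-byte lead: combine payload bits without checking
            # that the result actually needed two bytes.
            out.append(chr(((b & 0x1F) << 6) | (data[i + 1] & 0x3F)))
            i += 2
        else:
            raise ValueError("sequence not handled by this sketch")
    return "".join(out)


def ascii_era_check(data: bytes) -> bool:
    """Hypothetical byte-level check written for ASCII: no '/' allowed."""
    return b"/" not in data


# 0xC0 0xAF is an overlong (invalid) encoding of "/" (U+002F).
payload = b"..\xc0\xaf..\xc0\xafetc"

passed_check = ascii_era_check(payload)   # the check sees no 0x2F byte
smuggled = naive_decode(payload)          # the lax decoder produces slashes

# A strict decoder refuses the same bytes instead.
try:
    payload.decode("utf-8")
    strict_rejected = False
except UnicodeDecodeError:
    strict_rejected = True
```

The upstream check is sound for well-formed UTF-8, because in valid
UTF-8 the byte 0x2F can only ever mean "/".  It is the lax decoder
downstream that turns a harmless byte string into a path with
separators in it.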


-- 
Adam Olsen, aka Rhamphoryncus
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev