>
> On Windows, the Wide APIs are already used throughout the code base,
> e.g. SetEnvironmentVariableW/_wenviron. If you need to find out the
> specific API for a specific functionality, please read the source code.
> [...]
>
> No, I don't assume that. I assume that all functions are strictly
> available in a Wide character version, and have verified that they are.


The wide APIs use UTF-16.  UTF-16 suffers from the same problem as UTF-8:
not all sequences of 16-bit code units are valid UTF-16 sequences.  In
particular, sequences containing isolated (unpaired) surrogates are not
well-formed according to the Unicode standard.  Therefore, the existence of
a wide character API function does not guarantee that the wide character
strings it returns can be converted into valid Unicode strings.  And, in
fact, Windows Vista happily creates files whose names are malformed UTF-16,
and os.listdir() happily returns them.
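
To make the well-formedness point concrete, here is a minimal sketch.  It
assumes Python 3's strict "utf-16-le" codec as the validity check; the
helper name is_well_formed_utf16 is hypothetical and only for illustration.

```python
import struct


def is_well_formed_utf16(units):
    """Return True if the 16-bit code units form well-formed UTF-16.

    Hypothetical helper: it packs the units as little-endian 16-bit words
    and lets the strict utf-16-le decoder accept or reject them.
    """
    data = struct.pack("<%dH" % len(units), *units)
    try:
        data.decode("utf-16-le")
    except UnicodeDecodeError:
        return False
    return True


# A high surrogate followed by a low surrogate is a valid pair.
paired = [0xD83D, 0xDE00]

# A high surrogate followed by an ordinary character is isolated.
isolated = [0x0041, 0xD800, 0x0042]

print(is_well_formed_utf16(paired))
print(is_well_formed_utf16(isolated))
```

Both sequences are perfectly legal arrays of 16-bit words, which is all a
wide character API requires; only the first is well-formed UTF-16.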


> If you can crash Python that way,
> nothing gets worse by this PEP - you can then *already* crash Python
> in that way.


Yes, but AFAIK, Python does not currently have functions that, as part of
correct usage and normal operation, are intended to generate malformed
Unicode strings.

Under your proposal, passing the output from a correctly implemented file
system or other OS function to a correctly written library using Unicode
strings may crash Python.  In order to avoid that, every library that's
built into Python would have to be checked and updated to deal with both the
Unicode standard and your extension to it.
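
A rough sketch of the kind of failure I mean, assuming Python 3 string
semantics.  The function library_encode and the file name are hypothetical
stand-ins for any library routine that expects well-formed Unicode; in this
sketch the failure shows up as an exception from the strict codec.

```python
def library_encode(name):
    """Hypothetical library routine that assumes well-formed Unicode."""
    return name.encode("utf-8")


# A string carrying a lone low surrogate, as an OS function might hand
# back for a name that could not be decoded cleanly (assumed for
# illustration).
name = "report\udc80.txt"

try:
    library_encode(name)
    failed = False
except UnicodeEncodeError:
    # The strict UTF-8 codec refuses to encode lone surrogates.
    failed = True

print(failed)
```

The library here is correct by the Unicode standard, and so is the caller;
the failure comes only from the string in between.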

Tom
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev