Re: [Python-Dev] Support of UTF-16 and UTF-32 source encodings

eryksun Sat, 14 Nov 2015 19:03:12 -0800

On Sat, Nov 14, 2015 at 7:06 PM, Steve Dower <steve.do...@python.org> wrote:
> The native encoding on Windows has been UTF-16 since Windows NT. Obviously
> we've survived without Python tokenization support for a long time, but
> every API uses it.


Windows 2000 was the first version to have broad support for UTF-16.
Windows NT (1993) was released before UTF-16, so its Unicode support
is limited to UCS-2.

(Note that console windows still restrict each character cell to a
single WCHAR character. So a non-BMP character encoded as a UTF-16
surrogate pair always appears as two box glyphs. Of course you can
copy and paste from the console to a UTF-16-aware window just fine.)
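
To illustrate the surrogate-pair point, here is a minimal sketch that lists the 16-bit code units (what Windows calls WCHARs) for a BMP and a non-BMP character. The helper name utf16_code_units is made up for this example, and U+1F600 is just an arbitrary non-BMP code point:

```python
def utf16_code_units(text):
    """Return the UTF-16 code units of text as a list of ints.

    Hypothetical helper for illustration only.
    """
    data = text.encode("utf-16-le")  # little-endian, no BOM
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]


bmp_units = utf16_code_units("\u00e9")          # one code unit
non_bmp_units = utf16_code_units("\U0001F600")  # high + low surrogate

print([hex(u) for u in bmp_units])
print([hex(u) for u in non_bmp_units])
```

A console that allots one character cell per WCHAR would see the second result as two separate cells, neither of which is a displayable character on its own.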

> I've hit a few cases where it would have been handy for Python to be able to
> detect it, though nothing I couldn't work around.

Can you elaborate on some example cases? I can see using UTF-16 for the
REPL in the Windows console, but a hypothetical WinConIO class could
simply transcode to and from UTF-8. Drekin's win-unicode-console
package works like this.
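
A rough sketch of the transcoding idea, assuming the console side hands over UTF-16-LE bytes in arbitrary chunks. The class and function names here are hypothetical and are not taken from win-unicode-console; real console I/O would go through the Windows console API, which this sketch leaves out entirely:

```python
import codecs


class Utf16ToUtf8Transcoder:
    """Hypothetical helper: turn UTF-16-LE chunks into UTF-8 bytes.

    An incremental decoder is used so that a chunk boundary falling
    inside a surrogate pair (or inside a 2-byte code unit) is buffered
    rather than reported as an error.
    """

    def __init__(self):
        self._decoder = codecs.getincrementaldecoder("utf-16-le")()

    def feed(self, chunk, final=False):
        text = self._decoder.decode(chunk, final)
        return text.encode("utf-8")


def utf8_to_utf16(data):
    """Hypothetical helper for the write direction (whole buffers only)."""
    return data.decode("utf-8").encode("utf-16-le")


# Simulate a read that splits a surrogate pair across two chunks.
raw = "a\U0001F600".encode("utf-16-le")  # 2 bytes + 4 bytes
t = Utf16ToUtf8Transcoder()
out = t.feed(raw[:4]) + t.feed(raw[4:], final=True)
print(out == "a\U0001F600".encode("utf-8"))
```

The rest of the interpreter would then only ever see UTF-8, so the tokenizer would not need to know about UTF-16 for this use case.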