On Sat, Nov 14, 2015 at 7:06 PM, Steve Dower <steve.do...@python.org> wrote:
> The native encoding on Windows has been UTF-16 since Windows NT. Obviously
> we've survived without Python tokenization support for a long time, but
> every API uses it.

Windows 2000 was the first version to have broad support for UTF-16.
Windows NT (1993) was released before UTF-16, so its Unicode support
is limited to UCS-2. (Note that console windows still restrict each
character cell to a single WCHAR character, so a non-BMP character
encoded as a UTF-16 surrogate pair always appears as two box glyphs.
Of course, you can copy and paste from the console to a UTF-16-aware
window just fine.)

> I've hit a few cases where it would have been handy for Python to be able to
> detect it, though nothing I couldn't work around.

Can you elaborate on some example cases? I can see using UTF-16 for the
REPL in the Windows console, but a hypothetical WinConIO class could
simply transcode to and from UTF-8. Drekin's win-unicode-console
package works like this.
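
To make the two points above concrete, here is a minimal sketch in plain Python. The function names are made up for illustration; this is not the API of win-unicode-console or of any actual WinConIO class, and it does not touch the console at all. It only shows that a non-BMP character occupies two UTF-16 code units (hence two console cells), and that transcoding between the console's UTF-16 and UTF-8 is a lossless round trip.

```python
def utf16_code_units(text):
    """Return the UTF-16 code units of `text` as a list of ints."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]


def console_to_utf8(wchar_bytes):
    """Hypothetical read path: UTF-16-LE bytes from the console -> UTF-8."""
    return wchar_bytes.decode("utf-16-le").encode("utf-8")


def utf8_to_console(utf8_bytes):
    """Hypothetical write path: UTF-8 bytes -> UTF-16-LE for the console."""
    return utf8_bytes.decode("utf-8").encode("utf-16-le")


if __name__ == "__main__":
    # A BMP character is one WCHAR; a non-BMP character is a surrogate pair.
    print([hex(u) for u in utf16_code_units("A")])
    print([hex(u) for u in utf16_code_units("\U0001F600")])

    # Transcoding through UTF-8 and back preserves the original code units.
    raw = "caf\u00e9 \U0001F600".encode("utf-16-le")
    print(utf8_to_console(console_to_utf8(raw)) == raw)
```

A real implementation would also have to deal with a surrogate pair being split across two reads, which this sketch ignores.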