On 11/10/22, Jessica Smith <12jessicasmit...@gmail.com> wrote:
>
> Weird issue I've found on Windows images in Azure Devops Pipelines and
> Github actions. Printing Unicode characters fails on these images because,
> for some reason, the encoding is mapped to cp1252. What is particularly
> weird about the code page being set to 1252 is that if you execute "chcp"
> it shows that the code page is 65001.

If stdout isn't a console (e.g. a pipe), it defaults to using the process code page (i.e. CP_ACP), such as legacy code page 1252 (extended Latin-1). You can override just sys.std* to UTF-8 by setting the environment variable `PYTHONIOENCODING=UTF-8`. You can override all I/O to use UTF-8 by setting `PYTHONUTF8=1`, or by passing the command-line option `-X utf8`.

Background

The locale system in Windows supports a common system locale, plus a separate locale for each user. By default the process code page is based on the system locale, and the thread code page (i.e. CP_THREAD_ACP) is based on the user locale. The default locale of the Universal C runtime combines the user locale with the process code page. (This combination may be inconsistent.)

In Windows 10 and later, the default process and thread code pages can be configured to use CP_UTF8 (65001). Applications can also override them to UTF-8 in their manifest via the "ActiveCodePage" setting. In either case, if the process code page is UTF-8, the C runtime will use UTF-8 for its default locale encoding (e.g. "en_uk.utf8").

Unlike some frameworks, Python has never used the console input code page or output code page as a locale encoding. Personally, I wouldn't want Python to default to that old MS-DOS behavior. However, I'd be in favor of supporting a "console" encoding that's based on the console input code page that's returned by GetConsoleCP(). If the process doesn't have a console session, the "console" encoding would fall back on the process code page from GetACP().
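
To see the three overrides mentioned at the top of this message in action, here is a minimal sketch that starts a child interpreter with stdout attached to a pipe (the same situation as a CI runner) and reports what the child ends up with for sys.stdout.encoding. The helper name `child_stdout_encoding` and its parameters are made up for illustration; only `PYTHONIOENCODING`, `PYTHONUTF8`, and `-X utf8` come from the text above.

```python
import codecs
import os
import subprocess
import sys


def child_stdout_encoding(env_overrides=None, interpreter_args=()):
    """Return the normalized sys.stdout.encoding of a child Python whose
    stdout is a pipe. (Hypothetical helper, for illustration only.)"""
    env = dict(os.environ)
    # Drop any inherited overrides so each call tests exactly one setting.
    env.pop("PYTHONIOENCODING", None)
    env.pop("PYTHONUTF8", None)
    env.update(env_overrides or {})

    result = subprocess.run(
        [sys.executable, *interpreter_args, "-c",
         "import sys; sys.stdout.write(sys.stdout.encoding)"],
        stdout=subprocess.PIPE,  # stdout is a pipe, not a console
        env=env,
        check=True,
    )
    name = result.stdout.decode("ascii").strip()
    # Normalize spellings such as "UTF-8" or "utf8" to the codec's own name.
    return codecs.lookup(name).name


if __name__ == "__main__":
    print("default:          ", child_stdout_encoding())
    print("PYTHONIOENCODING: ",
          child_stdout_encoding(env_overrides={"PYTHONIOENCODING": "UTF-8"}))
    print("PYTHONUTF8=1:     ",
          child_stdout_encoding(env_overrides={"PYTHONUTF8": "1"}))
    print("-X utf8:          ",
          child_stdout_encoding(interpreter_args=("-X", "utf8")))
```

The "default" line depends on the machine: on a Windows image whose process code page is 1252 it would be expected to show cp1252, while the other three lines should report utf-8 regardless of the code page.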