Hello!

Please help me solve a problem with Cyrillic UTF-8 output in PyPy on Windows 10.

In the PyPy console:

    Python 3.9.10 (b332b321bbaa72bffb0207da5b7fe4c38047d3b2, Mar 16 2022, 16:03:21)
    [PyPy 7.3.9 with MSC v.1929 64 bit (AMD64)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>>> print ("АБВвба")
    UnicodeEncodeError: 'utf-8' codec can't encode character '\udc80' in position 8: surrogates not allowed

In Visual Studio Code + PyPy 7.3.9:

    print ("АБВвба")
    ╨Р╨С╨Т╨▓╨▒╨░

This is not the normal output. For comparison, CPython prints the string correctly:

    Python 3.10.5 (tags/v3.10.5:f377153, Jun 6 2022, 16:14:13) [MSC v.1929 64 bit (AMD64)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> print ("АБВвба")
    АБВвба

Because of this behavior, all operations on strings containing Cyrillic give incorrect results. Is it possible to solve this problem?

I tried using "setlocale" by analogy with C, assuming that the code is translated to C, but this does not work in PyPy:

    from locale import setlocale, LC_ALL
    setlocale(LC_ALL, "ru_RU.UTF-8")

Maybe it is necessary to add a locale check to the PyPy sources, along these lines:

    #include <stdio.h>
    #include <conio.h>
    #include <locale.h>

    int main()
    {
        setlocale(LC_ALL, "ru_RU.UTF-8");
        printf("АБВвба");
        _getch();
    }

I can't do it myself: I have only just started learning C and don't have enough knowledge for this yet, but I really want to learn :)

Interestingly, when this C program is compiled with GCC (MinGW or MinGW64), the Cyrillic output is incorrect, but when it is compiled with Clang (LLVM), everything works correctly.

Thank you in advance for any answer!

With great respect,
Max.
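
A minimal diagnostic sketch that may help narrow this down. It assumes the garbled VS Code output is UTF-8 bytes being displayed in the cp866 console code page; that is a guess based on the pattern of the output, not something confirmed for PyPy or for this machine. The names `text`, `mojibake` and `matches_report` are illustrative only.

```python
import sys

text = "АБВвба"

# Encode as UTF-8, then decode the same bytes as cp866 (the legacy
# Russian OEM console code page; assumed here, not confirmed).
mojibake = text.encode("utf-8").decode("cp866")

# Compare with the garbled string reported from Visual Studio Code.
matches_report = (mojibake == "╨Р╨С╨Т╨▓╨▒╨░")
print("matches garbled output:", matches_report)

# Show which encoding the interpreter believes stdout is using.
print("stdout encoding:", sys.stdout.encoding)
```

If the comparison holds, the bytes written by PyPy are probably valid UTF-8 and the terminal is interpreting them in a different code page. In that case it may be worth checking the console code page (for example with `chcp`) and the `PYTHONIOENCODING` environment variable; whether either changes PyPy 7.3.9's behavior has not been verified here.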