Hello everyone,
I am new to R and have run into some bugs when using Rterm on Windows. Rterm discards Chinese characters in console output, and typing them into the console crashes the application.

---ENVIRONMENT---
Platform = x86_64-w64-mingw32
OS = Windows 10 Pro 1709 chs
R version = 3.4.3
Active code page = 936 (Simplified Chinese)

---STEPS TO REPRODUCE---
1. Run cmd and start bin\x64\R.exe
2. Note that all Chinese characters in the startup banner are missing
3. > Sys.getlocale()
   [1] "LC_COLLATE=Chinese (Simplified)_China.936;LC_CTYPE=Chinese (Simplified)_China.936;LC_MONETARY=Chinese (Simplified)_China.936;LC_NUMERIC=C;LC_TIME=Chinese (Simplified)_China.936"
4. > print("ABC\u4f60\u597dDEF")
   [1] "ABCDEF"
   (\u4f60\u597d are the Unicode code points for "你好")
5. Use Microsoft Pinyin IME to type "你好" into the console. An error message appeared:
   > invalid multibyte character in mbcs_get_next
   Then the program crashed. My debugger reported a heap corruption, displayed as follows:
   Unhandled exception at 0x00007FFE2F3687BB (ntdll.dll) in Rterm.exe: 0xC0000374: A heap has been corrupted (parameters: 0x00007FFE2F3CC6E0).
   However, if the text is pasted into the console instead of typed, Rterm does not crash.

---ADDITIONAL INFO---
Both the 32-bit and 64-bit versions have the same problem.

I attached a debugger to observe Rterm's behavior. The command in step 4 produced the following sequence of calls to the C library function "fputc":

fputc ( 91, 0x00007ffe2d1aea40 ) //'['
fputc ( 49, 0x00007ffe2d1aea40 ) //'1'
fputc ( 93, 0x00007ffe2d1aea40 ) //']'
//fflush ( 0x00007ffe2d1aea40 )
fputc ( 32, 0x00007ffe2d1aea40 ) //' '
fputc ( 34, 0x00007ffe2d1aea40 ) //'\"'
fputc ( 65, 0x00007ffe2d1aea40 ) //'A'
fputc ( 66, 0x00007ffe2d1aea40 ) //'B'
fputc ( 67, 0x00007ffe2d1aea40 ) //'C'
fputc ( 196, 0x00007ffe2d1aea40 ) //FAILED!
fputc ( 227, 0x00007ffe2d1aea40 ) //FAILED!
fputc ( 186, 0x00007ffe2d1aea40 ) //FAILED!
fputc ( 195, 0x00007ffe2d1aea40 ) //FAILED!
fputc ( 68, 0x00007ffe2d1aea40 ) //'D'
fputc ( 69, 0x00007ffe2d1aea40 ) //'E'
fputc ( 70, 0x00007ffe2d1aea40 ) //'F'
fputc ( 34, 0x00007ffe2d1aea40 ) //'\"'
//fflush ( 0x00007ffe2d1aea40 )
fputc ( 10, 0x00007ffe2d1aea40 ) //'\n'

{196, 227, 186, 195} or {C4 E3 BA C3} is the multi-byte encoding of "你好" in GBK (code page 936). These four calls failed with a Windows error code 28 (No space left on device), while the subsequent calls to fputc succeeded.

I then used C++ to implement a terminal front-end with the REmbedded facilities. R output was simply printf-ed to stdout. Everything worked as expected:

Initializing R environment
R version 3.4.3 detected
> print("<Chinese text>R is great!")
[1] "<Chinese text>R is great!"
> Sys.getlocale()
[1] "LC_COLLATE=Chinese (Simplified)_China.936;LC_CTYPE=Chinese (Simplified)_China.936;LC_MONETARY=Chinese (Simplified)_China.936;LC_NUMERIC=C;LC_TIME=Chinese (Simplified)_China.936"
>

I hope this information is helpful.

Best regards,
AzureFx
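
P.S. The byte values in the fputc trace can be cross-checked on any machine, without a code page 936 console. The following is only a minimal sketch in Python, not part of R or Rterm; the helper name gbk_bytes is made up for illustration. It encodes the string from step 4 as GBK and lists the bytes that would be passed to fputc one at a time.

```python
# Cross-check of the byte values seen in the fputc trace.
# Assumption: Python's "gbk" codec stands in for Windows code page 936
# for these two characters.

text = "ABC\u4f60\u597dDEF"  # the string printed in step 4


def gbk_bytes(s):
    """Return the GBK-encoded bytes of s as a list of integers."""
    return list(s.encode("gbk"))


if __name__ == "__main__":
    for b in gbk_bytes(text):
        # Bytes below 0x80 are plain ASCII; the others belong to a
        # double-byte GBK character.
        note = repr(chr(b)) if b < 0x80 else "part of a double-byte character"
        print(f"{b:3d} (0x{b:02X})  {note}")
```

The four values above 127 in that listing are the same ones for which fputc failed in the trace, which suggests the failure is tied to the non-ASCII bytes rather than to their position in the output.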
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel