Unicode support for Windows 10 console

ktamp Sat, 01 Feb 2020 05:30:46 -0800

I've tried Nim programs in the Windows 10 console and found that Unicode 
support has some serious issues.


I've used Nim v1.0.6 with either MinGW-W64 v4.3.0 or MinGW-W64 v8.1.0 as the 
backend.

Before running a Nim program, **chcp** returns **"Active code page: 737"** 
(that is, **"OEM Greek"**).  
After running a Nim program, **chcp** returns **"Active code page: 65001"** 
(that is, **"UTF-8"**).  
So we know for sure that a Nim program successfully sets the code page to 
**"UTF-8"**.  
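The same check can be made from inside the program instead of with **chcp**. The following is only a sketch: `GetConsoleCP` and `GetConsoleOutputCP` are Win32 functions exported by kernel32, but the Nim-side names (`getConsoleCP`, `getConsoleOutputCP`, `consoleCodePages`) are made up for this example, and the non-Windows branch simply returns zeros.

```nim
# Sketch: report the console input and output code pages.
# The Nim wrapper names are hypothetical; only the imported
# Win32 symbols GetConsoleCP / GetConsoleOutputCP are real.
when defined(windows):
  proc getConsoleCP(): cuint {.stdcall, dynlib: "kernel32",
                              importc: "GetConsoleCP".}
  proc getConsoleOutputCP(): cuint {.stdcall, dynlib: "kernel32",
                                    importc: "GetConsoleOutputCP".}

proc consoleCodePages(): tuple[inCp, outCp: int] =
  ## Returns the input and output code pages.
  ## On non-Windows systems both fields are 0.
  when defined(windows):
    result.inCp = int(getConsoleCP())
    result.outCp = int(getConsoleOutputCP())
  else:
    result.inCp = 0
    result.outCp = 0

when isMainModule:
  let cps = consoleCodePages()
  echo "input code page: ", cps.inCp, ", output code page: ", cps.outCp
```

The input and output code pages are separate settings, so it is worth printing both.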


Now take the following short code as an example:
    
    
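    import strutils
    import unicode
    
    # Read one line from the console and echo it back.
    let s = stdin.readLine
    echo s
    # Print each UTF-8 encoded character as hex bytes, followed by the character.
    for cp in s.utf8:
      echo cp.toHex, ' ', if cp == "\e": "ESC" else: cp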
    

With Latin character input, the code works as expected:

**Input:**
    
    
    test
    

**Output:**
    
    
    test
    74 t
    65 e
    73 s
    74 t
    

With Greek character input, the code produces some interesting results:

**Input:**
    
    
    τεστ
    

**Output:**
    
    
    00
    00
    00
    00
    

If **SetConsoleCP()** and **SetConsoleOutputCP()** are used to set the code 
page to **"737"**, the code appears to work, even though the **utf8()** iterator 
then returns single code page 737 bytes rather than UTF-8 sequences:

**Input:**
    
    
    τεστ
    

**Output:**
    
    
    τεστ
    AB τ
    9C ε
    A9 σ
    AB τ
    
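For reference, switching the code page from Nim can be sketched roughly as below. `SetConsoleCP` and `SetConsoleOutputCP` are the Win32 functions named above; the wrapper names and the `useCodePage` helper are hypothetical and not necessarily how the test was actually done. On non-Windows systems the helper does nothing and returns false.

```nim
# Sketch: set both console code pages through the Win32 API.
# Wrapper names are hypothetical; the imported symbols are the
# real kernel32 functions SetConsoleCP and SetConsoleOutputCP.
when defined(windows):
  proc setConsoleCP(codePage: cuint): int32 {.stdcall, dynlib: "kernel32",
                                             importc: "SetConsoleCP".}
  proc setConsoleOutputCP(codePage: cuint): int32 {.stdcall, dynlib: "kernel32",
                                                   importc: "SetConsoleOutputCP".}

proc useCodePage(codePage: int): bool =
  ## Sets the input and output code pages.
  ## Returns true only if both Win32 calls report success.
  when defined(windows):
    result = setConsoleCP(cuint(codePage)) != 0 and
             setConsoleOutputCP(cuint(codePage)) != 0
  else:
    result = false

when isMainModule:
  if not useCodePage(737):
    echo "could not switch the console to code page 737"
```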

On Linux the code works as expected.
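To make "as expected" concrete, here is a small sketch that feeds the same Greek string to the **utf8()** iterator as a literal, bypassing the console entirely. The `dumpUtf8` helper is hypothetical and exists only for this illustration. Each Greek letter is two bytes in UTF-8, so every line should show four hex digits, unlike the single bytes seen with code page 737.

```nim
import strutils, unicode

proc dumpUtf8(s: string): seq[string] =
  ## Returns one "HEX char" line per UTF-8 encoded character in `s`.
  for cp in s.utf8:
    result.add(cp.toHex & " " & cp)

when isMainModule:
  for line in dumpUtf8("τεστ"):
    echo line
  # Prints:
  # CF84 τ
  # CEB5 ε
  # CF83 σ
  # CF84 τ
```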

What are your suggestions?
