[issue23424] Unicode character ends interactive session

2015-02-13 Thread Terry J. Reedy

Changes by Terry J. Reedy:


--
resolution:  -> duplicate
stage:  -> resolved
status: open -> closed
superseder:  -> windows console doesn't print or input Unicode

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23424] Unicode character ends interactive session

2015-02-09 Thread eryksun

eryksun added the comment:

This isn't a Python bug. The Windows console doesn't properly support UTF-8. 
See issue 1602 and Drekin's win-unicode-console, an alternative REPL based on 
the wide-character (UCS-2) console API.

FWIW, I attached a debugger to conhost.exe under Windows 7 to inspect what's 
happening here. In the client, the CRT's read() function calls WinAPI ReadFile. 
For a console handle this calls either ReadConsoleA or (in Windows 8+) 
NtReadFile. Either way, most of the action happens in the server process, 
conhost.exe. 

The server's input buffer is Unicode, which gets encoded to CP 65001 (UTF-8) by 
calling WideCharToMultiByte. However, the server incorrectly assumes the current 
codepage is a Windows ANSI codepage with a one-to-one mapping, i.e. that each 
16-bit wchar_t maps to an 8-bit char in the current codepage. Since 'ł' gets 
UTF-8 encoded as the two-byte string b'\xc5\x82', the allocated buffer is too 
small by a byte. The server doesn't recover from this failure by allocating a 
larger buffer. It just reports back to the client process that it read 0 bytes. 
The CRT in turn sets the end-of-file (EOF) flag on the stdin FILE stream, which 
causes Python to exit 'normally'.
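
The size mismatch can be illustrated with a small Python sketch. This is only a 
toy model of the behaviour described above, not conhost.exe's actual code; the 
function name simulated_console_read and the one-byte-per-wchar_t sizing rule 
are assumptions made for illustration.

```python
def simulated_console_read(text, codepage):
    """Toy model: the buffer holds one byte per 16-bit code unit of input."""
    # Count UTF-16 code units, which is what the server is assumed to size by.
    wchar_count = len(text.encode("utf-16-le")) // 2
    encoded = text.encode(codepage)
    if len(encoded) > wchar_count:
        # Buffer too small: model the reported "0 bytes read" instead of a retry.
        return b""
    return encoded


line = "'ł'\r\n"  # what the user typed, plus the line ending

cp1250_result = simulated_console_read(line, "cp1250")
utf8_result = simulated_console_read(line, "utf-8")

# In cp1250 every character here is one byte, so the read succeeds.
print(len(cp1250_result))
# In UTF-8 'ł' needs two bytes, so the modelled read returns nothing,
# which a caller treats as end-of-file.
print(len(utf8_result))
```

An empty result from a read is how EOF is signalled to the caller, which is why 
the interpreter exits without any error message.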

--
nosy: +eryksun




[issue23424] Unicode character ends interactive session

2015-02-09 Thread STINNER Victor

STINNER Victor added the comment:

This issue looks like a duplicate of issue #1602: windows console doesn't 
print or input Unicode. It's a limitation of Windows, not of Python itself. 
Python supports any Unicode character if the output is written to a file 
(encoded in UTF-8).
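
As a minimal sketch of that last point, the following writes the characters 
from the report to a UTF-8 file and reads them back. The temporary directory 
and the file name out.txt are arbitrary choices for the example.

```python
import os
import tempfile

text = "łąśćńó"

# Write to a file with an explicit encoding instead of relying on the console.
path = os.path.join(tempfile.mkdtemp(), "out.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write(text)

# Read the raw bytes back to confirm they are valid UTF-8 for the same text.
with open(path, "rb") as f:
    raw = f.read()

round_tripped = raw.decode("utf-8")
print(round_tripped == text)
```

Passing encoding="utf-8" explicitly avoids depending on the console code page 
or the locale's default encoding.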

Workaround: use IDLE or another Python "REPL" (interactive interpreter) that 
has better Unicode support.

--




[issue23424] Unicode character ends interactive session

2015-02-09 Thread Grzegorz Abramczyk

New submission from Grzegorz Abramczyk:

Entering some Unicode characters (like 'łąśćńó...') causes the interactive 
session to abort.

When the console session is set to use the UTF-8 code page (65001), the session 
abruptly ends as soon as a diacritic character appears in a string. The debug 
output suggests that some cleanup is performed, but there are no error messages 
indicating what caused the problem.

The problem was spotted on Windows 10 (technical preview), but I may try to 
replicate it on a released operating system.

---
C:\>chcp 1250
Active code page: 1250

C:\>python -i
Python 3.4.2 (v3.4.2:ab2c023a9432, Oct  6 2014, 22:15:05) [MSC v.1600 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> 'ł'
'ł'
>>> exit()

C:\>chcp 65001
Active code page: 65001

C:\>python -i
Python 3.4.2 (v3.4.2:ab2c023a9432, Oct  6 2014, 22:15:05) [MSC v.1600 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> 'ł'


C:\

--
components: Unicode, Windows
files: -v.txt
messages: 235629
nosy: AGrzes, ezio.melotti, haypo, steve.dower, tim.golden, zach.ware
priority: normal
severity: normal
status: open
title: Unicode character ends interactive session
type: crash
versions: Python 3.4
Added file: http://bugs.python.org/file38061/-v.txt
