[issue22742] IDLE shows traceback when printing non-BMP character

2019-10-08 Thread Serhiy Storchaka


Serhiy Storchaka  added the comment:

And with PR 16583 it is now completely fixed. I.e. it can only fail in cases 
when the regular interactive interpreter fails too.

--
resolution:  -> fixed
stage: needs patch -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22742] IDLE shows traceback when printing non-BMP character

2019-10-04 Thread Terry J. Reedy

Terry J. Reedy  added the comment:

Printing the unquoted escape representation rather than a replacement char is a 
bit strange and not what I expect from the python docs.  I could see it as a 
bug.  In any case, on Windows, it is the Python REPL that raises, but only for 
sys.stdout.

>>> import sys
>>> print('\ud800', file=sys.stderr)
\ud800
>>> print('\ud800', file=sys.stdout)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'utf-8' codec can't encode character '\ud800' in position 0: surrogates not allowed

whereas in IDLE on Windows the surrogate is displayed as a box with diagonal 
lines ([X] compressed into one char) in both cases.  When copied and pasted 
into Firefox, the pasted surrogate shows as a square box containing mini 
D 8 0 0 chars.
>>> print('\ud800', file=sys.stdout)
�
>>> print('\ud800', file=sys.stderr)
�

I consider it the proper thing for tk to put the undisplayable codepoint, 
rather than a replacement character, into the editor buffer (however tcl 
encodes it), so that IDLE can retrieve it without loss of information. IDLE 
can then potentially identify the character to the user.
===

An oddity though.  With

>>> import tkinter as tk
>>> r = tk.Tk()
>>> t = tk.Text(r)
>>> t.pack()
>>> t.insert('insert', 'a\ud800b')

the box is an empty square, not crossed.  But when I copy-paste 'a�b' into the 
font sample (Serhiy, making this editable was a great idea), it is crossed for 
every font I tried, even for Courier, which is what is being used in text t.

--
stage:  -> needs patch




[issue22742] IDLE shows traceback when printing non-BMP character

2019-10-04 Thread Serhiy Storchaka


Serhiy Storchaka  added the comment:

It was fixed for all valid Unicode characters, but you can still get an error 
when printing a surrogate character to stderr on Linux:

>>> import sys
>>> print('\ud800', file=sys.stderr)
Traceback (most recent call last):
  File "", line 1, in <module>
    print('\ud800', file=sys.stderr)
UnicodeEncodeError: 'utf-8' codec can't encode character '\ud800' in position 0: surrogates not allowed

In the Python REPL you get an escape sequence.

>>> import sys
>>> print('\ud800', file=sys.stderr)
\ud800
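
The difference comes down to the error handler attached to each text stream: in the regular interpreter sys.stderr is normally created with errors='backslashreplace', while sys.stdout is usually strict. The sketch below reproduces both behaviours with in-memory streams; the helper name encode_via is an assumption made for this example.

```python
import io


def encode_via(errors, text="\ud800"):
    """Write text through a UTF-8 text stream using the given error handler.

    Hypothetical helper: it stands in for sys.stdout or sys.stderr so the
    effect of the error handler can be observed without touching the console.
    """
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding="utf-8", errors=errors)
    stream.write(text)   # raises UnicodeEncodeError when errors="strict"
    stream.flush()
    stream.detach()      # keep raw open after the wrapper goes away
    return raw.getvalue()


# Same handler as the default sys.stderr: the surrogate becomes an escape.
print(encode_via("backslashreplace"))

# Strict handler: the write fails, as in the tracebacks in this thread.
try:
    encode_via("strict")
except UnicodeEncodeError as exc:
    print("strict stream failed:", exc.reason)
```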

--
resolution: fixed -> 
stage: resolved -> 
status: closed -> open




[issue22742] IDLE shows traceback when printing non-BMP character

2019-10-04 Thread Serhiy Storchaka


Serhiy Storchaka  added the comment:

Fixed by PR 16545 (see issue13153).

--
nosy: +serhiy.storchaka
resolution:  -> fixed
stage: needs patch -> resolved
status: open -> closed




[issue22742] IDLE shows traceback when printing non-BMP character

2019-04-24 Thread Martin Panter

Martin Panter  added the comment:

I haven’t looked at the code, but I suspect Idle implements a custom 
“sys.displayhook”:

>>> help(sys.displayhook)
Help on function displayhook in module idlelib.rpc:

displayhook(value)
    Override standard display hook to use non-locale encoding

>>> sys.displayhook('\N{ROCKET}')
'\U0001f680'
>>> sys.__displayhook__('\N{ROCKET}')
Traceback (most recent call last):
  File "", line 1, in <module>
    sys.__displayhook__('\N{ROCKET}')
  File "/usr/lib/python3.5/idlelib/PyShell.py", line 1344, in write
    return self.shell.write(s, self.tags)
UnicodeEncodeError: 'UCS-2' codec can't encode characters in position 1-1: Non-BMP character not supported in Tk
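
To make the idea of a display hook with an encoding fallback concrete, here is a minimal sketch. It is not the idlelib.rpc implementation; the names safe_displayhook and BmpOnlyStream are hypothetical, and the stream class only imitates a Tk widget that rejects non-BMP characters.

```python
import builtins
import contextlib
import io
import sys


def safe_displayhook(value):
    """Echo repr(value), falling back to ascii(value) if the write fails.

    Hypothetical sketch, modelled on the documented behaviour of
    sys.displayhook; not the code IDLE uses.
    """
    if value is None:
        return
    builtins._ = None  # avoid recursion while repr() runs
    try:
        sys.stdout.write(repr(value))
    except UnicodeEncodeError:
        sys.stdout.write(ascii(value))  # pure-ASCII escapes always fit
    sys.stdout.write("\n")
    builtins._ = value


class BmpOnlyStream(io.StringIO):
    """Stand-in for a stream that cannot accept characters above U+FFFF."""

    def write(self, s):
        for i, ch in enumerate(s):
            if ord(ch) > 0xFFFF:
                raise UnicodeEncodeError(
                    "UCS-2", s, i, i + 1,
                    "Non-BMP character not supported in Tk")
        return super().write(s)


out = BmpOnlyStream()
with contextlib.redirect_stdout(out):
    safe_displayhook("\N{ROCKET}")  # first write raises, fallback is used
    safe_displayhook("abc")         # plain text goes through unchanged
shown = out.getvalue()
print(shown, end="")
```

Installing such a hook would be a matter of assigning it to sys.displayhook; the explicit print() path is separate and is not affected by it, which matches the asymmetry reported in this issue.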

--
nosy: +martin.panter




[issue22742] IDLE shows traceback when printing non-BMP character

2019-04-22 Thread Terry J. Reedy

Terry J. Reedy  added the comment:

On my puzzlement above: repr(s) is a string of 3 characters -- s bracketed by 
quote characters.  print(repr(s)) fails.  I am not sure how s gets expanded to 
the full escape in IDLE.  ascii(s) expands all non-ASCII characters and adds 
extra quotes.  Need to check Shell code.
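
The difference between repr() and ascii() for an astral character can be checked directly. This is a small sketch that avoids printing the character itself, so it runs on any console.

```python
s = "\U0001f603"

# repr() keeps the printable astral character and only adds the quotes.
r = repr(s)
print(len(s), len(r))        # one character, three with the quotes
print(r == "'" + s + "'")

# ascii() additionally replaces every non-ASCII character with an escape.
a = ascii(s)
print(a, len(a))
```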

In the python REPL, astral chars are not expanded to escape sequences.

>>> s='\U0001f603'
>>> s
'😃'  # Windows REPL shows two replacement boxes instead of 😃


#36698 is about astral chars in exception messages.

>>> raise Exception(s)

results in the Exception traceback, 3 Unicodedecode tracebacks, and a restart.

--
versions: +Python 3.8 -Python 3.6




[issue22742] IDLE shows traceback when printing non-BMP character

2017-06-19 Thread Terry J. Reedy

Changes by Terry J. Reedy :


--
assignee:  -> terry.reedy
components: +IDLE -Library (Lib)
versions: +Python 3.6, Python 3.7 -Python 2.7, Python 3.4, Python 3.5




[issue22742] IDLE shows traceback when printing non-BMP character

2015-12-06 Thread irdb

Changes by irdb :


--
nosy: +irdb




[issue22742] IDLE shows traceback when printing non-BMP character

2014-10-31 Thread Terry J. Reedy

Terry J. Reedy added the comment:

I think Idle should consistently display astral chars with their \U escape.  It 
sometimes does, just not always.

>>> s='\U0001f680'
>>> s
'\U0001f680'
>>> str(s)
'\U0001f680'
>>> repr(s)
'\U0001f680'
>>> print(s) # gives error above.
>>> print(str(s))  #ditto

I thought that implicit print of expression and overt print of the same 
expression were supposed to be the same.
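
A minimal sketch of the consistent \U-escape display suggested here, assuming a hypothetical helper named escape_astral that would be applied to text before it is written to the Tk widget:

```python
def escape_astral(text):
    """Replace each character above U+FFFF with its \\UXXXXXXXX escape.

    Hypothetical helper; BMP characters are passed through unchanged.
    """
    return "".join(
        ch if ord(ch) <= 0xFFFF else "\\U{:08x}".format(ord(ch))
        for ch in text
    )


print(escape_astral("launch \U0001f680 now"))
```

Unlike ascii(), this leaves non-ASCII BMP characters readable and adds no quotes, so print(s) and the implicit echo of s would differ only by the quotes that repr() adds.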

#21084 is also about this general issue.

--
nosy: +terry.reedy
stage:  -> needs patch
versions: +Python 2.7, Python 3.4, Python 3.5

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22742



[issue22742] IDLE shows traceback when printing non-BMP character

2014-10-27 Thread Alexander Belopolsky

New submission from Alexander Belopolsky:

>>> print("\N{ROCKET}")
Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    print("\N{ROCKET}")
  File "idlelib/PyShell.py", line 1352, in write
    return self.shell.write(s, self.tags)
UnicodeEncodeError: 'UCS-2' codec can't encode character '\U0001f680' in position 0: Non-BMP character not supported in Tk

Shouldn't IDLE replace non-encodable characters with \uFFFD?

I think

>>> "\N{ROCKET}"
�

is user-friendlier than the traceback.
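
The replacement proposed above could be sketched roughly as follows. The function name replace_non_bmp is an assumption made for illustration; this is not the change that was eventually merged.

```python
def replace_non_bmp(text, replacement="\ufffd"):
    """Return text with every character above U+FFFF replaced.

    Hypothetical helper: U+FFFD (REPLACEMENT CHARACTER) is itself in the
    BMP, so the result can be encoded as UCS-2.
    """
    return "".join(ch if ord(ch) <= 0xFFFF else replacement for ch in text)


# ascii() is used here only so the demo prints safely on any console.
print(ascii(replace_non_bmp("go \N{ROCKET}!")))
```

The trade-off is that the replacement is lossy: once the character has been replaced, the shell can no longer tell the user which code point was printed.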

See also #14304.

--
components: Library (Lib)
messages: 230078
nosy: belopolsky
priority: normal
severity: normal
status: open
title: IDLE shows traceback when printing non-BMP character
type: behavior
