New submission from Amaury Forgeot d'Arc <amaur...@gmail.com>:

On wide unicode builds, '\U00010000'.isprintable() returns True, and repr() 
returns the character unmodified.
Is this good behavior, given that very few fonts can display this 
character?
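
The behavior described above can be checked with a short sketch. It assumes
an interpreter where a non-BMP character is a single code point (a wide
build, or any Python 3.3+); the variable names are only for illustration.

```python
# Sketch only: assumes '\U00010000' is stored as one code point
# (wide build, or Python 3.3 and later).
ch = '\U00010000'  # LINEAR B SYLLABLE B008 A, first code point outside the BMP

# str.isprintable() decides whether repr() escapes the character.
is_printable = ch.isprintable()

# repr() leaves a printable character as-is, only adding the quotes.
repr_keeps_char = repr(ch) == "'" + ch + "'"

# ascii() always escapes non-ASCII characters, for comparison.
ascii_form = ascii(ch)

print(is_printable, repr_keeps_char, ascii_form)
```

ascii() is shown only as a contrast: it escapes the character regardless of
what isprintable() says.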

Marc-Andre Lemburg wrote:
> The "printable" property is a Python invention, not a Unicode property,
> so we do have some freedom in deciding what is printable and what
> is not.

The current implementation considers printable """all characters except 
those defined in the Unicode character database as belonging to one of the 
following categories:
  * Cc (Other, Control)
  * Cf (Other, Format)
  * Cs (Other, Surrogate)
  * Co (Other, Private Use)
  * Cn (Other, Not Assigned)
  * Zl (Separator, Line): '\u2028', LINE SEPARATOR
  * Zp (Separator, Paragraph): '\u2029', PARAGRAPH SEPARATOR
  * Zs (Separator, Space) other than ASCII space ('\x20')
"""
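
As a rough illustration of the rule quoted above, here is a pure-Python
sketch built on unicodedata.category(). The names NONPRINTABLE_CATEGORIES
and is_printable_char are made up for this example; this is not the actual
C implementation, only an approximation of the documented rule.

```python
import unicodedata

# General categories excluded by the documented rule (hypothetical name).
NONPRINTABLE_CATEGORIES = {"Cc", "Cf", "Cs", "Co", "Cn", "Zl", "Zp", "Zs"}

def is_printable_char(ch):
    """Approximate the documented rule for a single character."""
    # ASCII space is category Zs but is explicitly treated as printable.
    if ch == "\x20":
        return True
    return unicodedata.category(ch) not in NONPRINTABLE_CATEGORIES

# A few sample characters, compared with the built-in method.
samples = ["a", " ", "\n", "\u00a0", "\u2028", "\U00010000"]
results = {ch: is_printable_char(ch) for ch in samples}
for ch in samples:
    print(ascii(ch), results[ch], ch.isprintable())
```

Note that under this rule '\U00010000' counts as printable, since it is an
assigned letter (category Lo); nothing in the rule looks at the plane.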

We could also arbitrarily exclude all the non-BMP chars.
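
A sketch of what such an exclusion might look like, written in Python for
clarity. The helper names is_printable_bmp_only and escape_non_bmp are
hypothetical, and the code assumes each character is a single code point
(wide build or Python 3.3+); it is not a proposed patch.

```python
def is_printable_bmp_only(ch):
    """Printable by the current rule AND inside the BMP (U+0000..U+FFFF)."""
    return ord(ch) <= 0xFFFF and ch.isprintable()

def escape_non_bmp(text):
    """Escape every character the stricter rule rejects, keep the rest."""
    parts = []
    for ch in text:
        if is_printable_bmp_only(ch):
            parts.append(ch)
        else:
            # ascii() yields the quoted escape form; strip the quotes.
            parts.append(ascii(ch)[1:-1])
    return "".join(parts)

print(escape_non_bmp("a\U00010000b"))
```

With this variant, a string containing '\U00010000' would show the escape
sequence instead of the raw character, at the cost of also escaping non-BMP
characters that a given terminal or font could display.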

----------
components: Unicode
messages: 109520
nosy: amaury.forgeotdarc, ezio.melotti, lemburg
priority: normal
severity: normal
status: open
title: Should repr() print unicode characters outside the BMP?
type: behavior
versions: Python 3.2

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue9198>
_______________________________________