[issue41212] Emoji Unicode failing in standard release of Python 3.8.3 / tkinter 8.6.8

2020-07-11 Thread Jim Jewett


Jim Jewett  added the comment:

@Ben Griffin -- Unicode has defined astral characters for a while, but they 
were explicitly intended for rare characters, with any living languages 
intended for the basic plane.  It is only the most recent releases of Unicode 
that have broken the "most people won't need this" expectation, so it wasn't 
unreasonable for languages targeting memory-constrained devices to make astral 
support at best a compile-time option.

I've seen a draft for an upcoming spec update of an old but still-supported 
language (extended Gerber, for photoplotting machines) that "handles" this 
simply by clarifying that their Unicode support is limited to characters < 65K. 
Given that their use of Unicode is essentially limited to comments, and there 
is plenty of hardware that can't be updated ... this may well be correct.

Python itself does the right thing, and tcl can't do the right thing anyhow 
without font support ... so this may be fixed in less time than it would take 
to replace Tk/Tcl.  If you need a faster workaround, consider a 
private-use-area and private font.
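
A minimal sketch of the private-use-area idea, assuming you also have a private 
font with glyphs drawn at the chosen code points (building that font is outside 
this sketch). The class name `PuaMapper` and its methods are hypothetical, not 
part of tkinter or any library. It swaps each astral character for a BMP 
private-use code point (U+E000..U+F8FF) before the text goes to Tk, and swaps 
it back afterwards.

```python
PUA_START = 0xE000  # first code point of the BMP private-use area
PUA_END = 0xF8FF    # last code point of the BMP private-use area


class PuaMapper:
    """Hypothetical helper: remap astral characters into the BMP private-use area."""

    def __init__(self):
        self._to_pua = {}       # astral char -> private-use char
        self._next = PUA_START  # next unassigned private-use code point

    def encode(self, text):
        """Replace every character above U+FFFF with a private-use stand-in."""
        out = []
        for ch in text:
            if ord(ch) > 0xFFFF:
                if ch not in self._to_pua:
                    if self._next > PUA_END:
                        raise ValueError("private-use area exhausted")
                    self._to_pua[ch] = chr(self._next)
                    self._next += 1
                out.append(self._to_pua[ch])
            else:
                out.append(ch)
        return "".join(out)

    def decode(self, text):
        """Restore the original astral characters in text produced by encode()."""
        back = {pua: astral for astral, pua in self._to_pua.items()}
        return "".join(back.get(ch, ch) for ch in text)


mapper = PuaMapper()
safe = mapper.encode("hi \U0001f4bb")   # every character is now in the BMP
original = mapper.decode(safe)          # round-trips to the input
```

The mapping is only meaningful inside one application: the private font must 
place the right glyph at each assigned code point, and text must be decoded 
again before it leaves the program.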

--
nosy: +Jim.Jewett

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue41212] Emoji Unicode failing in standard release of Python 3.8.3 / tkinter 8.6.8

2020-07-10 Thread Terry J. Reedy

Terry J. Reedy  added the comment:

Python internally uses an encoding system that represents all unicode chars 
efficiently, including O(1) indexing.  It is not utf-8, which does not do O(1) 
indexing.
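
As a small illustration of that difference (a sketch using only built-in `str` 
methods; the variable names are made up for the example), a Python string 
indexes by code point, while the UTF-8 and UTF-16 encodings of the same text 
need more than one unit for an astral character:

```python
# U+1F4BB is an astral (non-BMP) code point.
s = "x\U0001f4bby"

# Python str: one item per code point, whatever the plane.
print(len(s))                 # 3
print(s[1] == "\U0001f4bb")   # True

# UTF-8 is variable width: the same text takes 1 + 4 + 1 bytes.
utf8 = s.encode("utf-8")
print(len(utf8))              # 6

# UTF-16 needs two 16-bit code units for the astral character.
utf16 = s.encode("utf-16-le")
print(len(utf16) // 2)        # 4
```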

There is already an issue about upgrading (separately) the Python Windows and 
macOS installers to install tcl/tk 8.6.9.
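
To see which Tcl/Tk patch level a given Python build is actually linked 
against, something like the following should work. It is a sketch: the 
function name `tcl_patchlevel` is made up, and it returns None when tkinter is 
missing or the Tcl interpreter cannot be created.

```python
def tcl_patchlevel():
    """Return the Tcl patch level as a string (e.g. '8.6.8'), or None if unavailable."""
    try:
        import tkinter
    except ImportError:
        return None  # this Python was built without tkinter
    try:
        # tkinter.Tcl() creates a Tcl interpreter without opening a Tk window.
        return tkinter.Tcl().eval("info patchlevel")
    except tkinter.TclError:
        return None


print(tcl_patchlevel())
```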

With the current 8.6.9 and probably earlier, and since an important tkinter 
patch last fall for #13153, a tkinter/tk text widget will display astral 
characters that the font in use can produce.  For example, in 3.9.0, I see the 
laptop emoji (U+1F4BB PERSONAL COMPUTER) printed in IDLE
>>> '\U0001f4bb'
'💻'
but not in the Windows Console Python REPL, which shows 'box space box box'.

However, astral characters discombobulate editing (#39126); at least on Windows, 
they are counted as 2 or 4 chars.  The difference between behavior before and 
after Serhiy's patch and between display and editing likely explains different 
reports on SO.
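
A rough way to see where a count of 2 can come from: one astral character is a 
single item in a Python string but two 16-bit code units in UTF-16. The helper 
names below are hypothetical; that the 4 corresponds to the UTF-8 byte length 
is my guess and is not stated in the thread.

```python
def utf16_units(text):
    """Number of 16-bit code units needed to store text as UTF-16."""
    return len(text.encode("utf-16-le")) // 2


def astral_count(text):
    """Number of characters in text that lie outside the BMP."""
    return sum(1 for ch in text if ord(ch) > 0xFFFF)


sample = "a\U0001f4bb"
print(len(sample))                   # 2 code points
print(utf16_units(sample))           # 3 code units: 1 for 'a', 2 for the emoji
print(len(sample.encode("utf-8")))   # 5 bytes: 1 for 'a', 4 for the emoji
```

For any string without lone surrogates, `utf16_units(text)` equals 
`len(text) + astral_count(text)`, which is the offset by which 16-bit indices 
drift from Python's.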

--
nosy: +terry.reedy
resolution:  -> third party
stage:  -> resolved
status: open -> closed




[issue41212] Emoji Unicode failing in standard release of Python 3.8.3 / tkinter 8.6.8

2020-07-06 Thread Ben Griffin

Ben Griffin  added the comment:

Wow, well if you are right, then TCL/TK is a showstopper for us, and we will 
have to consider an alternative to tkinter. 

Frankly, I am aghast that any active software would be limited to fixed width 
characters.

We moved our languages over to multiwidth (utf-8) back in 2003: most of the 
changes were restricted to a handful of string functions (strcut, strlen, 
etc.). Compiling TCL to use 4 byte chars isn’t really a solution either.

What confuses me is that there are several people on SO who are saying ‘works 
for me’.

--




[issue41212] Emoji Unicode failing in standard release of Python 3.8.3 / tkinter 8.6.8

2020-07-06 Thread E. Paine


E. Paine  added the comment:

Sorry, the point I was trying to make was that, unlike UTF-8, Tcl doesn't 
support variable length characters and they are instead fixed at 16 bits (by 
default). So, while Python and UTF-8 are perfectly happy with the emoji, unless 
Tcl is compiled with a particular build flag it will not process the character 
correctly (hence why I said it was surprising that Chip showed at all). I have 
tested on Tcl 8.6.10 and encountered the same problem described.
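
For reference, this is how a code point above U+FFFF is split into two 16-bit 
values (a surrogate pair) under UTF-16. It is a sketch of the standard 
arithmetic; the function name `to_surrogates` is made up for the example. 
Software that assumes every character fits in 16 bits sees the two halves as 
separate characters.

```python
def to_surrogates(code_point):
    """Split an astral code point into its UTF-16 (high, low) surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not in an astral plane")
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


high, low = to_surrogates(0x1F4BB)
print(hex(high), hex(low))   # 0xd83d 0xdcbb
```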

A further quote (granted, also old, but I cannot find anything to suggest this 
behaviour has been changed):
"Tcl can (currently) only represent characters within the Basic Multilingual 
Plane of Unicode, so there's no way that you can even feed an U+1 into 
encoding convertto :-(. Fixing that is non-trivial, since some parts of Tcl 
(the C library) require a representation of strings where all characters take 
up the same number of bytes. It is possible to compile Tcl with that "number of 
bytes" set to 4 (meaning 32 bits per character), but it's rather wasteful, and 
has been reported not entirely compatible with Tk." 
[https://wiki.tcl-lang.org/page/utf-8]

If I can find the build flag mentioned, I will post it here for future 
reference.

--




[issue41212] Emoji Unicode failing in standard release of Python 3.8.3 / tkinter 8.6.8

2020-07-05 Thread Ben Griffin

Ben Griffin  added the comment:

Erm, I don’t rightly know how to parse epaine’s comment, as it seems to relate 
to a version of Unicode from over a decade ago, and a wiki page that was 
written 12 years ago.

IIRC Python 3 was (IMO rightly) developed to default to UTF-8, and according to 
a much more recently edited article (https://en.m.wikipedia.org/wiki/UTF-8), a 
normative UTF-8 parser can handle any of the million+ Unicode characters, 
including emoji.

As I pointed out in the bug report, and as mentioned by contributors on SO, TCL 
seems to have fixed these issues by 8.6.10.

If epaine is correct and TCL CANTFIX/WONTFIX normative utf-8 - then maybe it’s 
time to drop the strong relationship that Python has with tkinter. However, I’m 
pretty sure that there is no need for such a drastic measure: the UTF-8 
algorithm isn’t that complex.

--




[issue41212] Emoji Unicode failing in standard release of Python 3.8.3 / tkinter 8.6.8

2020-07-05 Thread E. Paine


E. Paine  added the comment:

This is a Tcl issue, as Tcl is designed for characters up to 16 bits. The fact 
that Chip is showing at all is very surprising, though any character outside of 
this 16-bit range should be considered unpredictable.

"The majority of characters used in the human languages of the world have 
character codes between 0 and 65535, and are known as the Basic Multilingual 
Plane (BMP). Currently a default build of Tcl is only capable of handling 
these characters, but work is underway to change that, and workarounds 
requiring non-default build-time configuration options exist." 
[https://wiki.tcl-lang.org/page/Unicode]
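
If the Tcl build in use is limited to the BMP, one defensive option on the 
Python side is to detect characters above U+FFFF, and optionally replace them, 
before handing text to Tk. This is only a sketch: `has_astral` and `bmp_only` 
are hypothetical helpers, and replacing with U+FFFD loses the original 
character.

```python
def has_astral(text):
    """True if text contains any character outside the Basic Multilingual Plane."""
    return any(ord(ch) > 0xFFFF for ch in text)


def bmp_only(text, replacement="\ufffd"):
    """Return text with every non-BMP character swapped for replacement."""
    return "".join(ch if ord(ch) <= 0xFFFF else replacement for ch in text)


label = "ok \U0001f4bb"
print(has_astral(label))         # True
print(has_astral("plain text"))  # False
cleaned = bmp_only(label)        # the emoji becomes U+FFFD
```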

--
nosy: +epaine




[issue41212] Emoji Unicode failing in standard release of Python 3.8.3 / tkinter 8.6.8

2020-07-05 Thread Ned Deily


Change by Ned Deily :


--
assignee:  -> ned.deily




[issue41212] Emoji Unicode failing in standard release of Python 3.8.3 / tkinter 8.6.8

2020-07-05 Thread SilentGhost


Change by SilentGhost :


--
components: +macOS
nosy: +ned.deily, ronaldoussoren




[issue41212] Emoji Unicode failing in standard release of Python 3.8.3 / tkinter 8.6.8

2020-07-05 Thread Ben Griffin

New submission from Ben Griffin :

https://stackoverflow.com/questions/62713741/tkinter-and-32-bit-unicode-duplicating-any-fix

Emoji are doubling up when using canvas.create_text().
This is reported to work on tcl/tk 8.6.10, but there’s no way to upgrade tcl/tk 
using the standard installs from the python.org site.

--
components: Tkinter
files: Emoji.py.txt
messages: 373019
nosy: Ben Griffin
priority: normal
severity: normal
status: open
title: Emoji Unicode failing in standard release of Python 3.8.3 / tkinter 8.6.8
type: behavior
versions: Python 3.8
Added file: https://bugs.python.org/file49297/Emoji.py.txt
