Terry J. Reedy added the comment:
Byte 0, not byte 1, is the start byte, and it should be F0, as in output below.
However, I now see "invalid continuation byte".
In 2.7.5,
# -*- coding: utf-8 -*-
s = b'𐒢' # output is the same if either of the following lines is uncommented
#s = u'𐒢'.encode('utf-8') # '𐒢' pasted in from 1st post
#s = u'\U000104a2'.encode('utf-8')
print(len(s))
for c in s: print(ord(c), hex(ord(c)))
>>>
4
(240, '0xf0')
(144, '0x90')
(146, '0x92')
(162, '0xa2')
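For comparison, a minimal sketch of the same check written for Python 3. It assumes only the escape form of the character from the commented-out line above. Iterating over a bytes object yields ints in 3.x, so ord() is not needed. It should report the same four byte values, F0 90 92 A2.

```python
# Python 3 sketch of the 2.7 check above.
# Iterating over bytes yields ints in 3.x, so no ord() call is needed.
s = '\U000104a2'.encode('utf-8')
print(len(s))
for b in s:
    print(b, hex(b))
```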
I have no idea how the second pasted byte becomes ED in 3.x.
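One way ED bytes can arise for this character, offered only as a guess and not confirmed anywhere in this report: if U+104A2 is split into a UTF-16 surrogate pair and each surrogate is encoded to UTF-8 separately, both resulting 3-byte sequences start with ED. The sketch below (Python 3) computes the pair by hand and uses the standard 'surrogatepass' error handler to encode it. The variable names are illustrative.

```python
ch = '\U000104a2'

# Split the code point into a UTF-16 surrogate pair by hand.
offset = ord(ch) - 0x10000
high = 0xD800 + (offset >> 10)    # high (lead) surrogate
low = 0xDC00 + (offset & 0x3FF)   # low (trail) surrogate
pair = chr(high) + chr(low)

# 'surrogatepass' lets the utf-8 codec encode the surrogates individually.
bad = pair.encode('utf-8', 'surrogatepass')
print(' '.join('%02X' % b for b in bad))

# A strict utf-8 decode rejects those bytes.
try:
    bad.decode('utf-8')
    strict_ok = True
except UnicodeDecodeError as exc:
    strict_ok = False
    print(exc.reason)
```

Whether this is what happens to the pasted text in 3.x is an open question. The sketch only shows that surrogate handling is one place where an ED byte can come from.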
Attempting to open the file in 3.x results in a broken* 'Untitled' edit window
and the following error message in the console.
_tkinter.TclError: character U+104a2 is above the range (U+0000-U+FFFF) allowed by Tcl
* Attempting to close the window either immediately or after entering text
results in
AttributeError: 'PyShellEditorWindow' object has no attribute 'extensions'
I have to close the initial Python process to get rid of the window.
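Since the TclError is triggered by a character above U+FFFF, a small check like the following can locate such characters in a string before it is handed to Tk. This is a hypothetical helper, not part of IDLE or tkinter, and it assumes Python 3.3 or later, where strings are indexed by code point.

```python
def non_bmp_chars(text):
    """Return (index, code point) pairs for characters above U+FFFF."""
    return [(i, ord(c)) for i, c in enumerate(text) if ord(c) > 0xFFFF]

# Example: a source line containing the character from this report.
sample = "s = '\U000104a2'"
for index, cp in non_bmp_chars(sample):
    print('index %d: U+%04X' % (index, cp))
```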
----------
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue13153>
_______________________________________