Xiang Zhang added the comment:
The table on Wikipedia is somewhat complex. I found
ftp://ftp.software.ibm.com/software/globalization/documents/gb18030m.pdf and
the table in it is the same as the one at
https://pan.baidu.com/share/link?shareid=2606985291&uk=3341026630 (except for
0x80), but in English. I agree with Ma Lin that byte sequences like
b'\x81\x30\xFF\x30' are invalid.
With the current implementation, such a sequence does not round-trip:
>>> invalid = b'\x81\x30\xff\x30'
>>> invalid.decode('gb18030').encode('gb18030') == invalid
False
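
A minimal sketch of why this sequence is out of range, assuming the four-byte
layout given in the GB18030 tables (first and third bytes in 0x81..0xFE, second
and fourth bytes in 0x30..0x39). The function name is hypothetical, and this is
only a structural range check, not how the codec itself is implemented:

```python
def is_wellformed_four_byte(seq: bytes) -> bool:
    """Range-check a GB18030 four-byte sequence (hypothetical helper).

    Only checks the byte ranges; it does not check whether the
    sequence maps to an assigned code point.
    """
    if len(seq) != 4:
        return False
    b1, b2, b3, b4 = seq
    return (
        0x81 <= b1 <= 0xFE      # first byte: lead byte range
        and 0x30 <= b2 <= 0x39  # second byte: ASCII digit range
        and 0x81 <= b3 <= 0xFE  # third byte: same range as the first
        and 0x30 <= b4 <= 0x39  # fourth byte: ASCII digit range
    )


# The third byte 0xFF is above 0xFE, so the check fails.
print(is_wellformed_four_byte(b'\x81\x30\xff\x30'))  # False
# Same sequence with an in-range third byte passes.
print(is_wellformed_four_byte(b'\x81\x30\x81\x30'))  # True
```

Under this check the decoder would be expected to reject b'\x81\x30\xff\x30'
rather than decode it to something that encodes back differently.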
----------
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue29990>
_______________________________________