On Sun, Oct 27, 2019, at 03:39, Andrew Barnert via Python-ideas wrote:
> (Actually, IIRC, one of the two has a str type that, despite being 2.x, 
> is unicode rather than bytes, but with some extra undocumented 
> functionality to smuggle bytes around in a str and have it sometimes 
> work.)

I do like the way GNU Emacs represents strings: abstractly, a string can 
contain any character, or any raw byte > 127 kept distinct from every character. 
Concretely, IIRC, strings are stored either as pure byte strings or as UTF-8, with 
"bytes > 127" represented as the extended UTF-8 encodings of code points 0x3FFF80 
through 0x3FFFFF (values between 0x110000 and 0x3FFF7F are used for other 
purposes).
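
To make the scheme concrete, here is a rough Python sketch of it as I described it above, not Emacs's actual implementation (which may well differ in detail). The names byte_to_code_point, code_point_to_byte and extended_utf8 are made up for illustration; "extended UTF-8" here just means the original UTF-8 bit layout carried on past 0x10FFFF into a 5-byte form.

```python
# Hypothetical sketch of the scheme described above; not Emacs source code.
RAW_BYTE_BASE = 0x3FFF00  # chosen so that byte 0x80 lands on 0x3FFF80


def byte_to_code_point(b: int) -> int:
    """Map a raw byte 0x80..0xFF to a stand-in code point 0x3FFF80..0x3FFFFF."""
    if not 0x80 <= b <= 0xFF:
        raise ValueError("only bytes > 127 get a stand-in code point")
    return RAW_BYTE_BASE + b


def code_point_to_byte(cp: int) -> int:
    """Inverse of byte_to_code_point."""
    if not 0x3FFF80 <= cp <= 0x3FFFFF:
        raise ValueError("not a raw-byte code point")
    return cp - RAW_BYTE_BASE


def extended_utf8(cp: int) -> bytes:
    """Encode cp with the UTF-8 bit layout, allowing a 5-byte form."""
    if cp < 0:
        raise ValueError("negative code point")
    if cp < 0x80:
        return bytes([cp])
    if cp < 0x800:
        lead, tail = 0xC0, 1
    elif cp < 0x10000:
        lead, tail = 0xE0, 2
    elif cp < 0x200000:
        lead, tail = 0xF0, 3
    elif cp < 0x4000000:
        lead, tail = 0xF8, 4  # beyond standard UTF-8 (RFC 3629 stops at 4 bytes)
    else:
        raise ValueError("code point too large for this sketch")
    out = [lead | (cp >> (6 * tail))]
    # Continuation bytes carry 6 bits each, most significant group first.
    for shift in range(6 * (tail - 1), -1, -6):
        out.append(0x80 | ((cp >> shift) & 0x3F))
    return bytes(out)


if __name__ == "__main__":
    cp = byte_to_code_point(0x80)
    print(hex(cp), extended_utf8(cp).hex())
```

For ordinary characters this agrees with Python's own encoder, e.g. extended_utf8(ord("é")) == "é".encode("utf-8"); only the stand-in code points for raw bytes need the 5-byte form.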
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/GLM57Y6TC2HR4BEGXA6UPL44BULIIDTH/
Code of Conduct: http://python.org/psf/codeofconduct/
