On Mon, Jan 13, 2014 at 4:57 AM, Juraj Sukop <juraj.su...@gmail.com> wrote:
> On Sun, Jan 12, 2014 at 6:22 PM, Steven D'Aprano <st...@pearwood.info>
> wrote:
>> First, "utf16_string" confuses me. What is it? If it is a Unicode
>> string, i.e.:
>
> It is a Unicode string which happens to contain code points outside U+00FF
> (as with the TTF example above), so that it triggers the (at least) 2-bytes
> memory representation in CPython 3.3+. I agree, I chose the variable name
> poorly, my bad.

When I'm talking about Unicode strings based on their maximum codepoint,
I usually call them something like "ASCII string", "Latin-1 string",
"BMP string", and "SMP string". Still not wholly accurate, but less
confusing than naming an encoding... oh wait, two of those _are_
encodings :|

But you could use "narrow string" for the first two. Or
"string(0..127)" for ASCII, "string(0..255)" for Latin-1, and then for
consistency "string(0..65535)" and "string(0..1114111)" for the
others, except that I doubt that'd be helpful :)

At any rate, "BMP" as a term for "includes characters outside of
Latin-1 but all on the Basic Multilingual Plane" would probably be
close enough to get away with.

ChrisA
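
To make the naming above concrete, here is a minimal sketch that labels a
string by its highest code point. The function name classify() is made up
for illustration and is not part of any library, and "SMP string" is used
loosely here for anything beyond the BMP, as in the text above. It only
looks at code points; it does not inspect CPython's actual internal
representation.

```python
def classify(s: str) -> str:
    """Return a rough label for s based on its highest code point.

    Hypothetical helper for illustration only. The thresholds are the
    ranges mentioned above: 0..127, 0..255, 0..65535, 0..1114111.
    """
    # An empty string has no code points; treat it as ASCII.
    top = max(map(ord, s), default=0)
    if top <= 0x7F:
        return "ASCII string"
    if top <= 0xFF:
        return "Latin-1 string"
    if top <= 0xFFFF:
        return "BMP string"
    # Loose usage: covers every plane above the BMP, not just plane 1.
    return "SMP string"


if __name__ == "__main__":
    for sample in ("hello", "caf\u00e9", "\u0142\u00f3d\u017a", "\U0001F600"):
        print(ascii(sample), "->", classify(sample))
```

Under this sketch, a string like the one Juraj describes (code points
outside U+00FF but none above U+FFFF) comes out as a "BMP string".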