On Mon, Jan 13, 2014 at 4:57 AM, Juraj Sukop <juraj.su...@gmail.com> wrote:
> On Sun, Jan 12, 2014 at 6:22 PM, Steven D'Aprano <st...@pearwood.info>
> wrote:
>> First, "utf16_string" confuses me. What is it? If it is a Unicode
>> string, i.e.:
>
> It is a Unicode string which happens to contain code points outside U+00FF
> (as with the TTF example above), so that it triggers the (at least) 2-bytes
> memory representation in CPython 3.3+. I agree, I chose the variable name
> poorly, my bad.

When I'm talking about Unicode strings based on their maximum
codepoint, I usually call them something like "ASCII string", "Latin-1
string", "BMP string", and "SMP string". Still not wholly accurate,
but less confusing than naming an encoding... oh wait, two of those
_are_ encodings :| But you could use "narrow string" for the first
two. Or "string(0..127)" for ASCII, "string(0..255)" for Latin-1, and
then for consistency "string(0..65535)" and "string(0..1114111)" for
the others, except that I doubt that'd be helpful :) At any rate,
"BMP" as a term for "includes characters outside of Latin-1 but all on
the Basic Multilingual Plane" would probably be close enough to get
away with.
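
To make the labelling above concrete, here is a rough sketch that picks a name from a string's maximum code point. The function names `classify` and `bytes_per_char` are made up for illustration and are not part of any library. The width measurement assumes CPython 3.3+ (PEP 393) and relies on sys.getsizeof(), so treat it as an implementation detail rather than a language guarantee.

```python
import sys


def classify(s):
    """Label a str by its highest code point, using the names from this thread.

    "SMP" is used loosely here for anything beyond the BMP, which (as noted
    above) is not wholly accurate.
    """
    top = max(map(ord, s), default=0)  # empty string counts as ASCII
    if top <= 0x7F:
        return "ASCII"
    if top <= 0xFF:
        return "Latin-1"
    if top <= 0xFFFF:
        return "BMP"
    return "SMP"


def bytes_per_char(s):
    """Estimate per-character storage on CPython 3.3+ (assumption: PEP 393).

    Compares two freshly built repetitions of s so that the fixed object
    header cancels out. Other implementations may report something else.
    """
    if not s:
        raise ValueError("need a non-empty string")
    return (sys.getsizeof(s * 3) - sys.getsizeof(s * 2)) // len(s)


if __name__ == "__main__":
    for sample in ("spam", "caf\u00e9", "\u0141\u00f3d\u017a", "\U0001F40D"):
        print(ascii(sample), classify(sample), bytes_per_char(sample))
```

On CPython the second column should line up with 1, 1, 2 and 4 bytes per character respectively, which is the "at least 2-bytes" case Juraj was describing for the BMP string.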

ChrisA
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev