On Wed, Aug 24, 2011 at 3:06 AM, Terry Reedy <tjre...@udel.edu> wrote:
> Excuse me for believing the fine 3.2 manual that says
> "Strings contain Unicode characters." (And to a naive reader, that implies
> that string iteration and indexing should produce Unicode characters.)

The naive reader also doesn't know the difference between characters,
code points and code units. It's the advanced, Unicode-aware reader
who is confused by this phrase in the docs. It should say code units;
or perhaps code units for narrow builds and code points for wide
builds. With PEP 393 we can unconditionally say code points, which is
much better. We should try to remove our use of "characters" -- or
else we should *define* our use of the term "characters" as "what the
Unicode standard calls code points".

-- 
--Guido van Rossum (python.org/~guido)
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Reply via email to