Re: [Python-Dev] len(chr(i)) = 2?

R. David Murray Sun, 21 Nov 2010 11:32:26 -0800

On Sun, 21 Nov 2010 10:17:57 -0800, Raymond Hettinger 
<raymond.hettin...@gmail.com> wrote:
> On Nov 21, 2010, at 9:38 AM, R. David Murray wrote:
> > I'm sorry, but I have to disagree.  As a relative unicode ignoramus,
> > "UCS-2" and "UCS-4" convey almost no information to me, and the bits I
> > have heard about them on this list have only confused me.


[...]

> From a user's point of view, the actual encoding or encoding name
> doesn't matter much.  They just need to be able to predict the relevant
> behaviors (memory consumption and len/slicing behavior).
> 
> For the narrow build, that behavior is:
> - Characters in the BMP consume 2 bytes and count as one char
>   for purposes of len and slicing.
> - Characters above the BMP consume 4 bytes and count as
>   two distinct chars for purposes of len and slicing.
> 
> For wide builds, all characters are 4 bytes and count as a single
> char for len and slicing.
> 
> Hope this helps,

Thank you, that nicely summarizes and confirms what I thought I knew about
wide versus narrow build.  And as I said, using the names UCS-2/UCS-4
would only *confuse* that understanding, not clarify it.
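
For anyone who wants to check those rules without having both builds at
hand, here is a minimal sketch. It assumes Python 3.3 or later, where the
narrow/wide distinction no longer exists, so it only simulates the two
builds by counting UTF-16 and UTF-32 code units. The helper names
(narrow_len, narrow_bytes, wide_len, wide_bytes) are made up for
illustration, and the byte counts cover the character data only, not the
string object's overhead.

```python
# Simulate narrow/wide build behavior by counting encoded code units.
# All helper names here are hypothetical, not part of any Python API.

def narrow_len(s):
    """len() as a narrow build would report it: UTF-16 code units."""
    return len(s.encode("utf-16-le")) // 2

def narrow_bytes(s):
    """Bytes of character data on a narrow build (2 per code unit)."""
    return len(s.encode("utf-16-le"))

def wide_len(s):
    """len() as a wide build would report it: one unit per code point."""
    return len(s.encode("utf-32-le")) // 4

def wide_bytes(s):
    """Bytes of character data on a wide build (4 per code point)."""
    return len(s.encode("utf-32-le"))

bmp = "\u20ac"          # EURO SIGN, inside the BMP
astral = "\U0001d11e"   # MUSICAL SYMBOL G CLEF, above the BMP

for name, ch in (("BMP", bmp), ("above BMP", astral)):
    print(name, narrow_len(ch), narrow_bytes(ch), wide_len(ch), wide_bytes(ch))
```

The character above the BMP is the only case where the two simulated
builds disagree about length, which matches the summary quoted above.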

--
R. David Murray                                      www.bitdance.com
