> I think you want to use codePointCount() to count the Unicode code points.
> length() returns Unicode code units.
> 
> As http://java.sun.com/j2se/1.5.0/docs/api/java/lang/Character.html explains:
> 
> In the J2SE API documentation, Unicode code point is used for character
> values in the range between U+0000 and U+10FFFF, and Unicode code unit is
> used for 16-bit char values that are code units of the UTF-16 encoding.

So you would like to contribute a function codePointCount to Python's
standard library? Go ahead.
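
A minimal sketch of what such a function might look like, for illustration only. The names code_point_count and utf16_code_unit_count are hypothetical and are not part of Python's standard library. The sketch assumes Python 3.3 or later, where a str is a sequence of code points and len() already counts them; on older narrow builds, len() counted UTF-16 code units instead, which is the distinction the quoted Java documentation draws.

```python
def utf16_code_unit_count(s: str) -> int:
    """Count UTF-16 code units (what Java's String.length() reports)."""
    # Each UTF-16 code unit is 2 bytes; "utf-16-le" avoids a leading BOM.
    # Assumption: s contains no lone surrogates, which would fail to encode.
    return len(s.encode("utf-16-le")) // 2


def code_point_count(s: str) -> int:
    """Count Unicode code points (what Java's codePointCount() reports)."""
    # Assumption: Python 3.3+, where str is indexed by code point.
    return len(s)


# 'a' followed by U+1D11E MUSICAL SYMBOL G CLEF, which lies outside the BMP
# and therefore needs a surrogate pair (two code units) in UTF-16.
text = "a\U0001D11E"
units = utf16_code_unit_count(text)
points = code_point_count(text)
print(units, points)
```

The two counts differ only when the string contains characters above U+FFFF.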

Regards,
Martin

_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev