On 9/8/2011 6:15 PM, [email protected] wrote:
> Oops, forgot to add the link for the gory details for Java and
> 2 byte unicode:

http://java.sun.com/developer/technicalArticles/Intl/Supplementary/

The article is dated 2004. Basically, they considered several options, tried out four, and ended up keeping char[] (and the other char sequences) as UTF-16, with char = one 16-bit code unit. For low-level manipulation of code points, they added int-based (32-bit) methods to the Character class, such as Character.isLetter(int).
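To make the code unit versus code point distinction concrete, here is a minimal sketch, assuming Java 5 or later. The class name SupplementaryDemo and the sample string are my own illustration, not taken from the article; the String and Character methods are the standard ones.

```java
// Hypothetical demo class; only the java.lang methods it calls are real API.
class SupplementaryDemo {
    // 'a', then U+1D50A (a character outside the BMP, stored as the
    // surrogate pair D835 DD0A), then 'b'.
    static final String TEXT = "a\uD835\uDD0Ab";

    // Number of 16-bit char code units in the string.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points in the string.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        System.out.println(codeUnits(TEXT));   // 4
        System.out.println(codePoints(TEXT));  // 3

        // An int holds the full code point; a char holds only half of the pair.
        int cp = TEXT.codePointAt(1);
        System.out.println(Integer.toHexString(cp));   // 1d50a
        System.out.println(Character.charCount(cp));   // 2

        // The int overloads on Character work on whole code points.
        System.out.println(Character.isLetter(cp));
        System.out.println(Character.isHighSurrogate(TEXT.charAt(1)));
    }
}
```

So length() and charAt() keep their old code-unit meaning, and code-point-aware code has to opt in through the int-based methods.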

I did not see the indexing problem mentioned. I get the impression that they encourage sequence forward-backward iteration (cursor-based access) rather than random-access indexing.
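As a rough illustration of what I mean by cursor-based access, here is a sketch, again assuming Java 5 or later. The class and method names (CodePointCursor, forward, backward, nthCodePoint) are hypothetical; the article may present the idiom differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; only the java.lang and java.util calls are real API.
class CodePointCursor {
    // Walk forward, advancing the cursor by 1 or 2 code units per code point.
    static List<Integer> forward(String s) {
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            out.add(cp);
            i += Character.charCount(cp);
        }
        return out;
    }

    // Walk backward from the end, using the code point that ends before the cursor.
    static List<Integer> backward(String s) {
        List<Integer> out = new ArrayList<>();
        for (int i = s.length(); i > 0; ) {
            int cp = s.codePointBefore(i);
            out.add(cp);
            i -= Character.charCount(cp);
        }
        return out;
    }

    // "Indexing" by code point: offsetByCodePoints has to scan from the start,
    // so this is linear in n rather than constant time.
    static int nthCodePoint(String s, int n) {
        return s.codePointAt(s.offsetByCodePoints(0, n));
    }

    public static void main(String[] args) {
        String text = "a\uD835\uDD0Ab";
        System.out.println(forward(text));
        System.out.println(backward(text));
        System.out.println(Integer.toHexString(nthCodePoint(text, 2)));  // 62
        System.out.println(Integer.toHexString(text.charAt(2)));         // dd0a
    }
}
```

The last two lines show the indexing problem: charAt(2) is constant time but lands on the second half of a surrogate pair, while the code-point lookup returns 'b' only after scanning.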

--
Terry Jan Reedy

_______________________________________________
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
