On [20080703 17:03], Guido van Rossum ([EMAIL PROTECTED]) wrote:
>I don't see an answer there to the question of whether the length()
>method of a Java String object containing a single surrogate pair
>returns 1 or 2; I suspect it returns 2.

As http://java.sun.com/j2se/1.5.0/docs/api/java/lang/CharSequence.html#length()
states:

  int length()

  Returns the length of this character sequence. The length is the number
  of 16-bit chars in the sequence.

So a String containing a single surrogate pair has a length() of 2. But
since Java switched to full UTF-16 support in 1.5.0, they extended the API
with new methods rather than change the existing ones, which had probably
become too ingrained. E.g. codePointCount():
http://java.sun.com/j2se/1.5.0/docs/api/java/lang/Character.html#codePointCount(char[],%20int,%20int)

>The one thing that may be missing from Python is things like
>interpretation of surrogates by functions like isalpha() and I'm okay
>with adding that (since those have to loop over the entire string
>anyway).

Those would be welcome already, yes. I'll see if I can help out.

-- 
Jeroen Ruigrok van der Werven <asmodai(-at-)in-nomine.org> / asmodai
イェルーン ラウフロック ヴァン デル ウェルヴェン
http://www.in-nomine.org/ | http://www.rangaku.org/ | GPG: 2EAC625B
Fallen into ever-mourn, with these wings so torn, after your day my dawn...
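
To make the difference between counting 16-bit chars (as Java's length()
does) and counting code points (as codePointCount() does) concrete, here is
a minimal sketch in Python. It assumes Python 3.3 or later, where len() on a
str counts code points. The helper names utf16_length, code_point_count and
combine_surrogates are made up for illustration and are not part of any
library.

```python
def utf16_length(text: str) -> int:
    """Count UTF-16 code units, the quantity Java's length() reports."""
    # Each UTF-16 code unit is 2 bytes; "utf-16-le" adds no byte-order mark.
    return len(text.encode("utf-16-le")) // 2


def code_point_count(text: str) -> int:
    """Count code points, comparable to Java's codePointCount().

    Assumes Python 3.3+, where a str is a sequence of code points.
    """
    return len(text)


def combine_surrogates(high: int, low: int) -> str:
    """Turn a UTF-16 surrogate pair into the character it encodes."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid high/low surrogate pair")
    return chr(0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00))


# U+10400 DESERET CAPITAL LETTER LONG I lies outside the BMP,
# so UTF-16 stores it as the surrogate pair D801 DC00.
deseret = "\U00010400"

print(utf16_length(deseret))       # number of 16-bit units
print(code_point_count(deseret))   # number of code points
print(combine_surrogates(0xD801, 0xDC00).isalpha())
```

A surrogate-aware isalpha() of the kind mentioned in the quoted text would
need to do something like combine_surrogates() on each pair before
consulting the character database, rather than testing the two halves
separately.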