Re: Unicode sorting

Dar Scott Thu, 01 Jun 2006 16:21:58 -0700

Wow!  Great news for sorting Unicode!

On May 30, 2006, at 5:08 PM, Devin Asay wrote:

I got your code to work by making some simple changes in the sortCodeFromRussian function:

Devin, I've been processing some bits of UTF-8, and something dawned on me that is probably known by the Unicode experts.

**** A lexical byte sort of well-formed UTF-8 will result in a Unicode code point sort! ****
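A quick way to check that claim, sketched in Python rather than Transcript (Python compares strings code point by code point, so it makes a handy reference). The word list is made up for illustration and includes Cyrillic, the euro sign, and a character outside the BMP:

```python
# Sort the same strings two ways:
#   1. by code point (Python compares str values code point by code point)
#   2. by the raw bytes of their UTF-8 encoding
words = ["я", "ё", "а", "z", "Ё", "€", "\U0001F600", "ab", "a", "яя"]

by_code_point = sorted(words)
by_utf8_bytes = sorted(words, key=lambda s: s.encode("utf-8"))

# Well-formed UTF-8 keeps code point order, so the two results match.
print(by_code_point == by_utf8_bytes)  # True
```

This only holds for well-formed UTF-8. A byte sort of UTF-16 does not have the same property.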

That avoids the NUL problem in sort. That means that russianLex() can return the UTF-8 of the string with your character conversions.

I think the replace command will work with UTF-8, so you can even avoid a character loop. All you need is 34 replaces and then a return. OK, that might actually be slower than a character loop.
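Here is a rough sketch of the idea in Python, not Transcript. The name russian_lex and the mapping table are my own stand-ins; the actual conversions in Devin's function are not shown in this message, so treat the table as an assumption. It remaps each letter to a code point that rises with its position in the Russian alphabet and returns the UTF-8 bytes as the sort key:

```python
# Hypothetical stand-in for russianLex(); the real conversion table
# is not shown in this thread.
ALPHABET = "абвгдеёжзийклмнопрстуфхцчшщъыьэюя"  # 33 letters, dictionary order

# Map each letter to a code point that rises with its alphabet position.
# U+0430 is just a convenient base; the result is a sort key, not readable text.
TABLE = {ord(ch): 0x0430 + i for i, ch in enumerate(ALPHABET)}

def russian_lex(text: str) -> bytes:
    """Return a UTF-8 sort key: lowercase, remap the letters, encode."""
    return text.lower().translate(TABLE).encode("utf-8")

words = ["ёлка", "жук", "ель", "яблоко", "еда"]

# A plain code point sort puts "ёлка" last, because ё (U+0451) follows я (U+044F).
print(sorted(words))
# With the remapped key, ё sorts right after е.
print(sorted(words, key=russian_lex))
```

The single translate call plays the role of the 34 replaces. Whether a table lookup or a run of replaces is faster in Revolution is something I have not measured.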


Dar
Unicode Sophomore


_______________________________________________
use-revolution mailing list
use-revolution@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-revolution
