AAAAARRRRGH! Disregard the previous post. I neglected to change the function call from the old sort function to the renamed new sort function. No wonder it was working!

I'll fix it and let you know how it REALLY works.

Devin
On Jun 1, 2006, at 5:21 PM, Dar Scott wrote:

Wow!  Great news for sorting Unicode!

On May 30, 2006, at 5:08 PM, Devin Asay wrote:

I got your code to work by making some simple changes in the sortCodeFromRussian function:

Devin, I've been processing some bits of UTF-8, and something dawned on me that is probably known by the Unicode experts.

***** A lexical byte sort of well-formed UTF-8 will result in a Unicode code point sort! *****

That avoids the NUL problem in sort. That means that russianLex() can return the UTF-8 of the string with your character conversions.
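A quick way to check the byte-sort claim is a small Python sketch (Python rather than Revolution; the word list is made up for the test, and the "NUL problem" is presumed here to mean the zero bytes that UTF-16 text contains):

```python
# Sample strings chosen for illustration: ASCII, Cyrillic, U+FFFF, and an
# emoji that lies outside the Basic Multilingual Plane.
words = ["яблоко", "Ёж", "abc", "ёлка", "дом", "\U0001F600", "zebra", "\uffff"]

# Sort by the list of code points in each string.
by_code_point = sorted(words, key=lambda s: [ord(ch) for ch in s])

# Sort by the raw UTF-8 bytes of each string.
by_utf8_bytes = sorted(words, key=lambda s: s.encode("utf-8"))

# Both sorts give the same order.
assert by_utf8_bytes == by_code_point

# UTF-8 only produces a zero byte for U+0000 itself, so none appear here,
# while UTF-16 puts a zero byte in every ASCII character.
assert all(0 not in s.encode("utf-8") for s in words)
assert 0 in "abc".encode("utf-16-be")

# UTF-16 does not share the sorting property: the emoji's surrogate pair
# (D83D DE00) sorts before U+FFFF (FFFF) although its code point is larger.
by_utf16_bytes = sorted(words, key=lambda s: s.encode("utf-16-be"))
assert by_utf16_bytes != by_code_point
```

The UTF-16 comparison is only there for contrast; the point is that a plain byte-wise sort is enough once the keys are UTF-8.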

I think the replace command will work with UTF-8, so you can even avoid a character loop. All you need is 34 replaces and then a return. OK, that might actually be slower than a character loop.
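Here is a rough Python sketch of that idea, not the actual russianLex() code. The name russian_lex, the private-use target code points starting at U+E000, and the lowercasing step are all assumptions made for the example; the real character conversions are whatever sortCodeFromRussian uses. A single translate() pass stands in for the character loop or the chain of replaces.

```python
# The 33 letters of the Russian alphabet in dictionary order, with ё in place.
ALPHABET = "абвгдеёжзийклмнопрстуфхцчшщъыьэюя"

# Assumption: remap each letter to a private-use code point (U+E000 onward)
# so that code point order matches alphabet order.
TABLE = {ord(ch): 0xE000 + i for i, ch in enumerate(ALPHABET)}

def russian_lex(text):
    """Hypothetical sort key: remapped text, returned as UTF-8 bytes."""
    return text.lower().translate(TABLE).encode("utf-8")

words = ["яма", "ёлка", "еда", "жук"]

# Plain code point order puts ё (U+0451) after я (U+044F).
plain = sorted(words)

# Sorting on the UTF-8 key puts ё between е and ж.
fixed = sorted(words, key=russian_lex)
```

With these four words, plain comes out as еда, жук, яма, ёлка, while fixed comes out as еда, ёлка, жук, яма.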

Dar
Unicode Sophomore


_______________________________________________
use-revolution mailing list
[email protected]
Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
http://lists.runrev.com/mailman/listinfo/use-revolution


Devin Asay
Humanities Technology and Research Support Center
Brigham Young University
