On 17 Nov 2009, at 5:52pm, Igor Tandetnik wrote:

> But for your goals, it has to be sortable, right? In a proper Unicode 
> collation, U+0041 U+0301 would behave quite differently from U+0301 U+0041. 
> Consider "A ' E" (where ' stands for a combining acute accent). In most 
> locales, this would sort between AE and BE. Now, if we reverse it naively, 
> we'll end up with "E ' A", with the accent now attached to E and not A. The 
> result would sort between EA and FA, rather than between EA and EB as you 
> would probably want.

Obviously, your routine to reverse a string must be Unicode-aware.  First split 
the string into characters, then reassemble them in reverse order.  So it will 
have to understand what 'character' means in Unicode terms (a base character 
plus any combining marks that follow it) and apply the Unicode rules for 
combining characters.  The only way to do this properly is to find a Unicode 
library licensed along similar lines to SQLite and integrate it.  If one doesn't 
exist, one needs to be written.  This is not a trivial task.
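
To make the split-then-reassemble idea concrete, here is a rough sketch in 
Python, using only its standard unicodedata module.  It is an illustration, not 
something SQLite provides: the function names are made up for this example, and 
it only glues combining marks (non-zero canonical combining class) onto the 
preceding character.  A proper implementation would follow the full Unicode 
text segmentation rules, which also cover things this sketch ignores.

```python
import unicodedata


def split_combining_sequences(text):
    """Split text into units of one base character plus trailing combining marks.

    Hypothetical helper for illustration: this approximates "characters" by
    attaching any code point with a non-zero canonical combining class to the
    unit before it. It is not a full grapheme-cluster implementation.
    """
    units = []
    for ch in text:
        if units and unicodedata.combining(ch) != 0:
            # Combining mark: keep it attached to the preceding base character.
            units[-1] += ch
        else:
            # Base character (or a stray leading mark): start a new unit.
            units.append(ch)
    return units


def reverse_unicode_aware(text):
    """Reverse the order of the units while keeping each unit intact."""
    return "".join(reversed(split_combining_sequences(text)))


# Igor's example: A, combining acute accent (U+0301), E.
sample = "A\u0301E"
naive = sample[::-1]                      # accent ends up after E
aware = reverse_unicode_aware(sample)     # accent stays with A
```

With Igor's example, the naive reversal gives E, U+0301, A (the accent moves 
onto the E), while the sketch gives E, A, U+0301 (the accent stays on the A).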

This comes back to one of the recurring responses to my thread about what people 
want added to SQLite: that answer boiled down to Unicode-awareness rather than 
mere Unicode-compatibility.

Simon.
