On 30 Nov 2009, at 1:58am, Igor Tandetnik wrote:

> Note that Unicode collation is not as simple as you might think. Did you know 
> that in Estonian, 'y' sorts between 'i' and 'j'? Or that in German phonebook 
> sort, 'oe' sorts as if it were a single letter between 'o' and 'p'? 
> Basically, your simplistic approach would only work for plain unaccented 
> Latin letters and English collation rules.

I spent a lot of time annoyed about this, and ended up deciding that the only 
way to do Unicode sorting correctly is to store the language (or collation 
method) with each piece of Unicode text.  Of course, this still gives you the 
problem of working out which order two pieces of text go in if they are in two 
different languages.  Perhaps you also need a 'default language' marker for the 
entire column.
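
A minimal sketch of that idea, using Python's sqlite3 module and its create_collation() call. Everything else here is an assumption made for illustration: the table layout, the "toy_" collation names, the DEFAULT_LANG marker, and the made-up language tag "xx" whose alphabet puts 'y' between 'i' and 'j'. Real collation rules would have to come from something like ICU rather than a hand-written alphabet string. Rows in a single language are sorted by that language's rules, and a mixed result falls back to the column's default language.

```python
import sqlite3

# Toy alphabets for illustration only; real rules would come from ICU/CLDR.
# "xx" is a made-up language tag whose alphabet puts 'y' between 'i' and 'j'.
ALPHABETS = {
    "en": "abcdefghijklmnopqrstuvwxyz",
    "xx": "abcdefghiyjklmnopqrstuvwxz",
}
DEFAULT_LANG = "en"  # hypothetical "default language" marker for the column


def make_collation(alphabet):
    """Build a comparison function that orders strings by the given alphabet."""
    def key(text):
        # Characters missing from the alphabet map to -1 and so sort first.
        return [alphabet.find(ch) for ch in text.lower()]

    def compare(a, b):
        ka, kb = key(a), key(b)
        return (ka > kb) - (ka < kb)  # negative, zero, or positive

    return compare


conn = sqlite3.connect(":memory:")
for lang, alphabet in ALPHABETS.items():
    conn.create_collation(f"toy_{lang}", make_collation(alphabet))

# Each piece of text is stored together with its language tag.
conn.execute("CREATE TABLE words (body TEXT, lang TEXT)")
conn.executemany(
    "INSERT INTO words VALUES (?, ?)",
    [("ij", "xx"), ("iy", "xx"), ("ik", "en"), ("ia", "en")],
)


def sorted_words(lang=None):
    """Sort one language by its own rules, or all rows by the default rules."""
    if lang is None:
        sql = f"SELECT body FROM words ORDER BY body COLLATE toy_{DEFAULT_LANG}"
        return [row[0] for row in conn.execute(sql)]
    if lang not in ALPHABETS:
        raise ValueError(f"no collation registered for {lang!r}")
    sql = f"SELECT body FROM words WHERE lang = ? ORDER BY body COLLATE toy_{lang}"
    return [row[0] for row in conn.execute(sql, (lang,))]
```

Calling sorted_words("xx") orders only the "xx" rows by the "xx" alphabet, while sorted_words() orders every row by the default. Using one collation for the whole mixed result, rather than switching rules pair by pair, keeps the ordering consistent, at the cost of sorting some rows by rules that are not their own.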

Simon.
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
