Alexey Pechnikov <pechni...@mobigroup.ru> wrote:
> Yes, the BOM is in the original string. But with ICU collation we can
> see that a 17-symbol string is equal to a 16-symbol string. I think
> this result is not right.

What's the basis for this belief? It's not at all uncommon for two Unicode
strings of different length (in codepoints) to collate equal. For example,
they could be canonically equivalent but in different normalization forms, or
contain weightless (ignorable) characters such as the zero-width no-break
space (U+FEFF), also known as the byte order mark (BOM).
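
To illustrate the first case, here is a minimal sketch in Python using only the standard unicodedata module (it does not call ICU, and the variable names are just for illustration). It shows two canonically equivalent strings with different codepoint counts, and that U+FEFF is a format character (category Cf) rather than a letter:

```python
import unicodedata

# "é" as one precomposed code point (NFC form)
composed = "\u00e9"
# "e" followed by COMBINING ACUTE ACCENT (NFD form)
decomposed = "e\u0301"

# Different lengths in code points, and unequal under a binary comparison...
lengths = (len(composed), len(decomposed))
binary_equal = composed == decomposed

# ...yet canonically equivalent: normalizing either form yields the other.
nfc_equal = unicodedata.normalize("NFC", decomposed) == composed
nfd_equal = unicodedata.normalize("NFD", composed) == decomposed

# U+FEFF is a format character, not a letter or a symbol.
bom_category = unicodedata.category("\ufeff")
bom_name = unicodedata.name("\ufeff")
```

A Unicode-aware collation is expected to treat the two spellings of "é" as equal even though a byte-wise comparison does not.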

> Maybe automatically dropping the BOM for
> ICU-collated fields is the more correct way.

Why don't you do just that in your application?
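
For instance, a minimal sketch of stripping a leading BOM before the value reaches the database, assuming a Python application that uses the standard sqlite3 module. The helper strip_bom, the table t, and the column name are hypothetical, and no ICU collation is involved here:

```python
import sqlite3

BOM = "\ufeff"  # ZERO WIDTH NO-BREAK SPACE, used as the byte order mark


def strip_bom(text):
    """Return text without a single leading U+FEFF, if one is present."""
    return text[1:] if text.startswith(BOM) else text


# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT)")

raw = BOM + "0123456789abcdef"  # 17 code points: the BOM plus 16 characters
conn.execute("INSERT INTO t (name) VALUES (?)", (strip_bom(raw),))

# SQLite's length() counts characters for TEXT values.
stored_length = conn.execute("SELECT length(name) FROM t").fetchone()[0]
conn.close()
```

With the BOM removed on the way in, the stored value has 16 characters, so length-based and collation-based comparisons no longer disagree.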

Igor Tandetnik

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
