Hello!

On Tuesday 08 December 2009 01:07:54 Igor Tandetnik wrote:
> Alexey Pechnikov <pechni...@mobigroup.ru>
> wrote:
> > Yes, the BOM is on the original string. But with ICU collation we can
> > see that 17 symbols string is equal to 16 symbols string. I think
> > this result is not right.
>
> What's the basis for this belief? It's not at all uncommon for two Unicode
> strings of different length (in codepoints) to collate equal - for example,
> they could be canonically equivalent but in different normalization forms, or
> contain weightless characters such as a zero-width non-breaking space
> (U+FEFF), also known as BOM.
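
The two cases quoted above can be sketched in a few lines of Python, using
only the standard library. This is an illustration under stated assumptions,
not ICU or SQLite code: it compares code points rather than collation keys,
and the helper name prepare_for_storage and the sample strings are
hypothetical. It shows canonically equivalent strings of different length,
a string that differs from another only by a leading U+FEFF, and one
possible way to clean both up once, before the data is stored.

```python
import unicodedata

BOM = "\ufeff"  # ZERO WIDTH NO-BREAK SPACE, also used as the byte order mark


def prepare_for_storage(text: str) -> str:
    """Hypothetical helper: drop a leading BOM, then normalize to NFC once."""
    if text.startswith(BOM):
        text = text[len(BOM):]
    return unicodedata.normalize("NFC", text)


# Canonically equivalent strings in different normalization forms.
composed = "caf\u00e9"      # 'e with acute' as a single code point (NFC)
decomposed = "cafe\u0301"   # 'e' followed by COMBINING ACUTE ACCENT (NFD)
print(len(composed), len(decomposed))   # lengths in code points differ
print(composed == decomposed)           # raw code point comparison
print(prepare_for_storage(composed) == prepare_for_storage(decomposed))

# A 17-code-point string that differs from a 16-code-point one only by the BOM.
plain = "sixteen symbols!"
with_bom = BOM + plain
print(len(with_bom), len(plain))
print(prepare_for_storage(with_bom) == plain)
```

Note that NFC normalization alone does not remove U+FEFF, which is why the
sketch strips a leading BOM explicitly before normalizing.
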
Normalization is currently performed on every string operation. It would be
faster and more useful to do it once, when the data is stored.

> > May be automatically dropping the BOM for
> > ICU collated fields is more correct way.
>
> Why don't you do just that in your application?

Yes, I have fixed it in my application, but this problem can occur in any
application.

Best regards, Alexey Pechnikov.
http://pechnikov.tel/
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users