Hello,

I have encountered a peculiar behaviour in the FPC string comparison functions (UnicodeSameStr, UnicodeTextStr, UnicodeCompareStr, and the Wide* variants). Basically, on some systems the strings "0-0" and "-00" are considered to be the same.

FPC string comparison functions use the WideStringManager, which on Windows, I believe, calls Win32CompareUnicodeString/Win32CompareWideString; these in turn call the WinAPI CompareStringW function. I was already aware of some Unicode equality rules for various locales, such as "sharp s" (U+00DF) and "ss", but "0-0" and "-00" took me by surprise. As I researched further, it became apparent that the "-" (dash, minus, hyphen) symbol can be completely ignored by CompareStringW. Stranger yet, this affects only some systems, even though they are configured with the same locale and region.
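
To make the behaviour easier to reproduce, here is a minimal sketch that calls CompareStringW directly, once with no flags and once with SORT_STRINGSORT. It is Windows-only and assumes the FPC Windows unit exposes CompareStringW and LOCALE_USER_DEFAULT. The names DEMO_SORT_STRINGSORT, ResultName and Show are my own, introduced for illustration only. I deliberately state no expected output, since the result appears to differ between systems.

```pascal
program CompareHyphenDemo;

{$mode objfpc}{$H+}

uses
  Windows;

const
  // Documented WinAPI value of the SORT_STRINGSORT flag, declared locally
  // under a demo name so the sketch does not rely on the unit exporting it.
  DEMO_SORT_STRINGSORT = $00001000;

// Maps the CompareStringW return code to readable text
// (0 = failure, 1 = less than, 2 = equal, 3 = greater than).
function ResultName(R: LongInt): string;
begin
  case R of
    0: Result := 'failed';
    1: Result := 'less than';
    2: Result := 'equal';
    3: Result := 'greater than';
  else
    Result := 'unexpected';
  end;
end;

// Compares A and B with the given flags and prints the outcome.
procedure Show(const A, B: UnicodeString; Flags: DWORD; const FlagLabel: string);
var
  R: LongInt;
begin
  R := CompareStringW(LOCALE_USER_DEFAULT, Flags,
    PWideChar(A), Length(A), PWideChar(B), Length(B));
  WriteLn(FlagLabel, ': ', ResultName(R));
end;

begin
  Show('0-0', '-00', 0, 'no flags');
  Show('0-0', '-00', DEMO_SORT_STRINGSORT, 'SORT_STRINGSORT');
end.
```

Running this on an affected and an unaffected machine should show whether the flag changes the result there.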

I found several relevant articles that discuss the peculiarities of the CompareString function:
http://archives.miloush.net/michkap/archive/2005/05/05/414845.html
http://archives.miloush.net/michkap/archive/2007/09/20/5008305.html

Question 1:
Is it OK for those FPC functions to treat strings like "0-0" and "-00" as the same?

Question 2:
Can passing the SORT_STRINGSORT flag to the CompareString function fix this peculiar behaviour, and should FPC do so?

Regards,
Denis

_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
