Because, for example, on the ext3 file system you can have two files with
the name "ü" in the same directory: one named using the single character
"ü" and one named using the string "u¨" (both in UTF-8). If you
make the compiler automatically normalise everything, you lose
information (and get security holes, etc.).
I see, but as this is not handled decently with good old ANSIStrings
anyway, there is no "friendly old school" way that a compiler would be
able to offer. In these special cases, the user of course needs to
explicitly handle the upgrade of his project to Unicode.
OTOH, in this special case, I don't see why the compiler should
"normalize" "u¨" to "ü". If the software is supposed to be handling
Unicode, the Unicode string "u¨" should be considered a perfectly legal
sequence of two code points: a "u" (a single code unit in UTF-8) and a
double-dot (supposedly two code units in UTF-8). If the user
wants to handle this as a single "ü", he should write appropriate code
for that. Any automation of that is dangerous.
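
To make the distinction concrete, here is a small sketch in Python (not
Pascal, only because its standard library ships a normalizer). It assumes
the "u¨" above means "u" followed by U+0308 COMBINING DIAERESIS; the
helper names are made up for illustration.

```python
import unicodedata

# Assumption: "u¨" in the text stands for "u" + U+0308 COMBINING DIAERESIS.
composed = "\u00fc"      # "ü" as one precomposed code point
decomposed = "u\u0308"   # "u" followed by the combining double-dot


def utf8_bytes(s):
    """Return the UTF-8 encoding of s, as a file system like ext3 would store it."""
    return s.encode("utf-8")


def equal_after_nfc(a, b):
    """Explicit, opt-in comparison: normalise both sides to NFC, then compare."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


if __name__ == "__main__":
    # The two names are different strings with different byte sequences ...
    print(composed == decomposed)
    print(utf8_bytes(composed), utf8_bytes(decomposed))
    # ... and only compare equal when the caller asks for normalisation.
    print(equal_after_nfc(composed, decomposed))
```

The point of the sketch is that the comparison with normalisation is a
separate call the user makes on purpose, so nothing is lost unless he asks
for it.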
-Michael
_______________________________________________
fpc-devel maillist - fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel