On 3-12-2010 21:51, Guy Fink wrote:
In some languages, some Unicode code points have a different
uppercase/lowercase pair. For example, "i" is uppercased to "I" in
English (and most other locales), while in Turkish it becomes "I"
with a dot above (I cannot write it here).

Take a look at "Why Applications Fail With The Turkish
Language":
http://www.i18nguy.com/unicode/turkish-i18n.htm
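
To make the quoted problem concrete, here is a minimal sketch in Python (chosen only for illustration, it is not Lazarus code). It relies on Python's built-in `str.upper()`, which applies the default, language-neutral Unicode mapping; the variable names are my own.

```python
# The three code points involved in the Turkish "i" problem.
small_i = "i"              # U+0069 LATIN SMALL LETTER I
dotted_capital_i = "\u0130"  # U+0130 LATIN CAPITAL LETTER I WITH DOT ABOVE
dotless_small_i = "\u0131"   # U+0131 LATIN SMALL LETTER DOTLESS I

# Default mapping: correct for English, not what a Turkish reader expects.
default_upper = small_i.upper()
print(default_upper == "I")               # True
print(default_upper == dotted_capital_i)  # False

# The dotless small i also uppercases to plain "I", so under the default
# mapping two different Turkish letters collapse into one capital.
print(dotless_small_i.upper() == "I")     # True
```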

There is no language information in a string, not even in a
UnicodeString, so it is impossible to react to this here.

IMO there is no need to have a language encoded in the string. Strings won't get autoconverted to upper/lowercase; it's always the user who calls Upper/Lowercase(S).

The uppercase/lowercase tables have been generated purely from the
official Unicode character descriptions. Characters having "SMALL"
in their description are replaced by the one having "CAPITAL" in that
place, and vice versa (only if the counterpart exists). You can't do
more at this level. Please feel free to implement the functionality
you mention; I'm sure it will be appreciated.
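
The naming rule described above can be sketched roughly as follows. This is only an illustration in Python using the standard `unicodedata` module, not the generator actually used for the Lazarus tables; the function name `swap_by_name` and its parameters are hypothetical.

```python
import unicodedata

def swap_by_name(ch, src="SMALL", dst="CAPITAL"):
    """Return the character whose Unicode name equals ch's name with the
    word src replaced by dst, or ch itself if no such counterpart exists."""
    try:
        words = unicodedata.name(ch).split()
    except ValueError:
        # The character has no name in the Unicode database.
        return ch
    if src not in words:
        return ch
    words[words.index(src)] = dst
    try:
        return unicodedata.lookup(" ".join(words))
    except KeyError:
        # The renamed character does not exist: leave ch unchanged.
        return ch

print(swap_by_name("a"))                      # LATIN SMALL LETTER A -> A
print(swap_by_name("A", "CAPITAL", "SMALL"))  # and back again -> a
print(swap_by_name("\u0131") == "\u0131")     # dotless i has no same-named capital
```

Note that a purely name-based rule leaves the dotless small i (U+0131) unchanged, because no "LATIN CAPITAL LETTER DOTLESS I" exists; this is the kind of case that needs language-specific handling.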

To take the language into account when converting, functions like Upper/Lowercase should have a 2nd, optional parameter indicating for which language the conversion should be done. Then the default conversion can still take place, but the exceptions can be implemented based on the specified language (if there aren't many exceptions, a simple case statement will do).
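
A minimal sketch of that idea, written in Python purely for illustration (it is not an existing Lazarus or FPC API): the names `upper_case`, `lower_case`, the `lang` parameter and the exception tables are all hypothetical, and only the Turkish exceptions are filled in.

```python
# Hypothetical per-language exceptions, applied before the default mapping.
UPPER_EXCEPTIONS = {
    "tr": {"i": "\u0130", "\u0131": "I"},  # i -> dotted capital I, dotless i -> I
}
LOWER_EXCEPTIONS = {
    "tr": {"I": "\u0131", "\u0130": "i"},  # I -> dotless i, dotted capital I -> i
}

def upper_case(s, lang=None):
    """Uppercase s; if lang is given, apply that language's exceptions first."""
    table = str.maketrans(UPPER_EXCEPTIONS.get(lang, {}))
    return s.translate(table).upper()

def lower_case(s, lang=None):
    """Lowercase s; if lang is given, apply that language's exceptions first."""
    table = str.maketrans(LOWER_EXCEPTIONS.get(lang, {}))
    return s.translate(table).lower()

print(upper_case("istanbul"))        # default conversion, no language given
print(upper_case("istanbul", "tr"))  # Turkish exception applied
print(lower_case("ISTANBUL", "tr"))
```

Callers that pass no language get exactly the default behaviour, so existing code would not change; only callers that ask for "tr" see the exceptions.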

Marc

--
_______________________________________________
Lazarus mailing list
[email protected]
http://lists.lazarus.freepascal.org/mailman/listinfo/lazarus
