Re: [Lazarus] Adding codepage-support to the RTL (making LConvEncoding obsolete)

Marc Weustink Sat, 04 Dec 2010 06:12:47 -0800

On 3-12-2010 21:51, Guy Fink wrote:

In some languages some Unicode code points have a different
uppercase/lowercase pair. For example, "i" in English (and most
other) locales is uppercased to "I", while in Turkish it is
"I"+Upperdot (I cannot write it here).


Take a look at "Why Applications Fail With The Turkish
Language":
http://www.i18nguy.com/unicode/turkish-i18n.htm


There is no information on the language in a string, not even in a
UnicodeString. So it is impossible to handle this case here.

IMO there is no need to have a language encoded in the string. Strings
won't get autoconverted to upper/lowercase. It's always a user call to
Upper/Lowercase(S).

The uppercase/lowercase tables have been generated purely from the
official Unicode character descriptions. Characters having "SMALL"
in their description are replaced by the one having "CAPITAL" in that
place and vice versa (only if the counterpart exists). You can't do
more on this level. Please feel free to implement the functionality
you mention; I'm sure it will be appreciated.
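
The name-based table generation described above can be sketched as follows. This is only an illustration in Python, not the generator actually used for the tables under discussion; the function name build_upper_table and the default code point range are assumptions made for the example. It relies on the standard unicodedata module to read character names and to look a character up by name.

```python
import unicodedata

def build_upper_table(first=0x0000, last=0x024F):
    """Map each 'SMALL' character in [first, last] to its 'CAPITAL' counterpart.

    Hypothetical helper: the range defaults to Basic Latin through
    Latin Extended-B purely to keep the example small.
    """
    table = {}
    for cp in range(first, last + 1):
        name = unicodedata.name(chr(cp), "")  # "" for unnamed code points
        if "SMALL" not in name:
            continue
        try:
            # Swap SMALL for CAPITAL in the name and look that character up.
            upper = unicodedata.lookup(name.replace("SMALL", "CAPITAL"))
        except KeyError:
            continue  # no counterpart exists, so the character is left out
        table[chr(cp)] = upper
    return table

table = build_upper_table()
```

With such a table, "i" always maps to "I", and the dotless "ı" (U+0131) gets no entry at all because no "LATIN CAPITAL LETTER DOTLESS I" exists. The table alone therefore cannot express the Turkish pairing.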

To take the language into account when converting, functions like
Upper/Lowercase should have a second, optional parameter indicating for
which language the conversion should be done. Then the default conversion
can still take place, but based on the specified language, the exceptions
can be implemented (if there aren't many exceptions, a simple case
statement will do).
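
A minimal sketch of that idea, written in Python purely for illustration (it is not RTL code, and the names upper_case, lower_case and the lang parameter are hypothetical): the default conversion is used unless the optional language argument selects a small exception table. Only the Turkish dotted/dotless i pairs are listed here.

```python
# Per-language exceptions, consulted before the default conversion.
# Only Turkish ("tr") is listed; this is an illustrative, non-exhaustive table.
_UPPER_EXCEPTIONS = {
    "tr": {"i": "\u0130", "\u0131": "I"},  # i -> İ, ı -> I
}
_LOWER_EXCEPTIONS = {
    "tr": {"I": "\u0131", "\u0130": "i"},  # I -> ı, İ -> i
}

def upper_case(s, lang=None):
    """Uppercase s; with lang given, apply that language's exceptions first."""
    exceptions = _UPPER_EXCEPTIONS.get(lang, {})
    return "".join(exceptions.get(ch, ch.upper()) for ch in s)

def lower_case(s, lang=None):
    """Lowercase s; with lang given, apply that language's exceptions first."""
    exceptions = _LOWER_EXCEPTIONS.get(lang, {})
    return "".join(exceptions.get(ch, ch.lower()) for ch in s)
```

Calling upper_case("istanbul") uses only the default rules, while upper_case("istanbul", "tr") replaces the leading "i" with U+0130. Existing callers that never pass a language keep their current behaviour.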


Marc

--
_______________________________________________
Lazarus mailing list
[email protected]
http://lists.lazarus.freepascal.org/mailman/listinfo/lazarus
