Re: [Lazarus] cwstring in arm-linux

Žilvinas Ledas Wed, 19 Oct 2011 13:52:28 -0700

Hello,

On 2011-10-19 21:03, Felipe Monteiro de Carvalho wrote:

> On Wed, Oct 19, 2011 at 6:33 PM, Martin Schreiber <[email protected]> wrote:
>
>> Does it use locale specific collation in PasUnicodeCompareStr and
>> PasUnicodeCompareText?
>
> Good point, no, not yet. But this affects only turkish, azeri and
> lithuanian AFAIK
>
> Adding turkish and azeri is trivial, because UTF8LowerCase supports
> them, but I did not understand yet the rules for Lithuanian, they are
> quite convoluted, depend on nearby chars and stuff like that.

I am a native Lithuanian, so I think I can help, at least by providing information, but I must understand what the problem is first. Do I understand correctly that "collation" means "sorting order"? In that case, Lithuanian sorting does not depend on nearby characters.

There are 32 letters and they follow this order:

Aa < Ąą < Bb < Cc < Čč < Dd < Ee < Ęę < Ėė < Ff < Gg < Hh < Ii < Įį < Yy < Jj < Kk < Ll < Mm < Nn < Oo < Pp < Rr < Ss < Šš < Tt < Uu < Ųų < Ūū < Vv < Zz < Žž
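
To make the order above concrete, here is a minimal Python sketch of a sort key built from that alphabet. It is only an illustration, not how cwstring or the Lazarus routines work: the names LT_ALPHABET, LT_RANK and lt_sort_key are made up for this example, and placing characters outside the alphabet after all letters is an assumption.

```python
import unicodedata

# Lowercase Lithuanian alphabet in the order listed above (32 letters).
# Normalised to NFC so every letter is a single code point.
LT_ALPHABET = unicodedata.normalize("NFC", "aąbcčdeęėfghiįyjklmnoprsštuųūvzž")

# Hypothetical helper: position of each letter in the alphabet.
LT_RANK = {ch: i for i, ch in enumerate(LT_ALPHABET)}

def lt_sort_key(word):
    """Build a per-character key that follows the Lithuanian letter order.

    Assumption: characters outside the alphabet sort after all letters,
    ordered among themselves by code point.
    """
    key = []
    for ch in unicodedata.normalize("NFC", word).lower():
        rank = LT_RANK.get(ch)
        if rank is None:
            key.append((len(LT_ALPHABET), ord(ch)))
        else:
            key.append((rank, 0))
    return key

words = ["žodis", "yla", "jūra", "ąžuolas", "bitė", "ačiū"]
print(sorted(words, key=lt_sort_key))
```

Note that "yla" sorts before "jūra" here, because Yy comes right after Įį and before Jj.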

And there are some accented characters which are used only in linguistic texts (for example, dictionaries). (The full list is here: http://developer.mimer.com/charts/lithuanian.htm)

The funny thing is that in dictionaries, when "sorting" words, "Aa" and "Ąą" (also: "Ee" and "Ęę" and "Ėė"; "Ii" and "Įį" and "Yy"; "Uu" and "Ųų" and "Ūū") are treated as the "same letter".

BUT, for example, the words "šieną" <> "sieną" <> "sieną" are all three different words (there are no accents in these characters).

BUT I believe that accented characters should be treated as the same letter: "šiẽną" = "šieną"; "siena" = "síena", because it is the same word (accents do not change the meaning of a word, and the writer is not required to provide them at all).
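
The last point, ignoring stress accents while keeping real letters such as "š" and "ą" distinct, could be sketched in Python like this. It is a sketch under assumptions: the function name strip_accents is made up, and treating exactly the combining grave, acute and tilde as the stress marks is my assumption, not a rule taken from any library.

```python
import unicodedata

# Assumption: the stress marks used in linguistic texts are the
# combining grave (U+0300), acute (U+0301) and tilde (U+0303).
STRESS_MARKS = {"\u0300", "\u0301", "\u0303"}

def strip_accents(word):
    """Remove stress accents but keep ogonek, caron, dot above and macron.

    The word is decomposed (NFD), the stress marks are dropped, and the
    result is recomposed (NFC), so letters like ą, š, ė, ū survive.
    """
    decomposed = unicodedata.normalize("NFD", word)
    kept = "".join(ch for ch in decomposed if ch not in STRESS_MARKS)
    return unicodedata.normalize("NFC", kept)

# Accented and unaccented spellings compare equal after stripping ...
print(strip_accents("šiẽną") == "šieną")
print(strip_accents("síena") == "siena")
# ... while words that differ in real letters stay different.
print(strip_accents("šieną") == strip_accents("sieną"))
```

Comparing strip_accents(a) with strip_accents(b) then gives the "same word" behaviour described above.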

I don't know if I managed to explain anything, but if you need some help with the Lithuanian language, feel free to contact me.



Regards,
Žilvinas Ledas

--
_______________________________________________
Lazarus mailing list
[email protected]
http://lists.lazarus.freepascal.org/mailman/listinfo/lazarus
