Yes, but most proposals here about a TCharacter are a bit overkill. For
example, the language reference for a given char is not very important from
a Unicode point of view; Unicode focuses its power on the text, so
locale is important in context operations and collations.

See my other post above.

Locale should really have nothing to do with the text/string business.

Instead, it should only refer to oddities such as decimal number representations, thousands separators, date and time strings, etc.

Packing the language into the 'locale' info is an abuse IMO, unless it refers to such things as what kind of help file to display to the user or the actual strings on menu items (resources), etc.
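
To make that concrete, this is roughly the territory I think 'locale' belongs to. A small sketch, assuming the Delphi-compatible TFormatSettings overloads in SysUtils; the separator and format values below are just made-up examples:

program LocaleFormatDemo;
{$mode objfpc}{$H+}
uses
  SysUtils;
var
  FS: TFormatSettings;
begin
  FS := DefaultFormatSettings;       // start from the RTL's current defaults
  FS.DecimalSeparator := ',';        // e.g. a German-style locale
  FS.ThousandSeparator := '.';
  FS.ShortDateFormat := 'dd.mm.yyyy';
  // The same number and date, rendered according to the 'locale':
  WriteLn(FormatFloat('#,##0.00', 1234567.89, FS));
  WriteLn(FormatDateTime('ddddd', EncodeDate(2008, 11, 21), FS));
end.

Note that nothing about the string contents changes here; that is exactly my point: locale belongs to formatting, not to the strings themselves.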

From my point of view the compiler's basic types must stay
"basic": fast, not eating more memory than needed, and so on.

Please don't take offence, but this kind of attitude is verging on being offensive.

Instead of looking at the issue from the POV of "I don't need it" or "it requires more hardware resources", can't you try to evaluate the need on its own merits?

And, if you still think that you will never need it, please remember that you don't have to use it, but others may.

Bringing Unicode "power" to the basic string type is overkill; any
Unicode operation will in the best case take twice as long, and
some of them 40-50 times longer. A simple collation will take at least
4 times the memory needed by the string itself, and for most sorting
needs the collation is unnecessary.

So?

What if it is a fact of life?

Such as 24-bit graphics. We all know it takes a lot more resources and that only patsies need that much color; we ended up using it.

Can't you consider this Unicode character issue in the same light? (No pun intended.)

So think of a "new" user
filling a TStringList with 1000 strings and invoking the Sort method;
as the strings are Unicode they must be ordered using the locale
collation or the general collation, and he ends up saying "20 seconds to
sort 1000 strings!!!! This looks even worse than JavaScript!!!!".

No. This is where you are mistaken, I'm afraid.

A TUnicodeStringList can contain strings from different collations, and a single 'locale' setting will be useless in sorting out that mess. You need 'language' information in each of those strings to be able to sort that Unicode list properly.
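
Roughly what I have in mind, as a purely illustrative sketch: the record, its fields and the comparison hook are hypothetical, and the comparison itself is only a code-point placeholder where a real per-language collation would go:

program TaggedSortSketch;
{$mode objfpc}{$H+}
uses
  SysUtils;

type
  // Each entry carries its own language tag alongside the text.
  TTaggedString = record
    Text: WideString;
    Lang: string;                    // e.g. 'en', 'de', 'tr'
  end;

// Placeholder comparison: a real implementation would pick a collation
// based on A.Lang and B.Lang instead of a plain code-point comparison.
function CompareTagged(const A, B: TTaggedString): Integer;
begin
  Result := WideCompareStr(A.Text, B.Text);
end;

var
  Items: array[0..2] of TTaggedString;
  I, J: Integer;
  Tmp: TTaggedString;
begin
  Items[0].Text := 'zebra';    Items[0].Lang := 'en';
  Items[1].Text := 'Strasse';  Items[1].Lang := 'de';
  Items[2].Text := 'istanbul'; Items[2].Lang := 'tr';

  // Simple insertion sort driven by the tag-aware comparison hook.
  for I := 1 to High(Items) do
  begin
    Tmp := Items[I];
    J := I - 1;
    while (J >= 0) and (CompareTagged(Items[J], Tmp) > 0) do
    begin
      Items[J + 1] := Items[J];
      Dec(J);
    end;
    Items[J + 1] := Tmp;
  end;

  for I := 0 to High(Items) do
    WriteLn(Items[I].Lang, ': ', Items[I].Text);
end.

Whether this ends up as a record, a class, or extra payload in the string type itself is a separate question; the point is that the language has to travel with each string, not with the list.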

Maybe, again from my point of view, it is more logical to create
"TTextUnicodeChar" and "TTextUnicodeString" classes which handle
Unicode textual data rather than raw Unicode data.

I can't see how you can do that. I can't see how we can cater for Unicode data (not textual data, as you put it) in anything other than a specific class [or data type].

PS: As one of the problems of Unicode support is the large amount of
data that must be stored (in the exe or an external file), is there any
recommended way to code so that unused arrays are left out when the
function that uses them is never called by the main program?

Storage is a completely different problem. You could use, say, UTF-8 encoding and also store the language information when necessary.
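
For instance, something like this minimal sketch: the record and the Pack/Unpack helpers are made up for illustration, while UTF8Encode/UTF8Decode are the stock RTL conversion routines:

program Utf8StorageSketch;
{$mode objfpc}{$H+}

type
  // Hypothetical storage format: UTF-8 payload plus an optional language tag.
  TStoredText = record
    Utf8: UTF8String;   // UTF-8 encoded bytes
    Lang: string;       // empty when no language info is needed
  end;

function Pack(const W: WideString; const ALang: string): TStoredText;
begin
  Result.Utf8 := UTF8Encode(W);
  Result.Lang := ALang;
end;

function Unpack(const S: TStoredText): WideString;
begin
  Result := UTF8Decode(S.Utf8);
end;

var
  Stored: TStoredText;
begin
  Stored := Pack('hello world', 'en');
  WriteLn(Length(Stored.Utf8), ' bytes stored, lang=', Stored.Lang);
  WriteLn(Unpack(Stored));
end.

You could then decide per string whether the language field is worth the extra bytes.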