Re: [fpc-devel] String and UnicodeString and UTF8String

LacaK Tue, 11 Jan 2011 01:20:27 -0800

> I think at most two are required for any target: unicodestring (D2009
> compatibility), and if really necessary because somehow the unicodestring
> version causes too much overhead, an ansistring($ffff) version as well. That's
> only for the classes though, I think most of the base RTL can be simply
> ansistring($ffff).

So if I understand correctly, the UnicodeString and AnsiString types must be extended so that they also hold information about the actual codepage (encoding) of the string data they contain. (AFAIK, at the moment they hold only the reference count, the size and, of course, the data.)

I am not an expert, so I do not understand all the aspects and problems connected with proper string handling, but some kind of implicit conversion (based on the actual encoding of the string data) is necessary (ANSI <-> UTF-8 <-> UTF-16 <-> ANSI, etc.).
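
To make the idea concrete, here is a minimal sketch of a string value that carries its codepage next to its data, so that a conversion can be chosen from the actual encoding. It is written in Python purely for illustration and says nothing about how FPC does or would implement this; the class name TaggedString and its convert_to method are hypothetical, and cp1252 is only assumed as the "ANSI" codepage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedString:
    """Hypothetical string record: raw bytes plus the codepage they are in."""
    data: bytes
    codepage: str  # a Python codec name, e.g. "cp1252", "utf-8", "utf-16-le"

    def convert_to(self, target: str) -> "TaggedString":
        # Nothing to do when the data is already in the requested encoding.
        if self.codepage == target:
            return self
        # Decode using the recorded codepage, then re-encode for the target.
        text = self.data.decode(self.codepage)
        return TaggedString(text.encode(target), target)


# Euro sign, starting from the assumed ANSI codepage (cp1252).
ansi = TaggedString("\u20ac".encode("cp1252"), "cp1252")
utf8 = ansi.convert_to("utf-8")
utf16 = utf8.convert_to("utf-16-le")
back = utf16.convert_to("cp1252")

# Show the raw bytes at each step and whether the round trip is lossless.
print(ansi.data.hex(), utf8.data.hex(), utf16.data.hex(), back == ansi)
```

Because every value knows its own codepage, the ANSI -> UTF-8 -> UTF-16 -> ANSI round trip needs no guessing at any step.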

For example, there is the known problem with the Euro currency symbol. On Windows it is stored in the CurrencyString global variable using the ANSI codepage, but it is used in the LCL (which expects UTF-8 encoding) without any explicit conversion, which leads to "?" being displayed instead of "€" (for example in TDBEdit or TDBGrid).
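
The byte-level cause can be reproduced outside Pascal. The following Python sketch assumes the ANSI codepage is Windows-1252 (the actual codepage depends on the system locale) and uses hypothetical variable names; it only illustrates why ANSI bytes read as UTF-8 cannot yield the Euro sign.

```python
# Assumption: the Windows ANSI codepage is Windows-1252 (cp1252).
# In cp1252 the Euro sign is the single byte 0x80.
currency_ansi = "\u20ac".encode("cp1252")

# A consumer that expects UTF-8 and receives these bytes unconverted cannot
# decode them: 0x80 is not a valid UTF-8 lead byte. With errors="replace"
# the invalid byte becomes U+FFFD, the replacement character.
misread = currency_ansi.decode("utf-8", errors="replace")

# Converting explicitly from the known source codepage gives the right text.
fixed = currency_ansi.decode("cp1252").encode("utf-8").decode("utf-8")

print(repr(misread), repr(fixed))
```

The fix is only possible because the source codepage is known at the point of conversion, which is exactly the information a plain "string" does not carry.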

Another problem occurs when displaying character data in data-aware database controls (TDBEdit, TDBGrid). The data-aware controls (LCL) read data from TField descendants (FCL) using the TField.Text property, which returns "string" (without codepage information it is not clear whether that is AnsiString, UTF8String or UnicodeString). The LCL expects UTF-8 strings, but that is not what it receives in all cases (for example in the case of ODBC).


-Laco.
_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel
