On Sun, 15 Apr 2012 14:25:51 +0200 Martin Schreiber <mse00...@gmail.com> wrote:
> On Sunday 15 April 2012 13:33:05 Mattias Gaertner wrote:
> > On Sun, 15 Apr 2012 13:00:12 +0200
> >
> > > I assume Lazarus uses that string type everywhere where it expects utf-8,
> > > same as MSEgui uses msestring (=UnicodeString) everywhere it expects
> > > utf-16?
> >
> > UnicodeString has a clear advantage versus WideString (reference
> > counting).
> >
> > I don't see the clear advantage of UTF8String.
> > Using UTF8String instead of String (CP_UTF8 or CP_ACP) forces strings
> > to CP_UTF8. This may slow down some assignments, may speed up some
> > assignments or break some assignments.
> >
> Now I don't understand, sorry. :-)

No problem. The codepage strings are very new and I don't know all
the details yet either. So maybe some of my information is outdated
or will soon be outdated.

> UTF8String = type AnsiString(CP_UTF8), is there another string type with
> CP_UTF8? What is the definition of "string" in cpstrnew? AnsiString(CP_ACP)?
> http://wiki.freepascal.org/FPC_Unicode_support does not answer the question
> AFAIK.

This page is pretty outdated. I guess the fpc cpstrnew developers will
update it when the dust has settled.

> Hmm, I checked the Lazarus source, it seems I was wrong with the assumption
> that Lazarus uses "UTF8String" everywhere, it uses "String" instead, correct?

Yes.

> Example:
> type
>   TTranslateString = type String;
>   TCaption = TTranslateString;
>
>   TControl = class(TLCLComponent)
> [...]
>     property Text: TCaption read GetText write SetText;
>
>   TCustomEdit = class(TWinControl)
> [...]
>     property SelText: String read GetSelText write SetSelText;

cpstrnew adds a codepage to every ansistring. This codepage is like
"length" and "reference count": it can change at runtime. This usually
happens when another string is assigned to it.
For example:

  var s: string = 'a';
  writeln(StringCodePage(s)); // writes 0 = CP_ACP

  var u: utf8string = 'a';
  writeln(StringCodePage(u)); // writes 65001 = CP_UTF8

The result is the same with and without -Fcutf8.

Assigning utf8string to a string:

  s:=u;
  writeln(StringCodePage(s)); // writes 65001 = CP_UTF8

Assigning a string (CP_ACP) to utf8string:

  s:='a';
  u:=s; // auto convert CP_ACP to CP_UTF8
  writeln(StringCodePage(u)); // writes 65001 = CP_UTF8

Basically if you use "utf8string" you get a string that forces UTF-8.

Mattias