In our previous episode, Mattias Gaertner said:
> > Let's try to understand first why do you insist on the "UTF-8" in the name ?
> >
> > Maybe "UTF-8 aware" is better, if you really want the UTF-8 in the name.
>
> Maybe there is a misunderstanding. At least I can't follow you here.
>
> I started the thread about ParamStr, which only supports the system
> codepage. I would like to improve it so that it supports
> DefaultSystemCodepage. Or at least add an Unicode version of
> ParamStr.

And the 2-byte Unicode version exists, in unit uuchar (the "objpas" of $mode delphiunicode). For now, simply make a UTF-8 wrapper that returns a utf8string.

> Some people has called the RTL with UnicodeString the "Unicode RTL",
> and the Ansistring RTL with system codepage "Ansi RTL".

Well, that's what Windows calls them (-W are Unicode, -A are ANSI). Delphi follows that terminology (D2007- being ANSI, D2009+ being Unicode).

> I thought "UTF8 RTL" is analog, short and unambiguous.
> Obviously I was wrong.

The RTL leans somewhat towards the preferred encoding on each target, so 1-byte on *nix and 2-byte on Windows. That means there is no real UTF-8 support on Windows other than the generic codepage-aware string type (that goes for all 1-byte encodings).

Setting defaultsystemcodepage will make all autoconversions to ansistring(0) return UTF-8, so also when calling e.g. unit windows functions. I think it would be very wise to be careful with that, and have an extensive trial period. You might want to keep the current -utf8 routines as mere codepage-correcting wrappers.

> (And, yes, I know, that all three names "Unicode|Ansi|UTF8 RTL" are
> not 100% correct from a technical point of view.)

The filesystem routines are now encoding agnostic. But that assumes you use a type whose associated encoding the compiler knows. And filesystem routines are only a small part of the system libraries.

_______________________________________________
fpc-devel maillist
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
