Re: [fpc-devel] Unicode support (yet again)
On 14/09/2011 03:56, Luiz Americo Pereira Camara wrote:
> I propose that the above behavior be implemented as a type named RTLString

The Object Pascal language already has enough damn string types. I really don't think we should be adding fuel to the fire by adding yet more string types!

> So the RTL under unix will have functions compiled with UTF8 strings giving no overhead interacting with native API
> The RTL under Windows will have compiled functions with UTF16 strings giving no overhead with native API

That's exactly what I said.

> If a program passes a UnicodeString to a RTL function under Windows no conversion is made
> When this same program is compiled under unix the UnicodeString should be converted to UTF8 automatically using the encoding info of the string

No, why must unix environments take a performance hit?? This is not needed if UnicodeString is really what the name suggests: any unicode type string. The Unicode standard defines UTF-8, UTF-16 and UTF-32. So UnicodeString should really be any of those encodings - living up to its name.

If FPC has true unicode support, then all functions should work correctly with just the UnicodeString type. That type's encoding is based on the native encoding of each platform. NO performance hit required.

I'd even be happier if UnicodeString was dropped too, and String becomes unicode enabled. One less string type to worry about. String could be defined as follows... [ignore the syntax]

  IFDEF unix
    String = String(utf8);
  ENDIF
  IFDEF windows
    String = String(utf16);
  ENDIF
  IFDEF OldDelphi
    String = AnsiString;   // or if some String(xxx) could be used
  ENDIF

Then if you wanted your project to use some other specific encoding, you can simply define your own string type and use that. The various string types know what encoding they are in, so auto-conversion is possible too (with possibility of data loss in case of unicode -> ansi), eg:

  type
    { say I want to use UTF-32 in my apps for some reason }
    TfpgString = String(utf32);
  var
    s: String;       // as defined above - could be utf8, utf16 etc.
    m: TfpgString;
    a: AnsiString;
  begin
    m := 'Hello world!';
    s := m;   // automatic conversion happens here
    a := s;   // auto conversion, with data loss (compiler warning)
  end;

Regards,
 - Graeme -

--
fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal
http://fpgui.sourceforge.net/

___
fpc-devel maillist - fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel
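[Editor's sketch] The lossless versus lossy conversions described in this message can be illustrated outside FPC. The following is a minimal Python sketch, not FPC behaviour: Python's codecs stand in for the compiler's automatic conversions, the helper name `convert` is made up for illustration, and `cp1252` is only an assumed example of an ANSI code page.

```python
# Sketch only: Python codecs stand in for compiler-inserted string conversions.
text = "Ol\u00e1 \u4e16\u754c"   # "Ola" with acute a, a space, two CJK characters

def convert(data: bytes, src: str, dst: str) -> bytes:
    """Re-encode data from src to dst; unrepresentable characters become '?'."""
    return data.decode(src).encode(dst, errors="replace")

m = text.encode("utf-32-le")            # plays the role of the UTF-32 TfpgString
s = convert(m, "utf-32-le", "utf-8")    # m -> s: Unicode to Unicode, lossless
a = convert(s, "utf-8", "cp1252")       # s -> a: Unicode to ANSI, may lose data

print(s.decode("utf-8") == text)        # the UTF-8 copy still holds every character
print(a)                                # the two CJK characters were replaced
```

The accented letter survives because cp1252 can represent it; the CJK characters cannot be represented and are replaced, which is the case where a compiler warning would be useful.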
Re: [fpc-devel] Unicode support (yet again)
On Tue, Sep 13, 2011 at 9:23 PM, Michael Van Canneyt mich...@freepascal.org wrote:
> Current strategy on fpc core seems to be to have 2 RTLs: One with unicode string, one with ansistring.

Isn't that somewhat nasty for people currently using UTF-8?

I mean, let's say that we can divide everyone using FPC into 3 groups:

1st People using ansi that don't want to change any line of code - They get a path forward with this proposal, even if temporary (the Ansi half of the RTL really seems like the definition of deprecated to me)
2nd People using UTF-8 - They get no love at all and can choose between using the old RTL with no Unicode and putting some tape over the holes, or migrating to something incompatible.
3rd People that want to use UTF-16 - They get a new RTL to move forward

But what percentage of FPC users, libraries and applications is in each group?

1st I really can't imagine anyone who would want to stay stuck in the pre-Unicode world forever...
2nd The vast majority of users, libraries and applications, through Lazarus
3rd msegui and possibly Delphi 2009+ users

Lazarus is by far the most widely used way to use FPC, so I would guess that group 2 has more than 75% of all users, and still it gets no love at all. Which real path forward is provided for these users?

Of course one path is migrating everything, the LCL, the IDE, SynEdit, all packages, etc, to UTF-16, but that's a huge, immense work with zero advantages over what we are doing up to now. It's just migrating to migrate; who will be motivated to do that?

My point is that it is not very reasonable to migrate so much working code for no advantage at all, so the Unicode RTL could provide something to ease interfacing with UTF-8, for example:

* overloaded versions of routines and methods for utf8string
* A TStrings and TStringList for utf8

These would need to be ifdefed so they are not present in the Ansi RTL. Without even a TStrings for utf-8 one cannot really expect Lazarus to be able to use the Unicode RTL without doing a full migration to UTF-16 ...

My final point is just: why not? If code in the RTL could fix things for Lazarus, why impose the need to migrate so much working code?

If the Unicode RTL provides UTF-8 support too then Lazarus projects could be migrated by just doing 2 things:

1 Change all places which use TStrings and TStringList to TStringsUTF8 and TStringListUTF8
2 Change all places which add utf-8 to ansi conversions when calling the RTL to do no conversion at all

On the other hand, if we have no path forward except for migrating to UTF-16, I can imagine we will still be talking about how to move forward 5 years from now...

--
Felipe Monteiro de Carvalho
Re: [fpc-devel] Unicode support (yet again)
On Wed, Sep 14, 2011 at 5:50 AM, Martin Schreiber mse00...@gmail.com wrote:
> Linux expects an array of bytes in filenames (no encoding, no utf-8) AFAIK.

That's a nice theory, but:

All Linux distributions that I know use utf-8
Android uses utf-8
Meego uses utf-8

So, do you have any concrete example of new releases of Linux using something different from UTF-8 for filenames?

--
Felipe Monteiro de Carvalho
Re: [fpc-devel] Unicode support (yet again)
On Wed, 14 Sep 2011, Felipe Monteiro de Carvalho wrote:
> On Tue, Sep 13, 2011 at 9:23 PM, Michael Van Canneyt mich...@freepascal.org wrote:
>> One with unicode string, one with ansistring. They will have the same code, but will be compiled twice, each time with a different compiler define to decide which version it must be.
>
> Is this possible in UNIX? I can see that in Windows you can use the trick to use W versions which are identical except for the string type and drop Windows 9x support, but is this really possible for the UNIX syscalls? They expect UTF-8 not UTF-16 which is what UnicodeString uses.

And why would this not be possible ?

Michael.
Re: [fpc-devel] Unicode support (yet again)
On Wed, 14 Sep 2011, Felipe Monteiro de Carvalho wrote:
> On Tue, Sep 13, 2011 at 9:23 PM, Michael Van Canneyt mich...@freepascal.org wrote:
>> Current strategy on fpc core seems to be to have 2 RTLs: One with unicode string, one with ansistring.
>
> Isn't that somewhat nasty for people currently using UTF-8?

No, why do you think so ? They should use the unicode version. All will work as-is.

Michael.
Re: [fpc-devel] Unicode support (yet again)
On Wed, Sep 14, 2011 at 8:59 AM, michael.vancann...@wisa.be wrote:
> No, why do you think so ?

Well, at the very least:

1 All var parameters from the RTL will no longer be directly usable with UTF-8 strings
http://www.freepascal.org/docs-html/rtl/sysutils/appendstr.html
How can I pass a UTF-8 string to AppendStr in the Unicode RTL? The Example62 from the docs will no longer compile =D

2 TStrings will be in a different encoding from the rest of the LCL, this will surely be very nasty.
MyForm.Caption := MyStrings.Strings[9]; and you get an encoding conversion... Basically it will be a salad of automatic conversions done by the compiler...

3 FileOpen(MyForm.Caption, whatever_mode);
You get first utf-8 -> utf-16 to call FileOpen and then FileOpen does utf-16 -> utf-8 on UNIXes

--
Felipe Monteiro de Carvalho
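[Editor's sketch] Point 3 can be modelled outside FPC. The Python below is only an illustration under stated assumptions: `file_open_utf16_rtl_on_unix` is a hypothetical stand-in for a UTF-16-only FileOpen on a UNIX, and the `conversions` list just records each re-encoding step so the round trip becomes visible.

```python
# Sketch only: counts the re-encodings when UTF-8 application text passes
# through a hypothetical UTF-16-only RTL routine on a UTF-8 platform.
conversions = []

def utf8_to_utf16(data: bytes) -> bytes:
    conversions.append("utf-8 -> utf-16")
    return data.decode("utf-8").encode("utf-16-le")

def utf16_to_utf8(data: bytes) -> bytes:
    conversions.append("utf-16 -> utf-8")
    return data.decode("utf-16-le").encode("utf-8")

def file_open_utf16_rtl_on_unix(name_utf16: bytes) -> bytes:
    """Hypothetical RTL routine: returns the bytes it would hand to the OS."""
    return utf16_to_utf8(name_utf16)

caption_utf8 = "relat\u00f3rio.txt".encode("utf-8")   # UTF-8, as the LCL stores it
os_argument = file_open_utf16_rtl_on_unix(utf8_to_utf16(caption_utf8))

print(conversions)                   # two conversions for one call
print(os_argument == caption_utf8)   # and the bytes end up where they started
```

Both conversions cancel out, so in this model the only effect of the UTF-16 interface on a UTF-8 platform is the extra work.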
Re: [fpc-devel] Unicode support (yet again)
On Wed, 14 Sep 2011 08:50:22 +0200 Felipe Monteiro de Carvalho felipemonteiro.carva...@gmail.com wrote:
> On Wed, Sep 14, 2011 at 5:50 AM, Martin Schreiber mse00...@gmail.com wrote:
>> Linux expects an array of bytes in filenames (no encoding, no utf-8) AFAIK.
>
> That's a nice theory, but:

It's more than theory. You can use file names under Linux that are not valid UTF-8. At work I see it every week.

> All Linux distributions that I know use utf-8
> Android uses utf-8
> Meego uses utf-8
>
> So, do you have any concrete example of new releases of Linux using something different from UTF-8 for filenames?

Mattias
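[Editor's sketch] What "file names that are not valid UTF-8" means in practice can be shown with a few bytes. This Python sketch uses a made-up file name encoded in Latin-1, as an older system or a share with a different charset might produce; `is_valid_utf8` is an illustrative helper, not an RTL function.

```python
# Sketch only: a file name is just bytes; whether those bytes form valid
# UTF-8 depends on who created the file.
def is_valid_utf8(name: bytes) -> bool:
    """True if the byte string decodes as strict UTF-8."""
    try:
        name.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

name = "Gr\u00fc\u00dfe.txt"            # u-umlaut and sharp s
utf8_name = name.encode("utf-8")        # what a UTF-8 system would store
latin1_name = name.encode("latin-1")    # what a Latin-1 system would store

print(is_valid_utf8(utf8_name))         # valid
print(is_valid_utf8(latin1_name))       # same visible name, not valid UTF-8
```

Both byte strings are acceptable to a file system that treats names as opaque bytes, which is the point being made about Linux.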
Re: [fpc-devel] Unicode support (yet again)
On 13/09/2011 21:23, Michael Van Canneyt wrote:
> Current strategy on fpc core seems to be to have 2 RTLs: One with unicode string, one with ansistring.

Can you clarify a bit? When you say unicode string do you mean UTF-16 (Delphi's definition of a unicode string), or do you mean a Unicode string in the true sense - it can be utf-8 or utf-16 etc depending on the platform's native encoding.

Regards,
 - Graeme -
Re: [fpc-devel] who can explain why array of const can't be passed to another array of const
On 14 Sep 2011, at 04:15, Paul Ishenin wrote:
> If I change cdecl to stdcall in g_object_dosomething then it compiles with no error. For me it is strange. Should developer care about internal compiler representation of an array of const for different conventions?

It's more that even though both are called array of const, they are completely different things. They also don't support the same types.

> Imo this is a compiler task. I've checked the same on delphi XE and there it compiles. So whether this is 1)a bug 2)unimplemented feature 3)desired compiler behavior?

You could say it is an unimplemented feature, but implementing it would require a lot of assembler code that's different for every architecture (and in some cases also for different OSes, since not all OSes use the same ABI and the ABI defines how C varargs must be passed). It is not a bug since the error message is given on purpose.

Jonas
Re: [fpc-devel] Unicode support (yet again)
On 14.09.2011 07:50, Felipe Monteiro de Carvalho wrote:
> On Wed, Sep 14, 2011 at 5:50 AM, Martin Schreiber mse00...@gmail.com wrote:
>> Linux expects an array of bytes in filenames (no encoding, no utf-8) AFAIK.
>
> That's a nice theory, but:
>
> All Linux distributions that I know use utf-8
> Android uses utf-8
> Meego uses utf-8
>
> So, do you have any concrete example of new releases of Linux using something different from UTF-8 for filenames?

Some Samba shares, for example, and there are still many old Linux systems in the wild. Anyway, I simply wanted to remind everyone of the fact.

Martin
Re: [fpc-devel] Unicode support (yet again)
On 14.09.2011 09:08, Martin Schreiber wrote:
> On 14.09.2011 07:50, Felipe Monteiro de Carvalho wrote:
>> On Wed, Sep 14, 2011 at 5:50 AM, Martin Schreiber mse00...@gmail.com wrote:
>>> Linux expects an array of bytes in filenames (no encoding, no utf-8) AFAIK.
>>
>> That's a nice theory, but:
>>
>> All Linux distributions that I know use utf-8
>> Android uses utf-8
>> Meego uses utf-8
>>
>> So, do you have any concrete example of new releases of Linux using something different from UTF-8 for filenames?
>
> Some Samba shares for example and there still are many old Linux systems in the wild. Anyway, I simply wanted to remember the fact.

Another good example: FAT. I have got to the point of avoiding umlauts and the like when I copy files from Linux to FAT or the other way round, because with the default mount settings they are invalid characters in one of the two... (and I haven't yet bothered to fiddle around with that ^^)

Regards,
Sven
Re: [fpc-devel] Unicode support (yet again)
On Wed, 14 Sep 2011, Felipe Monteiro de Carvalho wrote:
> On Wed, Sep 14, 2011 at 8:59 AM, michael.vancann...@wisa.be wrote:
>> No, why do you think so ?
>
> Well, at the very least:
>
> 1 All var parameters from the RTL will no longer be directly usable with UTF-8 strings
> http://www.freepascal.org/docs-html/rtl/sysutils/appendstr.html
> How can I pass a UTF-8 string to AppendStr in the Unicode RTL? The Example62 from the docs will no longer compile =D

That depends on what the compiler will do for you :-)

> 2 TStrings will be in a different encoding from the rest of the LCL, this will surely be very nasty.
> MyForm.Caption := MyStrings.Strings[9]; and you get an encoding conversion...

That will always be the case, even if we decided for UTF-16.

> Basically it will be a salad of automatic conversions done by the compiler...

This will be so in each case where different codepages or encodings are used.

> 3 FileOpen(MyForm.Caption, whatever_mode);
> You get first utf-8 -> utf-16 to call FileOpen and then FileOpen does utf-16 -> utf-8 on UNIXes

Once more, why do you think so ?

In each case:

1. It will be messy whatever we do. Thinking there is an easy migration path is wishful thinking.
2. Backwards compatibility is a big concern. Code that compiled and worked should compile and work.

Michael.
Re: [fpc-devel] who can explain why array of const can't be passed to another array of const
On Wed, Sep 14, 2011 at 19:03, Jonas Maebe jonas.ma...@elis.ugent.be wrote:
> It's more that even though both are called array of const, they are completely different things. They also don't support the same types.

Perhaps the varargs-compatible parameter type should be called something else then?

--
Alexander S. Klenin
Re: [fpc-devel] Unicode support (yet again)
On Wed, Sep 14, 2011 at 9:45 AM, Mattias Gaertner nc-gaert...@netcologne.de wrote:
> It's more than theory. You can use file names under Linux that are no valid UTF-8. At work I see it every week.

In this case then for sure we cannot have file routines only in UTF-16, because that would make it impossible to identify many files in Linux...

--
Felipe Monteiro de Carvalho
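[Editor's sketch] Why a UTF-16-only file API cannot identify such files can be sketched as follows. This is Python, not FPC: `via_utf16` is a hypothetical model of an API that must turn the raw name into UTF-16 and back, and the `surrogateescape` error handler is shown only as one existing technique (Python's own) for carrying undecodable bytes through a Unicode string.

```python
# Sketch only: a raw file name that is not valid UTF-8 cannot survive a
# round trip through UTF-16 unless the bytes are smuggled through somehow.
def via_utf16(raw_name: bytes) -> bytes:
    """Model a UTF-16-only API: decode as UTF-8, store as UTF-16, convert back."""
    as_utf16 = raw_name.decode("utf-8", errors="replace").encode("utf-16-le")
    return as_utf16.decode("utf-16-le").encode("utf-8")

def via_surrogateescape(raw_name: bytes) -> bytes:
    """Python's approach: undecodable bytes become lone surrogates and back."""
    return raw_name.decode("utf-8", "surrogateescape").encode("utf-8", "surrogateescape")

good = "Gr\u00fc\u00dfe.txt".encode("utf-8")
bad = "Gr\u00fc\u00dfe.txt".encode("latin-1")   # valid file name, invalid UTF-8

print(via_utf16(good) == good)            # valid UTF-8 round-trips
print(via_utf16(bad) == bad)              # the original bytes are gone
print(via_surrogateescape(bad) == bad)    # bytes preserved by escaping
```

After the lossy round trip the name no longer matches the bytes on disk, so the file cannot be opened by that name.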
Re: [fpc-devel] who can explain why array of const can't be passed to another array of const
On 14 Sep 2011, at 10:40, Alexander Klenin wrote:
> On Wed, Sep 14, 2011 at 19:03, Jonas Maebe jonas.ma...@elis.ugent.be wrote:
>> It's more that even though both are called array of const, they are completely different things. They also don't support the same types.
>
> Perhaps varargs-compatible parameter type should be called something else then?

Both backwards and Delphi compatibility stand in the way of that.

Jonas
Re: [fpc-devel] Unicode support (yet again)
On Wed, Sep 14, 2011 at 10:46 AM, michael.vancann...@wisa.be wrote:
>> Can you clarify a bit. When you say unicode string to you mean UTF-16 (Delphi's definition of a unicode string), or do you mean a Unicode string in the true sense - it can be utf-8 or utf-16 etc depending on the platform's native encoding.
>
> This has not yet been decided.

IMHO a platform-dependent string would be the worst solution of all ... far worse than migrating to UTF-16. It adds a tiny bit of speed while imposing a large burden of development complexity ...

I imagine how one would explain that kind of thing to newbies ...

Just recently I had a student from my university implement a routine which converts HTML text from utf-8 to braille in utf-8 ... I didn't have to explain anything and she could implement it without previous Pascal experience. I wonder what would happen if I had to say: oops, the main string type is unknown =D To do any operation on it you need to first convert to something known and then convert back to unknown, while unknown conversions might take place ...

--
Felipe Monteiro de Carvalho
Re: [fpc-devel] Unicode support (yet again)
In our previous episode, Felipe Monteiro de Carvalho said:
> Following from a discussion on mac-pascal, I'd like to propose a solution for Unicode support.

First and foremost: dropping backwards compatibility is not going to happen. If we were planning that, we would have changed everything to something unicode years ago.

> Function FileOpen (Const FileName : utf8string; Mode : Integer) : THandle; overload;
> Function FileOpen (Const FileName : unicodestring; Mode : Integer) : THandle; overload;
>
> and similarly for other places and everyone should be happy.

This is not a solution. This is a temporary hack to alleviate some perceived Lazarus pain, and doesn't fix my main gripe of the manual conversions everywhere. It is a hack for 0.01% of the unicode problem.

IMHO the objective should be to minimize manual conversions (and with that the fact that generic code becomes encoding specific).
Re: [fpc-devel] Unicode support (yet again)
On 14/9/2011 03:40, Graeme Geldenhuys wrote:
> On 14/09/2011 03:56, Luiz Americo Pereira Camara wrote:
>> I propose that the above behavior be implemented as a type named RTLString
>
> The Object Pascal language already has enough damn string types. I really don't think we should be adding fuel to the fire, by adding yet more string types!

AFAIK RTLString already exists in the cpstr branch. Anyway, it is just an alias to a real type.

[..]

> String could be define as follows... [ignore the syntax]
>
>   IFDEF unix
>     String = String(utf8);
>   ENDIF
>   IFDEF windows
>     String = String(utf16)
>   ENDIF
>   IFDEF OldDelphi
>     String = AnsiString // of if some String(xxx) could be used
>   ENDIF

This is not desirable, simply because on each platform (windows / unix) the user code of the same program will have a different encoding, increasing the possibility of subtle errors. Some functions, like string streaming, require the same encoding between platforms; otherwise they will require code changes to work properly.

Another advantage of using RTLString as I proposed is that Lazarus will require almost no code change, since the encoding of strings in the LCL will be the same (UTF8) across platforms. The conversion will take place only when interacting with the RTL.

Luiz
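[Editor's sketch] The streaming concern can be made concrete. In this Python sketch the `NATIVE` table is an assumption standing in for a platform-dependent String type, and the length-prefixed stream layout and both function names are invented for illustration; nothing here reflects the actual FPC streaming format.

```python
# Sketch only: writing "whatever the string holds" yields platform-specific
# streams, while pinning the stream encoding yields identical ones.
import struct

NATIVE = {"unix": "utf-8", "windows": "utf-16-le"}   # assumed native encodings

def stream_native(text: str, platform: str) -> bytes:
    """Length-prefixed payload in the platform's in-memory encoding."""
    payload = text.encode(NATIVE[platform])
    return struct.pack("<I", len(payload)) + payload

def stream_pinned(text: str, platform: str) -> bytes:
    """Length-prefixed payload always in UTF-8, whatever the platform."""
    payload = text.encode("utf-8")
    return struct.pack("<I", len(payload)) + payload

word = "A\u00e7\u00e3o"
print(stream_native(word, "unix") == stream_native(word, "windows"))   # differ
print(stream_pinned(word, "unix") == stream_pinned(word, "windows"))   # identical
```

The difference only disappears when the stream format names its encoding explicitly, independent of the in-memory string type.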
the feature request, that started the discussion [Re: Adding properties into existing stabs/dwarf; gdb readable workaround ? [[Re: [fpc-devel] Status and ideas about debug info (stabs, dwarf / dwar3)
Unfortunately, once about 2 mails are exchanged on the subject of what I actually tried to talk about, the whole discussion takes off and all kinds of debugger woes are included.

So back again: I am trying to find out if the below could make a reasonable feature request (and therefore have a chance to be implemented in FPC). And if it does => should I put it on mantis.

I believe Joost may actually have started to look at the requirements, since he enquired about gdb and method execution?

So some points that I would like to know:

1) I believe the general idea of making a property
     Counter: Integer read GetCounter
   be encoded as a function of the object (in the same way as GetCounter already is) is acceptable?
   - So field properties are returning the field
   - Getter properties are depending on GDB's ability to execute functions.

2) Execution of those properties (getter).
   I understand it depends on GDB, and FPC can probably not affect it much. As far as the dwarf debug info can have an influence (if at all), it would be nice if execution was NOT automatic.
   e.g. NONE of those would execute (property List: TList read GetList)
     Foo.List
     Foo.List.Counter
   The following may or may not:
     Foo.List().Counter

3) Any hint that a symbol is a property, not a field or function (despite it being encoded as field or function)?
   I know there is a desire not to have any hacks/workarounds in FPC, and I understand the reasons. Yet, I was hoping: IF available, and effort is minimal, is there any chance at all?
   As I said, I don't know if DW_AT_sibling for example can be used (I included the dwarf spec below). It looks to me like it is a hint that can be used at the desire of the compiler (debug info provider): IF ... FEELS ...
   If using this flag does not conflict with, or abuse, the dwarf specs, then maybe it could be used? Even if gdb does not show it, it would mean that later means of access may exist, and the info is there, and an IDE can at least tell this is a property.

from dwarf 3 specs:
  In cases where a producer of debugging information feels that it will be important for consumers of that information to quickly scan chains of sibling entries, while ignoring the children of individual siblings, that producer may attach a DW_AT_sibling attribute to any debugging information entry. The value of this attribute is a reference to the sibling entry of the entry to which the attribute is attached

On 12/09/2011 21:13, Martin wrote:
> On 12/09/2011 20:46, Joost van der Sluis wrote:
>> On Mon, 2011-09-12 at 20:31 +0200, Jonas Maebe wrote:
>>> On 12 Sep 2011, at 20:20, Martin wrote:
>>>> Could not properties mapping to a function be implemented the same way => normal functions are already listed in ptype
>>>> so
>>>>   public property Counter: Integer read GetCounter
>>>> could appear the same as the function GetCounter ?
>>>> In that case at least the list of available symbols is complete. The only thing that then would need codetools involved was to check if the name is a property and not a function/field.
>>>
>>> That may be possible, yes.
>>
>> What is it that we actually need? At the Dwarf-level: Is the information that a property actually has a getter, and the name of that getter enough? Or do we want that when the value of a property is asked, the getter is called automagically? (And that there is some kind of flag that indicates that a getter is being used?)
>> I don't think that we can add a stack-script in the DW_AT_Location that executes the getter. I've looked at DW_OP_call, but that won't help us here.
>> Or, and maybe this is the best solution: some 'opaque' type that returns a reference to something else. Which can be different for reading and writing values...
>
> There are 2 conflicting desires.
>
>   -data-evaluate-expression FooObject.BarObjProp.BarValue
>   ptype FooObject / ptype FooObject.BarObjProp
>
> The first only works (at present) if it is a field, not a getter function. IMHO that is ok. While a lot of people do want code execution for properties, there must be a means of control (in the front end, e.g. lazarus). Even if that was enabled by default. That means, I would like that gdb does *not* automatically call the function. So for data evaluation we are fine. If it is a function, the expression fails, and the IDE needs to look into it.
>
> Well, having said that: if the function was only called if brackets are supplied, maybe.
>   -data-evaluate-expression FooObject.BarObjProp().BarValue
> But it is not a must. I am not even sure if it is desirable.
>
> The 2nd issue is knowledge that
>   a) there is something in the object under the name of the property
>   b) this something happens to be a property
>
> a) is already fulfilled if it is a field-property. Hence I asked if functions could be added the same way.
>   -data-evaluate-expression FooObject.GetCounter
>   currently gets no value
>   -data-evaluate-expression FooObject.Counter
>   gives an error, no symbol
>
> if Counter could be the same as GetCounter (making it
Re: [fpc-devel] Unicode support (yet again)
On 14/9/2011 03:48, Felipe Monteiro de Carvalho wrote:
> [..]
> Of course one path is migrating everything, the LCL, the IDE, SynEdit, all packages, etc, to UTF-16, but that's a huge, immense work with zero advantages over what we are doing up to now, it's just migrate to migrate, who will be motivated to do that?
>
> My point is that it is not very reasonable to migrate so much working code for no advantage at all, so the Unicode RTL could provide something to easy interfacing with UTF-8, for example:
>
> * overloaded versions of routines and methods for utf8string
> * A TStrings and TStringList for utf8

Using the approach I described (RTLString) in the other mail, this (massive LCL code change) is not required. Probably only the load-from-file functions of TStrings etc. would change.

Lazarus/LCL could stay as is (UTF8) and would work as today:

Under unix: no conversion is done since the LCL and RTL encodings are the same
Under Windows: the conversion UTF8 -> RTLString (UTF16) is done once

> These would need to be ifdefed so they are not present in the Ansi RTL. Without even a TStrings for utf-8 one cannot really expect Lazarus to be able to use the Unicode URL without doing a full migration to UTF-16 ...
>
> My final point is just: why not? If code in the RTL could fix things for Lazarus why impose the need to migrate so much working code?

Because if someone for some reason, like porting Delphi code, stays with a UTF16 string under windows, then when using RTL functions TWO conversions will be made:

User Code (UTF16) -> RTL (UTF8) -> WINAPI (UTF16)

Always using the same encoding in the RTL and the native API keeps the maximum number of conversions at 1.

Luiz
Re: [fpc-devel] Unicode support (yet again)
On Wed, Sep 14, 2011 at 6:04 AM, Felipe Monteiro de Carvalho felipemonteiro.carva...@gmail.com wrote:
> On Wed, Sep 14, 2011 at 10:46 AM, michael.vancann...@wisa.be wrote:
>>> Can you clarify a bit. When you say unicode string to you mean UTF-16 (Delphi's definition of a unicode string), or do you mean a Unicode string in the true sense - it can be utf-8 or utf-16 etc depending on the platform's native encoding.
>>
>> This has not yet been decided.
>
> IMHO a platform-dependent string would be the worse solution of all ... far worse then migrating to UTF-16. It adds tiny bit of speed while it puts a large development complexity burdain ...
>
> I imagine how one would explain that kind of thing to newbies ...

Why would the internal encoding of a RTLString/UnicodeString have to have any effect on how you program if the RTL/API is done right?

-Flávio
Re: [fpc-devel] Unicode support (yet again)
On Wed, Sep 14, 2011 at 11:32 AM, Luiz Americo Pereira Camara luiz...@oi.com.br wrote:
> Because if someone for some reason, like porting Delphi code, stays with a UTF16 string, under windows, when using RTL functions TWO conversions will be made:
>
> User Code (UTF16) -> RTL (UTF8) -> WINAPI (UTF16)

This would not happen because I proposed to have 2 versions of the routines in the RTL. Not 1 UTF-8 version. There would be both UTF-8 and UTF-16 versions and one would naturally use the one which matches one's preferred encoding ... and the RTL would only convert the non-native version.

--
Felipe Monteiro de Carvalho
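[Editor's sketch] The "two versions, convert only the non-native one" idea can be modelled as follows. This is a Python sketch under stated assumptions: `file_open` is a hypothetical stand-in for the pair of overloaded RTL routines, the `NATIVE` table is an assumed mapping from platform to API encoding, and the return value just reports how many conversions were needed.

```python
# Sketch only: with one entry point per caller encoding, at most one
# conversion is needed, and none when caller and platform agree.
NATIVE = {"unix": "utf-8", "windows": "utf-16-le"}   # assumed native API encodings

def file_open(name: bytes, encoding: str, platform: str):
    """Return (bytes handed to the OS, number of conversions performed)."""
    native = NATIVE[platform]
    if encoding == native:
        return name, 0                                # native version: pass through
    return name.decode(encoding).encode(native), 1    # non-native version: convert once

name = "relat\u00f3rio.txt"
for platform in ("unix", "windows"):
    for encoding in ("utf-8", "utf-16-le"):
        _, count = file_open(name.encode(encoding), encoding, platform)
        print(platform, encoding, count)
```

In this model the two-conversion chain from the quoted message never appears, because no call is forced through an intermediate encoding.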
Re: [fpc-devel] Unicode support (yet again)
Felipe Monteiro de Carvalho felipemonteiro.carva...@gmail.com wrote on 14 September 2011 at 10:51:
> On Wed, Sep 14, 2011 at 9:45 AM, Mattias Gaertner nc-gaert...@netcologne.de wrote:
>> It's more than theory. You can use file names under Linux that are no valid UTF-8. At work I see it every week.
>
> In this case then for sure we cannot only have file routines only in UTF-16, because that would make it impossible to identify many files in Linux...

Well, "many" is a bit exaggerated. It does not happen on most Linux systems. And, yes, UTF-16 is not enough under Linux. But this is nothing new. This was explained several times on this list. See the many threads about unicode strings.

Mattias
Re: [fpc-devel] Unicode support (yet again)
On 14/09/2011 11:19, Luiz Americo Pereira Camara wrote:
> This is not desirable simply because at each platform (windows / unix) the user code of the same program will have a different encoding increasing the possibility of subtle errors.

Why? Not every program is a text manipulation program or text parser. Most programs simply assign one string to another, eg:

  Button1.Caption := 'Click me';
  lMyString := Button1.Caption;

Under unix systems 'Click me', Button1.Caption and lMyString will be UTF-8 encoded. Under Windows 'Click me', Button1.Caption and lMyString will be UTF-16 encoded.

When Lazarus saves this information in a .lfm file, it will be stored as UTF-8 irrespective of the platform. This is normal behaviour on all platforms already, and already done in Lazarus too. As for streaming, the same applies as for saving to file. UTF-8 is ideally suited for (and was designed for simplifying) streaming, hence the W3C promotes the usage of UTF-8 in HTML, XML etc.

> Another advantage of using RTLString as i proposed is that Lazarus will require almost no code change since the encoding of string in LCL will be the same (UTF8) across platforms.

Lazarus, like fpGUI, will have to decide what it wants to do: stick to having UTF-8 forced on all platforms, or use a native encoding on each platform. Currently UTF-8 was chosen in both projects because it is so compatible (think easy here) with AnsiString - so the least amount of work was required and it was pretty efficient because most programs already used AnsiString.

If I was to change fpGUI to use a native encoding on each platform, I would simply change my definition of TfpgString as described in a similar example before. All string manipulation inside fpGUI (and LCL) should already have adhered to the rule that 1 byte <> 1 character, so the rest of the framework should continue to work as normal. In the case of fpGUI, I would also be able to get rid of all the UTF8Copy(), UTF8Length() calls and simply use the RTL Copy() and Length() functions again - after all, they were only introduced because FPC's RTL lacked Unicode (any encoding) support.

Regards,
 - Graeme -
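[Editor's sketch] Why helpers such as UTF8Length() and UTF8Copy() are needed when a byte-oriented Length() and Copy() see UTF-8 data can be shown with a small model. The Python functions below are illustrative stand-ins only: their names echo the fpGUI helpers, but their exact semantics (1-based code point index, count in code points, well-formed input) are assumptions.

```python
# Sketch only: byte length and character count diverge for UTF-8, so
# character-based helpers have to skip continuation bytes (10xxxxxx).
def utf8_length(data: bytes) -> int:
    """Count code points: every byte that is not a continuation byte."""
    return sum(1 for b in data if (b & 0xC0) != 0x80)

def utf8_copy(data: bytes, index: int, count: int) -> bytes:
    """Copy `count` code points starting at 1-based code point `index`."""
    starts = [i for i, b in enumerate(data) if (b & 0xC0) != 0x80] + [len(data)]
    first = min(index - 1, len(starts) - 1)
    last = min(first + count, len(starts) - 1)
    return data[starts[first]:starts[last]]

s = "na\u00efve caf\u00e9".encode("utf-8")
print(len(s), utf8_length(s))               # byte count versus character count
print(utf8_copy(s, 3, 3).decode("utf-8"))   # characters 3..5, cut on boundaries
```

A byte-based copy of the same range could split a two-byte character in half, which is the kind of bug the 1 byte <> 1 character rule guards against.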
Re: [fpc-devel] Unicode support (yet again)
In our previous episode, Felipe Monteiro de Carvalho said:
> * Make file-handling routines which take filenames as parameters from the RTL modular so that the LCL can implement them with UTF-8 support. This plus a UTF-8 widestring manager and the Ansi RTL can be fully UTF-8.

I'm not as opposed to this as to the other. At least the interfaces stay the same. But again, that is no unicode solution, just minor damage control to make the current situation bearable.
Re: [fpc-devel] Unicode support (yet again)
On Wed, Sep 14, 2011 at 11:53 AM, Marco van de Voort mar...@stack.nl wrote:
>> * Make file-handling routines which take filenames as parameters from the RTL modular so that the LCL can implement them with UTF-8 support. This plus a UTF-8 widestring manager and the Ansi RTL can be fully UTF-8.
>
> I'm not as opposed to this as to the other. At least the interfaces stay the same.

Yes, but this solution would only work in the Ansi RTL ...

And why would the interfaces change in the other proposal? It is only 1 more overloaded option for the routines; it does not change anything which will use UnicodeString.

--
Felipe Monteiro de Carvalho
Re: [fpc-devel] Unicode support (yet again)
On 14/09/2011 11:04, Felipe Monteiro de Carvalho wrote:
> IMHO a platform-dependent string would be the worst solution of all ... far worse than migrating to UTF-16.

I don't see why? Use the RTL functions to manipulate your text strings. Both the string and RTL functions will use the same encoding on each platform - so no problems, no conversions. If you really needed to know the encoding, the RTL could include a helper function to tell you the encoding of any string (just like Delphi 2009+ has).

> Just recently I had a student from my university implement a routine which converts HTML text from utf-8 to braille in utf-8 ... I didn't

Again, no problem. The HTML should have specified the encoding it is in. Normally that would be UTF-8. So under Linux, MacOSX etc it will already be in the native encoding. Under Windows, text is normally stored in UTF-8, contrary to UTF-16 being the encoding of the native Windows API. So when loading the file you can compare the HTML file encoding to the current RTL encoding and do a conversion if needed (same as is required in Delphi).

As for the text-to-braille functionality, that is outside the scope of FPC and the RTL. But common sense should prevail: use RTL string functions to implement your conversion - don't assume 1 byte = 1 character. A unicode aware string iterator could be implemented to help you step through the characters one at a time. Such a string iterator could even become part of the RTL, as it will probably be used often for many parsers.

Regards,
  - Graeme -
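A rough sketch of what such a unicode aware iterator could look like for UTF-8 data. NextCodePoint is a hypothetical name invented for this illustration, not an existing RTL routine, and the sketch assumes well-formed input; a real RTL iterator would have to validate its input and would probably also cover UTF-16.

```pascal
program Utf8IterDemo;
{$mode objfpc}{$H+}

// Hypothetical helper, not part of the RTL: decodes the code point that
// starts at byte index I of a UTF-8 string and advances I past it.
// Assumes well-formed UTF-8; no validation is done.
function NextCodePoint(const S: AnsiString; var I: Integer): Cardinal;
var
  B: Byte;
  Extra: Integer;
begin
  B := Ord(S[I]);
  Inc(I);
  // The lead byte tells how many continuation bytes follow.
  if B < $80 then begin Result := B; Extra := 0; end
  else if (B and $E0) = $C0 then begin Result := B and $1F; Extra := 1; end
  else if (B and $F0) = $E0 then begin Result := B and $0F; Extra := 2; end
  else begin Result := B and $07; Extra := 3; end;
  // Each continuation byte contributes its low 6 bits.
  while (Extra > 0) and (I <= Length(S)) do
  begin
    Result := (Result shl 6) or (Ord(S[I]) and $3F);
    Inc(I);
    Dec(Extra);
  end;
end;

var
  S: AnsiString;
  I: Integer;
  CP: Cardinal;
begin
  // 'a', U+00E4 (bytes C3 A4), 'b', stored byte by byte.
  SetLength(S, 4);
  S[1] := 'a';
  S[2] := #$C3;
  S[3] := #$A4;
  S[4] := 'b';

  I := 1;
  while I <= Length(S) do
  begin
    CP := NextCodePoint(S, I);
    WriteLn('U+', HexStr(LongInt(CP), 4));  // U+0061, U+00E4, U+0062
  end;
end.
```

A parser written against such an iterator never indexes individual bytes, so it would not care whether the platform string is UTF-8 or UTF-16, provided the RTL supplies a matching iterator for each encoding.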
Re: [fpc-devel] Unicode support (yet again)
In our previous episode, Felipe Monteiro de Carvalho said:
> And why would the interfaces change in the other proposal? It is only 1 more overloaded option for the routines,

Which is just 1 more interface change. And for something that is a temporary workaround.

That is what I like about Mattias' proposal: it is mostly hidden in the implementation, and the declaration and setting of the manager can have a lot of platform and deprecated directives around it to make it clear that it won't last, and that it is not just for Lazarus. I assume it is windows only.

But I really do wonder if this is necessary, since it will already not make the 2.6 cycle anymore, and I hope 2.8 can be really unicode.
Re: [fpc-devel] Unicode support (yet again)
In our previous episode, Martin Schreiber said:
>> Is this possible in UNIX? I can see that in Windows you can use the trick to use W versions which are identical except for the string type and drop Windows 9x support, but is this really possible for the UNIX syscalls? They expect UTF-8 not UTF-16 which is what UnicodeString uses.
>
> Linux expects an array of bytes in filenames (no encoding, no utf-8) AFAIK.

It is a bit agnostic, yes. But does that really matter if all other programs write filenames in utf8 encoding? It might as well be specified to be utf-8 then; there is no difference in approach.
Re: [fpc-devel] Unicode support (yet again)
On Wed, Sep 14, 2011 at 12:03 PM, Marco van de Voort mar...@stack.nl wrote:
>> Is this possible in UNIX? I can see that in Windows you can use the trick to use W versions which are identical except for the string type and drop Windows 9x support, but is this really possible for the UNIX syscalls? They expect UTF-8 not UTF-16 which is what UnicodeString uses.
>
> Afaik QT and many other higher level libs always use UTF-16. MSE does too. Might also be useful for the JVM port.

I think I wasn't clear enough. I wanted to say that I don't see how you can have both an Ansi and a UTF-16 RTL in UNIXes with the same codebase, without ifdefs. I think this is not possible, and one of the previous messages seemed to indicate that the RTL would be able to use the same codebase regardless of the output version (ansi vs utf-16), so without ifdefs.

> But it will be beneficial to everybody, and it is clear to everybody how something should behave, so there will be no endless bickering over details and workarounds like this thread. It is a structured approach.

Well, there is still uncertainty over the question brought up by Graeme: always UTF-16 or the unknown string type?

> BTW: I explained all this to you, including the not dropping legacy, over some Chinese food a few months ago. Don't you remember?

Yes, I remember, but the way you spoke about it, it sounded something like a proposal, not 100% sure it would end up like this =) Now it is really put as the way forward...

--
Felipe Monteiro de Carvalho
Re: [fpc-devel] bounty: FPC based debugger
On 09/13/2011 04:52 PM, Hans-Peter Diettrich wrote:
> It's not the CPU, it's more the MMU which can help in finding changed (global) variables.

AFAIK, the MMU cannot work with byte addresses but only with much bigger blocks of data. So it does not seem to help with finding a write access to a dedicated variable.

Moreover, the MMU programming and interrupts will be consumed by the OS, and a user space program can't even see them.

-Michael
Re: [fpc-devel] bounty: FPC based debugger
On 09/13/2011 02:53 PM, Joost van der Sluis wrote:
> You do know that GDB does have a Pascal extension, right?

IMHO, if we really can work with the gdb team on feeding the necessary Object-Pascal specific add-ons into gdb, creating a new debugger from scratch does not make any sense at all.

-Michael
Re: [fpc-devel] bounty: FPC based debugger
On 09/13/2011 04:59 PM, Hans-Peter Diettrich wrote:
> IMO you're addressing the wrong audience. Most things, beyond breakpoint handling, stepping and memory read/writes, can be done outside the debugger. Such external code is not bound to debugger support, and can use language specific information (RTTI...).

If this is true, why discuss replacing gdb?

-Michael
Re: [fpc-devel] Unicode support (yet again)
On 09/14/2011 08:50 AM, Felipe Monteiro de Carvalho wrote:
> All Linux distributions that I know use utf-8
> Android uses utf-8
> Meego uses utf-8

AFAIK, the EXT file system does not care about the encoding the file-name byte-arrays are written in. Only 0x00 (end of name) and '/' are interpreted.

-Michael
Re: [fpc-devel] Unicode support (yet again)
On 09/14/2011 10:51 AM, Felipe Monteiro de Carvalho wrote:
> In this case then for sure we cannot only have file routines only in UTF-16, because that would make it impossible to identify many files in Linux...

Who says that file names are supposed to be human readable and thus done in some character encoding?

AFAIK:

 - With EXT they are just streams of up to 255 bytes (with 0x00 and '/' disallowed).
 - With old style FAT they are just arrays of 11 bytes (maybe with 0x00, '.' and '\' disallowed), with lower case and upper case ASCII characters treated as identical.
 - With long filename FAT I fear it's quite complicated (e.g. the short and long file name of a file need to be recognized as identical). But no unicode here.

-Michael
Re: [fpc-devel] Unicode support (yet again)
On 09/14/2011 11:05 AM, Marco van de Voort wrote:
> First and foremost: backwards compat dropping is not going to happen.

It already has and supposedly can't be avoided.

Take a look at what Lazarus was forced to make out of the identity of ANSIString and UTF8String, seemingly forced by FPC. e.g.:

 - Old programs that assume local ANSI 8 bit code in strings retrieved from LCL GUI components don't work when compiled with the new version (e.g. if doing myChar := myString[3]; ).
 - Doing My16BitString = 'my constant text containing umlauts as äöü'; provides an erroneous result.

-Michael
Re: [fpc-devel] bounty: FPC based debugger
On 14.09.2011 12:44, Michael Schnell wrote:
> On 09/13/2011 04:52 PM, Hans-Peter Diettrich wrote:
>> It's not the CPU, it's more the MMU which can help in finding changed (global) variables.
>
> AFAIK, the MMU cannot work with byte addresses but only with much bigger blocks of data. So it does not seem to help with finding a write access to a dedicated variable.
>
> Moreover, the MMU programming and interrupts will be consumed by the OS, and a user space program can't even see them.

But the debugger can ask the OS to write protect a page or to enable a page guard (which triggers on write access), and then the corresponding signal/exception can be caught. This reduces the checks necessary from the complete process memory down to only the page size.

Note: I don't know whether it's implemented like that in any debugger, this is just a theory of mine.

Regards,
Sven
Re: [fpc-devel] bounty: FPC based debugger
On 09/14/2011 01:58 PM, Sven Barth wrote:
> But the debugger can ask the OS to write protect a page or to enable a page guard (which triggers on write access), and then the corresponding signal/exception can be caught. This reduces the checks necessary from the complete process memory down to only the page size.

Do you think this is possible without rewriting the OS (for all supported OSes)?

-Michael
Re: [fpc-devel] Unicode support (yet again)
Felipe Monteiro de Carvalho wrote:
> On Tue, Sep 13, 2011 at 9:23 PM, Michael Van Canneyt mich...@freepascal.org wrote:
>> One with unicode string, one with ansistring. They will have the same code, but will be compiled twice, each time with a different compiler define to decide which version it must be.
>
> Is this possible in UNIX? I can see that in Windows you can use the trick to use W versions which are identical except for the string type and drop Windows 9x support, but is this really possible for the UNIX syscalls? They expect UTF-8 not UTF-16 which is what UnicodeString uses.

A few topics:

The NT WinAPI (not 9x) *implements* everything in the Wide (UTF-16) routines; the Ansi versions do the string *conversion* before calling the Wide version. The Unix API (most probably - dunno) has no such dual interface with internal conversion.

The NT filesystems store names in UTF-16, while Unix filesystems store UTF-8. This means that access to an NTFS or FAT32 drive under Unix will require a string conversion, in the filesystem handler.

On Windows, Ansi means any (byte-char) encoding, with different (national) codepages on every machine. This can cause trouble to Ansi applications (using Ansi strings), when filenames do not convert losslessly into that codepage. Unix IMO uses UTF-8 as the Ansi encoding, eliminating possible losses, and that's why FPC also prefers UTF-8 encoding.

But let's not forget the user! Many users still want simple string handling, with direct mapping between logical and physical chars (SBCS). This is not possible at all with UTF-8, while UTF-16 works fine with the BMP, at least. This wish for simple string handling suggests the use of UTF-16 for Unicode strings in *user* code.

WRT the latter argument, FPC IMO should follow the Delphi implementation of Unicode strings as UTF-16. This choice is independent from the (platform dependent) RTL conventions, but it affects the standard components (string lists...) in the FCL, and the other components in the LCL.

Here again the average user will prefer UTF-16 component libraries, compatible with his own code, while more experienced users may be happier with the current UTF-8 libraries. English (ASCII) users also may prefer UTF-8, as long as they do not have to (or want to) deal with strings in foreign languages. Once they have to face the existence of non-ASCII strings in their applications, they will most probably prefer switching to UTF-16, with few changes to their existing codebase and coding habits(!). Really *processing* Unicode text, with all its bells and whistles, is so complicated that it should be left to dedicated software and libraries, while typical application code will ignore everything beyond char level.

IMO the number of required conversions is of little importance to the runtime behaviour of an application. File access is always expensive, so that a single conversion into the platform specific filename representation is not perceptible at all. The same holds for GUI components, which typically store all strings twice: once for their own (and application) use, and another copy in the widgets. Here again transfers of strings between widgets and components are rare, with negligible slowdown from any conversions during message handling.

More important IMO is the external storage of Unicode, where I see no reasonable way around UTF-8, considering codepage dependencies and UTF-16 byte-order problems.

Another note: a set of char is quite incompatible with Unicode/UTF-16. This should be taken into account with *every* introduction of a Unicode string type.

DoDi
Re: [fpc-devel] Unicode support (yet again)
Graeme Geldenhuys wrote:
> If FPC has true unicode support, then all functions should work correct with just the UnicodeString type. That type's encoding is based on the native encoding of each platform. NO performance hit required.

Can you specify *which* strings ever *require* platform specific encoding? Beyond filenames and environment strings?

UI strings (GUI, console) are more tightly bound to user code and component/widgetset libraries than to a platform API (see my other comment).

DoDi
Re: [fpc-devel] Unicode support (yet again)
Michael Schnell wrote:
> On 09/14/2011 11:05 AM, Marco van de Voort wrote:
>> First and foremost: backwards compat dropping is not going to happen.
>
> It already has and supposedly can't be avoided.
>
> Take a look at what Lazarus was forced to make out of the identity of ANSIString and UTF8String, seemingly forced by FPC. e.g.: Old programs that assume local ANSI 8 bit code in strings retrieved from LCL GUI components don't work when compiled with the new version (e.g. if doing myChar := myString[3]; )

How many bytes must a char have, when it shall allow storing any (logical) character?

Unicode users have no use for a char type; instead they have to use substrings for every logical character. A Unicode BMP user could be happy with a 2-byte char, of course, at his own (low) risk.

DoDi
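To make the "(low) risk" of a 2-byte char concrete: a code point above U+FFFF does not fit into one UTF-16 code unit and is stored as a surrogate pair, so S[i] on a UTF-16 string can return half a character. The sketch below only shows the standard UTF-16 arithmetic; ToSurrogates is a hypothetical name chosen for this illustration, not an RTL routine.

```pascal
program SurrogateDemo;
{$mode objfpc}{$H+}

// Hypothetical helper: splits a code point in the range U+10000..U+10FFFF
// into its UTF-16 high and low surrogate code units.
procedure ToSurrogates(CP: Cardinal; out HighUnit, LowUnit: Word);
begin
  CP := CP - $10000;                        // 20 significant bits remain
  HighUnit := Word($D800 + (CP shr 10));    // upper 10 bits
  LowUnit  := Word($DC00 + (CP and $3FF));  // lower 10 bits
end;

var
  HighUnit, LowUnit: Word;
begin
  // U+1D11E (MUSICAL SYMBOL G CLEF) lies outside the BMP.
  ToSurrogates($1D11E, HighUnit, LowUnit);
  WriteLn(HexStr(LongInt(HighUnit), 4), ' ', HexStr(LongInt(LowUnit), 4));  // D834 DD1E
end.
```

Code that stays inside the BMP never produces values in the $D800..$DFFF range, which is why a 2-byte char works for it.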
Re: [fpc-devel] Unicode support (yet again)
Graeme Geldenhuys wrote:
> As for the text-to-braille functionality, that is outside the scope of FPC and the RTL. But common sense should prevail: use RTL string functions to implement your conversion - don't assume 1 byte = 1 character. A unicode aware string iterator could be implemented to help you step through the characters one at a time. Such a string iterator could even become part of the RTL, as it will probably be used often for many parsers.

How many users will have to deal with chars outside the Unicode BMP?

IMO UTF-16 can make 99% of the (current) users happy with simple string handling, while the rest would prefer UTF-32 instead of UTF-8, because outside the BMP UTF-8 is a waste of space, and lacks indexed char access in any case.

DoDi
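For reference when weighing the space argument, here is a small sketch that prints how many bytes one code point occupies in each encoding form. The function names are hypothetical; the size boundaries are the ones defined by the UTF-8 and UTF-16 encoding forms.

```pascal
program EncodedSizeDemo;
{$mode objfpc}{$H+}

// Bytes needed to store one code point in UTF-8.
function Utf8Bytes(CP: Cardinal): Integer;
begin
  if CP < $80 then Result := 1
  else if CP < $800 then Result := 2
  else if CP < $10000 then Result := 3
  else Result := 4;
end;

// Bytes needed in UTF-16: one 16-bit unit inside the BMP, two outside.
function Utf16Bytes(CP: Cardinal): Integer;
begin
  if CP < $10000 then Result := 2 else Result := 4;
end;

const
  // 'A', a-umlaut, euro sign, and one code point outside the BMP.
  Samples: array[0..3] of Cardinal = ($41, $E4, $20AC, $1D11E);
var
  I: Integer;
begin
  for I := Low(Samples) to High(Samples) do
    WriteLn('U+', HexStr(LongInt(Samples[I]), 5),
            '  UTF-8: ', Utf8Bytes(Samples[I]),
            '  UTF-16: ', Utf16Bytes(Samples[I]),
            '  UTF-32: 4');
end.
```

Going by these numbers, all three forms need 4 bytes per code point outside the BMP; the case where UTF-8 takes more space than UTF-16 is U+0800..U+FFFF (3 bytes against 2), while ASCII text is half the size of UTF-16.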
Re: [fpc-devel] bounty: FPC based debugger
Sven Barth wrote:
> But the debugger can ask the OS to write protect a page or to enable a page guard (which triggers on write access), and then the corresponding signal/exception can be caught. This reduces the checks necessary from the complete process memory down to only the page size.
>
> Note: I don't know whether it's implemented like that in any debugger, this is just a theory of mine.

Every (reasonable) OS provides such features in its debug API. Available support depends on the actual hardware, of course.

DoDi
Re: [fpc-devel] bounty: FPC based debugger
On Wed, Sep 14, 2011 at 6:48 AM, Michael Schnell mschn...@lumino.de wrote:
> IMHO, if we really can work with the gdb team on feeding the necessary Object-Pascal specific add-ons into gdb, creating a new debugger from scratch does not make any sense at all.

That's true. The only thing that concerns me about that is that there's not really one standard GDB (I could be wrong). I've seen a lot of issues in the Lazarus gdb support because of the different builds of GDB in use.

Also, IIRC, Apple forked gdb (as well as other GNU tools) to make it usable for their own needs (iDevice debugging support). I'm not sure whether the latest gdb changes are in the Apple gdb; I would assume they are.

thanks,
Dmitry
Re: [fpc-devel] bounty: FPC based debugger
On 14.09.2011 14:53, Michael Schnell wrote:
> On 09/14/2011 01:58 PM, Sven Barth wrote:
>> But the debugger can ask the OS to write protect a page or to enable a page guard (which triggers on write access), and then the corresponding signal/exception can be caught. This reduces the checks necessary from the complete process memory down to only the page size.
>
> Do you think this is possible without rewriting the OS (for all supported OSes)?

At least Windows allows the use of page guards... I don't know about Linux though.

Regards,
Sven

@Michael: Sorry, this mail wasn't meant to be private, but "Reply to list" put your mail address into the To field instead of the list's address.
Re: [fpc-devel] Unicode support (yet again)
On 14/9/2011 06:41, Felipe Monteiro de Carvalho wrote:
> On Wed, Sep 14, 2011 at 11:32 AM, Luiz Americo Pereira Camara luiz...@oi.com.br wrote:
>> Because if someone for some reason, like porting Delphi code, stays with a UTF16 string, under windows, when using RTL functions TWO conversions will be made:
>>
>> User Code (UTF16) -> RTL (UTF8) -> WINAPI (UTF16)
>
> This would not happen because I proposed to have 2 versions of the routines in the RTL. Not 1 UTF-8 version. There would be both UTF-8 and UTF-16 versions and one would naturally use the one which matches his preferred encoding ... and the RTL would only convert the non-native version

OK. The drawback is an increase in the file size of executables (which are already big).

Luiz
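A minimal sketch of the "two versions of each routine" idea, assuming it is done with overloads. OpenByName is a hypothetical routine invented for this illustration, not an actual RTL declaration; a real implementation would call the native API directly in one overload and convert in the other.

```pascal
program OverloadDemo;
{$mode objfpc}{$H+}

// Hypothetical routine, not an RTL declaration. On a UTF-8 platform this
// version would pass the name straight to the OS.
procedure OpenByName(const FileName: UTF8String); overload;
begin
  WriteLn('UTF-8 overload, ', Length(FileName), ' code units');
end;

// On a UTF-16 platform this version would be the one that needs no
// conversion before calling the native API.
procedure OpenByName(const FileName: UnicodeString); overload;
begin
  WriteLn('UTF-16 overload, ', Length(FileName), ' code units');
end;

var
  A: UTF8String;
  W: UnicodeString;
begin
  A := 'data.txt';
  W := 'data.txt';
  OpenByName(A);  // resolved by the static type of A
  OpenByName(W);  // resolved by the static type of W
end.
```

Both bodies end up in the compiled unit, which is the executable size cost mentioned above; only the version that does not match the platform's native encoding would have to convert.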
Re: [fpc-devel] Unicode support (yet again)
On 14/9/2011 06:48, Graeme Geldenhuys wrote:
> On 14/09/2011 11:19, Luiz Americo Pereira Camara wrote:
>> This is not desirable simply because at each platform (windows / unix) the user code of the same program will have a different encoding increasing the possibility of subtle errors.
>
> Why? Not every program is a text manipulation program or text parser. Most programs simply assign one string to another. eg:
>
>   Button1.Caption := 'Click me';
>   lMyString := Button1.Caption;

Given that Button1.Caption will be different under windows and unix, even if the compiler provides automatic conversion, at least some changes will be required to the default classes that handle things like (de)serialization etc., and the places where these methods are used must be checked.

Moreover, having different encodings on different platforms will give no gain to libraries like LCL/Lazarus, as stated by DoDi.

All in all my proposition is similar to yours. The only difference is that I suggest it be used by default only in the RTL, but nothing stops users like you from using it in a broader scope. The other difference is the name, which I don't care about (it can be xString, MultiString, FPCString). I just think that using UnicodeString for a variable per-platform encoding will lose Delphi compatibility for no good reason, and more: there will be floods of bug reports asking why Delphi code does not work this way and asking for a fix.

Luiz
Re: [fpc-devel] Unicode support (yet again)
On Wednesday 14 September 2011 17:02:14 Hans-Peter Diettrich wrote:
> Felipe Monteiro de Carvalho wrote:
>> On Tue, Sep 13, 2011 at 9:23 PM, Michael Van Canneyt mich...@freepascal.org wrote:
>>> One with unicode string, one with ansistring. They will have the same code, but will be compiled twice, each time with a different compiler define to decide which version it must be.
>>
>> Is this possible in UNIX? I can see that in Windows you can use the trick to use W versions which are identical except for the string type and drop Windows 9x support, but is this really possible for the UNIX syscalls? They expect UTF-8 not UTF-16 which is what UnicodeString uses.
>
> A few topics:
> [...]

Agreed. And so it is made in MSEgui:

 - On the user side all string handling uses type msestring, which is defined as the existing Free Pascal 16bit UnicodeString.
 - The MSEgui widgetset works with UnicodeString too.
 - For file and directory access MSEgui has a set of functions which convert from/to system encoding to/from type filenamety, which is defined as the existing Free Pascal 16bit UnicodeString.
 - MSEgui has its own 16bit functions and classes for lists, maps, sorting and the like.
 - Text files are stored in utf-8 by default.

From my point of view there is no need for a complicated encoding aware unicode string type which possibly is slower, needs more memory and introduces new bugs.

Martin