Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Daniël Mantione
On Tue, 11 Nov 2008, Luiz Americo Pereira Camara wrote: Jonas Maebe wrote: If people want to rely on what they are used to in non-unicode environments, then they cannot directly use unicode strings. They'll first have to assign it or typecast it to a non-unicode string and then operate

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Luiz Americo Pereira Camara
Jonas Maebe wrote: If people want to rely on what they are used to in non-unicode environments, then they cannot directly use unicode strings. They'll first have to assign it or typecast it to a non-unicode string and then operate on that string. At least if there's any data loss in that ca

Re: [fpc-devel] Re: Unicode support (again)

2008-11-11 Thread Marco van de Voort
In our previous episode, Jonas Maebe said: > > > So could somebody from the core FPC team summarize or give some > > roadmap as to what is happening or planned for FPC + Unicode support? > > If anyone can, it's Florian, since he has done all the work in this > area until now. There is no roadmap

Re: [fpc-devel] Re: Unicode support (again)

2008-11-11 Thread Jonas Maebe
On 11 Nov 2008, at 19:07, Graeme Geldenhuys wrote: So could somebody from the core FPC team summarize or give some roadmap as to what is happening or planned for FPC + Unicode support? If anyone can, it's Florian, since he has done all the work in this area until now. There is no roadmap docu

[fpc-devel] Re: Unicode support (again)

2008-11-11 Thread Graeme Geldenhuys
So could somebody from the core FPC team summarize or give some roadmap as to what is happening or planned for FPC + Unicode support? These Unicode discussions seem to go round and round and never seem to reach a conclusion. :-( So some or other roadmap or feature list for FPC on this matter wou

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Mattias Gaertner
On Tue, 11 Nov 2008 17:09:37 +0100 Michael Schnell <[EMAIL PROTECTED]> wrote: > > > AFAIK no one measured a noticeable speed difference between UTF8/16 > > when handling GUI. > > > So I don't understand why the LCL designers for the unicode upgrade > decided to use an UTF8 API instead of a Wi

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Graeme Geldenhuys
On Tue, Nov 11, 2008 at 7:10 PM, Martin Schreiber <[EMAIL PROTECTED]> wrote: >> > ??? > The last widestring manager bug I remember was in January 2007 in FPC 2.0.5. > I can't remember the exact details, but I read it a few months back on the MSEgui newsgroup. I remember the bug was widestring rela

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Martin Schreiber
On Tuesday 11 November 2008 17.34:36 Graeme Geldenhuys wrote: > > All I do know is that only recently did the WideString manager become > usable in FPC. Martin had until recently some issues with bugs in the > WideString manager. > ??? The last widestring manager bug I remember was in January 2007

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Jonas Maebe
On 11 Nov 2008, at 17:34, Graeme Geldenhuys wrote: All I do know is that only recently did the WideString manager become usable in FPC. Martin had until recently some issues with bugs in the WideString manager. No functional changes have been made to the unix widestring manager since 2.2.2

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Florian Klaempfl
Graeme Geldenhuys wrote: > On Tue, Nov 11, 2008 at 6:21 PM, Florian Klaempfl > <[EMAIL PROTECTED]> wrote: >>> Some conversions are correct or seem to be correct in that case. >> It has been already pointed out several times that lazarus abuses the >> ansistring type to store utf-8 and this breaks s

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Graeme Geldenhuys
On Tue, Nov 11, 2008 at 6:21 PM, Florian Klaempfl <[EMAIL PROTECTED]> wrote: > >> Some conversions are correct or seem to be correct in that case. > > It has been already pointed out several times that lazarus abuses the > ansistring type to store utf-8 and this breaks several things. I must have mis

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Graeme Geldenhuys
On Tue, Nov 11, 2008 at 6:09 PM, Michael Schnell <[EMAIL PROTECTED]> wrote: > > So I don't understand why the LCL designers for the unicode upgrade decided > to use an UTF8 API instead of a WideString API (like MSEGUI does seemingly > successfully). I can't speak for the Lazarus team, but I can sp

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Michael Schnell
It has been already pointed out several times that lazarus abuses the ansistring type to store utf-8 and this breaks several things. Of course we do know this. But as the compiler does not tell ANSIString from UTF8String anyway (to do automatic conversions), what exactly does this mean? -M

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Florian Klaempfl
Michael Schnell wrote: > >> I set no special options in FPC > Lazarus does. >> and I don't use WideString at all. >> UTF-8 fits perfectly in the standard String type. >> > Some conversions are correct or seem to be correct in that case. It has been already pointed out several times that laz
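The kind of breakage described above can be illustrated outside Pascal. The following is only a sketch in Python, not FPC's actual conversion code, and it assumes the "ANSI" code page is Windows-1252 (the thread does not name one). It shows what happens when UTF-8 bytes held in a byte string are converted as if they were ANSI text.

```python
# Sketch only: Python stands in for an ansistring -> widestring conversion.
# Assumption (not from the thread): the "ANSI" code page is Windows-1252.
text = "äöü"
utf8_bytes = text.encode("utf-8")   # what a UTF-8 string stored in an ansistring would hold

# A conversion that believes the bytes are ANSI decodes them with the wrong table.
misread = utf8_bytes.decode("cp1252")

# Decoding with the encoding that was actually used restores the text.
correct = utf8_bytes.decode("utf-8")

print(len(utf8_bytes), repr(misread), repr(correct))
```

Each of the three letters occupies two bytes in UTF-8, so the wrong decoding yields six characters instead of three.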

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Michael Schnell
I set no special options in FPC Lazarus does. and I don't use WideString at all. UTF-8 fits perfectly in the standard String type. Some conversions are correct or seem to be correct in that case. -Michael ___ fpc-devel maillist - fpc-devel@lis

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Graeme Geldenhuys
On Tue, Nov 11, 2008 at 6:05 PM, Michael Schnell <[EMAIL PROTECTED]> wrote: > > So I suppose it does not set the FPC option to use UTF8String instead of > WideString for non-ASCII string constants. MSEGUI works here, too, because > of this. I set no special options in FPC and I don't use WideStrin

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Michael Schnell
AFAIK no one measured a noticeable speed difference between UTF8/16 when handling GUI. So I don't understand why the LCL designers for the unicode upgrade decided to use an UTF8 API instead of a WideString API (like MSEGUI does seemingly successfully). -Michael

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Michael Schnell
Graeme Geldenhuys wrote: On Tue, Nov 11, 2008 at 4:29 PM, Michael Schnell <[EMAIL PROTECTED]> wrote: With Lazarus even: I don't know about Lazarus, but in fpGUI Toolkit the following works just fine. So I suppose it does not set the FPC option to use UTF8String instead of WideStri

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Graeme Geldenhuys
On Tue, Nov 11, 2008 at 5:00 PM, Mattias Gaertner <[EMAIL PROTECTED]> wrote: >> OTOH, regarding an improved Lazarus, IMHO he should be enabled to >> choose or compile an LCL version with a UTF16 or UCS2 WideStrings >> API, for improved speed with GUI handling. > > AFAIK no one measured a noticeable

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Graeme Geldenhuys
On Tue, Nov 11, 2008 at 4:29 PM, Michael Schnell <[EMAIL PROTECTED]> wrote: > > With Lazarus even: I don't know about Lazarus, but in fpGUI Toolkit the following works just fine. var s1: string; s2: TfpgString; { simply an alias to string } begin s1 := 'äüö'; s2 := 'äüö'; Button.Text :=

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Graeme Geldenhuys
2008/11/11 Michael Schnell <[EMAIL PROTECTED]>: > >> a) "ü": "LATIN SMALL LETTER U WITH DIAERESIS", encoded as $C3 $BC >> b) "ü": "LATIN SMALL LETTER U", encoded as $75, followed by "COMBINING >> DIAERESIS", which is encoded as $CC $88 > > I see, but I fail to see the sense of providing two differe

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Mattias Gaertner
On Tue, 11 Nov 2008 15:43:24 +0100 Michael Schnell <[EMAIL PROTECTED]> wrote: > > > We were talking of a world where strings consist of widechars, not > > about the current Lazarus, weren't we? > I'm not sure. Of course WideStrings and WideChars are easier to be > used, as in Europe and America

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Michael Schnell
Your example shows just how accurately Widestring should be interpreted: It only shows it is a two or more byte sequence to represent a single character. It doesn't say anything about the content or the use of a specific (meta) encoding like UCS2 or Unicode16. Of course I did mean that in

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Vincent Snijders
Jonas Maebe wrote: On 11 Nov 2008, at 15:26, Vincent Snijders wrote: Jonas Maebe wrote: It seems much more advisable to me to save the file with an UTF-8 BOM, or even better to add {$encoding utf-8} (and/or to pass -Fcutf-8 to the compiler) and then just use Edit1.Caption := UTF8Encode(

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Michael Schnell
We were talking of a world where strings consist of widechars, not about the current Lazarus, weren't we? I'm not sure. Of course WideStrings and WideChars are easier to be used, as in Europe and America problems with surrogate pairs will seldom arise, but IMHO the user should be enabled to de

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Jonas Maebe
On 11 Nov 2008, at 15:26, Vincent Snijders wrote: Jonas Maebe wrote: It seems much more advisable to me to save the file with an UTF-8 BOM, or even better to add {$encoding utf-8} (and/or to pass -Fcutf-8 to the compiler) and then just use Edit1.Caption := UTF8Encode('hallo äöü'); As a

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Thaddy
Your example shows just how accurately Widestring should be interpreted: It only shows it is a two or more byte sequence to represent a single character. It doesn't say anything about the content or the use of a specific (meta) encoding like UCS2 or Unicode16. If the discussion is about tho

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Michael Schnell
That is because FPC has no unicode string type yet (as must have been repeated about 20 times by now). We are not discussing what it has, but what it should have and how this can be done in a way that provides decent performance on all platforms, is easy to use, compatible with D2009 and optim

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Vincent Snijders
Jonas Maebe wrote: On 10 Nov 2008, at 17:00, Vincent Snijders wrote: procedure TForm1.Button1Click(Sender: TObject); var w: widestring; i: integer; begin w := UTF8Decode('hallo äöü'); Edit1.Caption := UTF8Encode(w); Note that if the file has been saved using an UTF-8 BOM, then the com

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Daniël Mantione
On Tue, 11 Nov 2008, Michael Schnell wrote: IMO widestrings with precomposed characters, just like ansistrings, can fulfill the needs of a newcomer. That there exist decomposed characters, surrogates, and more, does not need to be explained in chapter 1 of a programming for beginners

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Michael Schnell
Also, remember unicode is/are a computerlanguage specific specification(s): you may assume that a lot of thought has gone into it to be able to use it with programming languages. That was the design goal. The specification is, alas, rather complex but it contains every bit of information to b

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Jonas Maebe
On 11 Nov 2008, at 15:20, Michael Schnell wrote: IMO widestrings with precomposed characters, just like ansistrings, can fulfill the needs of a newcomer. That there exist decomposed characters, surrogates, and more, does not need to be explained in chapter 1 of a programming for beginner

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Michael Schnell
IMO widestrings with precomposed characters, just like ansistrings, can fulfill the needs of a newcomer. That there exist decomposed characters, surrogates, and more, does not need to be explained in chapter 1 of a programming for beginners book. Yep, but with s: WideString the example doe

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Michael Schnell
From your writing I understood that the issue is a UTF8 -> 21-bit-unicode decoding issue and has nothing to do with ISO/ANSI (which would render the problem thoroughly unsolvable, not only for the compiler builder but also for the application programmer, who wants to do a unicode aware program.

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Thaddy
In response and to support Daniël: Also, remember unicode is/are a computerlanguage specific specification(s): you may assume that a lot of thought has gone into it to be able to use it with programming languages. That was the design goal. The specification is, alas, rather complex but it cont

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Daniël Mantione
On Tue, 11 Nov 2008, Michael Schnell wrote: Remember that an individual code point does not necessarily represent what a user would consider a character. ... Again, there is no compatible handling of this with good old ANSIStrings, anyway, so there is no "friendly old school" way that a

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Jonas Maebe
On 11 Nov 2008, at 13:56, Michael Schnell wrote: If this really is two codes for the same unicode character, the "friendly old school" handling function should normalize it. If someone really needs to take the differences into account (like with the case you described), he ought to do the

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Michael Schnell
OK, If this really is two codes for the same unicode character, the "friendly old school" handling function should normalize it. If someone really needs to take the differences into account (like with the case you described), he ought to do the appropriate code (handling subcodes). -Michael
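The idea of normalizing before comparing can be sketched as follows. This is an illustration in Python rather than a proposal for an FPC routine; the helper name same_text is hypothetical, and NFC is chosen here only as one of the standard Unicode normalization forms.

```python
import unicodedata

def same_text(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC (canonical composition).

    Hypothetical helper for illustration; not an FPC or LCL routine.
    """
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

precomposed = "\u00fc"   # LATIN SMALL LETTER U WITH DIAERESIS
decomposed = "u\u0308"   # LATIN SMALL LETTER U + COMBINING DIAERESIS

print(precomposed == decomposed)            # plain code point comparison
print(same_text(precomposed, decomposed))   # comparison after normalization
```

Code that needs to keep the two forms apart, such as the file name case raised elsewhere in the thread, would compare without normalizing.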

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Jonas Maebe
On 11 Nov 2008, at 13:39, Michael Schnell wrote: a) "ü": "LATIN SMALL LETTER U WITH DIAERESIS", encoded as $C3 $BC b) "ü": "LATIN SMALL LETTER U", encoded as $75, followed by "COMBINING DIAERESIS", which is encoded as $CC $88 I see, but I fail to see the sense of providing two different UTF8
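The two byte sequences quoted above can be reproduced with a short check. This is a sketch in Python, used only as a neutral way to inspect the encodings; the helper name hex_bytes is made up for the example.

```python
import unicodedata

def hex_bytes(s: str) -> str:
    """Render the UTF-8 encoding of s in the $XX notation used in the thread."""
    return " ".join(f"${b:02X}" for b in s.encode("utf-8"))

precomposed = "\u00fc"   # one code point
decomposed = "u\u0308"   # two code points that render as the same glyph

for label, s in (("a", precomposed), ("b", decomposed)):
    names = " + ".join(unicodedata.name(ch) for ch in s)
    print(f"{label}) {names}: {hex_bytes(s)}")
```

Both strings display as "ü", but they differ in code point count (1 versus 2) and in UTF-8 length (2 bytes versus 3).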

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Michael Schnell
a) "ü": "LATIN SMALL LETTER U WITH DIAERESIS", encoded as $C3 $BC b) "ü": "LATIN SMALL LETTER U", encoded as $75, followed by "COMBINING DIAERESIS", which is encoded as $CC $88 I see, but I fail to see the sense of providing two different UTF8 code variants for the same unicode character. -M

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Michael Schnell
Remember that an individual code point does not necessarily represent what a user would consider a character. ... Again, there is no compatible handling of this with good old ANSIStrings, anyway, so there is no "friendly old school" way that a compiler would be able to offer. In these specia

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Jonas Maebe
On 11 Nov 2008, at 13:15, Michael Schnell wrote: OTOH, in this special case, I don't see why the compiler should "normalize" "u¨" to "ü". If the software is supposed to be handling unicode, the unicode string "u¨" should be considered a perfectly legal two-code-point information consisting

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Michael Schnell
Because e.g. on the ext3 file system, you can have two files with the name "ü" in the same directory. One named using the single character "ü" and one named using the string "u¨" (both in utf-8). If you make the compiler automatically normalise everything, you lose information (and get the

Re: [fpc-devel] Rebuild FPC and RTL with custom switches

2008-11-11 Thread Fabio Dell'Aria
Yes, exactly! ;) Thank you for your support! :) On Tue, Nov 11, 2008 at 12:55 PM, Jonas Maebe <[EMAIL PROTECTED]> wrote: > > On 11 Nov 2008, at 12:50, Fabio Dell'Aria wrote: > >> My last error is: >> >> C:/Programmi/Lazarus/fpc/2.2.2/bin/i386-win32/ppc386.exe -Ur -Xs -O2 >> -n -Fi../inc -Fi../i38

Re: [fpc-devel] Rebuild FPC and RTL with custom switches

2008-11-11 Thread Fabio Dell'Aria
OK, I found the error (wrong switches). ;) On Tue, Nov 11, 2008 at 12:43 PM, Jonas Maebe <[EMAIL PROTECTED]> wrote: > > On 11 Nov 2008, at 12:34, Fabio Dell'Aria wrote: > >> Hi Jonas, >> >> On Tue, Nov 11, 2008 at 12:15 PM, Jonas Maebe <[EMAIL PROTECTED]> >> wrote: >>> >>> Execute the following in

Re: [fpc-devel] Rebuild FPC and RTL with custom switches

2008-11-11 Thread Jonas Maebe
On 11 Nov 2008, at 12:50, Fabio Dell'Aria wrote: My last error is: C:/Programmi/Lazarus/fpc/2.2.2/bin/i386-win32/ppc386.exe -Ur -Xs -O2 -n -Fi../inc -Fi../i386 -Fi../win -FE. -FUC:/fpcbuild-2.2.2/fpcsrcrtl/units/i386-win32 -CX -XX -U3 -Ur -di386 -dRELEASE -Us -Sg system.pp -Fi../win Error: Ill

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread peter green
Michael Schnell wrote: It will at best be "friendly old school behaviour which works most of the time, but which fails as soon as the strings are not completely normalised because then you can have decomposed characters and whatnot" (which in turn easily leads to security holes due to inco

Re: [fpc-devel] Rebuild FPC and RTL with custom switches

2008-11-11 Thread Fabio Dell'Aria
Hi, On Tue, Nov 11, 2008 at 12:43 PM, Jonas Maebe <[EMAIL PROTECTED]> wrote: > > On 11 Nov 2008, at 12:34, Fabio Dell'Aria wrote: > >> Hi Jonas, >> >> On Tue, Nov 11, 2008 at 12:15 PM, Jonas Maebe <[EMAIL PROTECTED]> >> wrote: >>> >>> Execute the following in the top FPC source directory: >>> >>>

Re: [fpc-devel] Rebuild FPC and RTL with custom switches

2008-11-11 Thread Jonas Maebe
On 11 Nov 2008, at 12:34, Fabio Dell'Aria wrote: Hi Jonas, On Tue, Nov 11, 2008 at 12:15 PM, Jonas Maebe <[EMAIL PROTECTED] > wrote: Execute the following in the top FPC source directory: make clean all OPT="-CX -XX -U3 -Ur" After some time I receive the following error message: make.

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Jonas Maebe
On 11 Nov 2008, at 12:33, Michael Schnell wrote: It will at best be "friendly old school behaviour which works most of the time, but which fails as soon as the strings are not completely normalised because then you can have decomposed characters and whatnot" (which in turn easily leads to

Re: [fpc-devel] Rebuild FPC and RTL with custom switches

2008-11-11 Thread Fabio Dell'Aria
Hi Jonas, On Tue, Nov 11, 2008 at 12:15 PM, Jonas Maebe <[EMAIL PROTECTED]> wrote: > > On 11 Nov 2008, at 11:03, Fabio Dell'Aria wrote: > >> how can I rebuild FPC and the RTL with custom switches? >> >> I want to use -CX -XX -U3 -Ur > > Execute the following in the top FPC source directory: > > m

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Michael Schnell
It will at best be "friendly old school behaviour which works most of the time, but which fails as soon as the strings are not completely normalised because then you can have decomposed characters and whatnot" (which in turn easily leads to security holes due to incomplete checks, hard to r

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Michael Schnell
However, the "platform" part in it, depends on the string type used that all libraries have been compiled with. I.e. regardless of your setting, "assign" would accept an ansistring or unicodestring depending on the platform, and this will be mostly dependent on whether the platform has an actua

Re: [fpc-devel] Rebuild FPC and RTL with custom switches

2008-11-11 Thread Jonas Maebe
On 11 Nov 2008, at 11:03, Fabio Dell'Aria wrote: how can I rebuild FPC and the RTL with custom switches? I want to use -CX -XX -U3 -Ur Execute the following in the top FPC source directory: make clean all OPT="-CX -XX -U3 -Ur" Jonas

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Jonas Maebe
On 11 Nov 2008, at 10:48, Michael Schnell wrote: Moreover, IMHO, it should be configurable if in "D2009 compatible mode", with s[i], length(s), pos(), copy, delete(), ..., Strings are counted in subcodes (fast behavior) or in "whatever mode" they are counted in characters ("friendly old sc

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Daniël Mantione
On Tue, 11 Nov 2008, Michael Schnell wrote: There will be full compatibility with old code. It is quite likely FPC will have a Win32 platform where string=ansistring and a WinNT platform where string=unicodestring. Other platforms will be decided on a case by case basis, i.e. there is lit

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Michael Schnell
There will be full compatibility with old code. It is quite likely FPC will have a Win32 platform where string=ansistring and a WinNT platform where string=unicodestring. Other platforms will be decided on a case by case basis, i.e. there is little point in having string=unicodestring on Dos.

Re: [fpc-devel] Zero terminated strings

2008-11-11 Thread Michael Schnell
Are strings not zero terminated? They are not: a #0 is perfectly allowable character in a (long) string. That is why you can use strings for storing any kind of byte stream. But they are: a #0 is automatically added at s[length(s)+1]; But accessing the terminating via string functions is erro

[fpc-devel] Rebuild FPC and RTL with custom switches

2008-11-11 Thread Fabio Dell'Aria
Hi to all, how can I rebuild FPC and the RTL with custom switches? I want to use -CX -XX -U3 -Ur -- Best regards... Fabio Dell'Aria. ___ fpc-devel maillist - fpc-devel@lists.freepascal.org http://lists.freepascal.org/mailman/listinfo/fpc-devel

Re: [fpc-devel] Re: Unicode support (again)

2008-11-11 Thread Michael Schnell
We can implement a D2009 like solution and break a lot of old code :) My impression always was the FPC is supposed to be better than Delphi :) :) :) . -Michael

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Michael Schnell
Having unicode support in any way is not free. You've to rewrite your code somehow. Right, but this should only be necessary with the code that explicitly is intended to benefit from unicode features. "Old school" code - using String (= ANSIString in locale-dependent coding) - should just work.

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Jonas Maebe
On 11 Nov 2008, at 09:30, Michael Schnell wrote: Edit1.Caption := UTF8Encode('hallo äöü'); Grrr, how ugly ! No "old school" Delphi user will understand/accept that you can't just do "Edit1.Caption := 'hallo äöü';" You are mixing two things here: a) you said that "Seemingly if [FPC] de

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Daniël Mantione
On Tue, 11 Nov 2008, Michael Schnell wrote: Surely this is allowed and works correctly under D2009, otherwise I really misunderstood Unicode support in D2009. In D2009, "String" is WideString, and the VCL API is done with this (Wide)String. So this of course works. With Lazarus things ar

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Michael Schnell
Of course Lazarus' LCL would need to be recompiled according to the way the user wants to handle "String", as calling for conversion with any LCL in and out transfer is not a good idea. Maybe the LCL could define a type "LCLString" that can be set when compiling it. Internally I suppose t

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Michael Schnell
Object Pascal is a beautiful language because it does type handling for you - like your example of '+' for Integer or String types. In the same way I would hope that FPC can handle the String type seamlessly for UTF-16 or UTF-8 - whichever encoding the FPC developers decide String type should b

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Michael Schnell
Surely this is allowed and works correctly under D2009, otherwise I really misunderstood Unicode support in D2009. In D2009, "String" is WideString, and the VCL API is done with this (Wide)String. So this of course works. With Lazarus things are more complex, as they need to support a lot o

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Florian Klaempfl
Michael Schnell wrote: > >> Lazarus has a set of utf-8 ready routines, using utf-8 inside of an >> ansistring. >> > I see. But it's really ugly that you need to use those instead of just > writing clean old school code and have the compiler care for the nasty > details. Having unicode support

Re: [fpc-devel] Re: Unicode support (again)

2008-11-11 Thread Florian Klaempfl
Michael Schnell wrote: > >> OK, so here goes again yet another discussion... :-) >> > No wonder, as the current state is working, but rather disappointing :). > (No idea if D2009 is different / better: this seems to be the cause of > the new thread.) We can implement a D2009 like solution an

Re: [fpc-devel] Re: Unicode support (again)

2008-11-11 Thread Michael Schnell
Yes. In D2009 String is UTF16String and Char is WideChar, sizeof(Char)=2. I personally do not like this solution. Same here, IMHO FPC could do this in a "Delphi 2009 string compatibility" mode. But it should be configurable to use other ways (e.g. (a) String = WideString but have it c

Re: [fpc-devel] Re: Unicode support (again)

2008-11-11 Thread Michael Schnell
See (including comments) http://www.jacobthurman.com/?p=30 IMHO it's a bad decision to have the standard unicode string (be it WideString or UTF8String) functionality redefined to "Code units" (subcodes) instead of "code points" (characters). I feel it would have been better to have the old
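The distinction between code units and code points can be made concrete with a small count. This is a sketch in Python, not Delphi or FPC behaviour; the function names are invented for the example, and the G clef character is just one convenient code point outside the Basic Multilingual Plane.

```python
def utf16_code_units(s: str) -> int:
    """Number of 16-bit code units when s is encoded as UTF-16."""
    return len(s.encode("utf-16-le")) // 2

def code_points(s: str) -> int:
    """Number of Unicode code points in s (Python strings index by code point)."""
    return len(s)

bmp = "\u00fc"          # LATIN SMALL LETTER U WITH DIAERESIS, inside the BMP
astral = "\U0001D11E"   # MUSICAL SYMBOL G CLEF, outside the BMP

for s in (bmp, astral):
    print(code_points(s), utf16_code_units(s))
```

A character outside the BMP is one code point but takes a surrogate pair, that is two code units, in UTF-16, so a length counted in code units and a length counted in code points disagree for such text.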

Re: [fpc-devel] Re: Unicode support (again)

2008-11-11 Thread petr . kristan
On Tue, Nov 11, 2008 at 10:11:10AM +0100, Michael Schnell wrote: > >> See (including comments) http://www.jacobthurman.com/?p=30 > > So it seems that the Type "String" in D2009 in fact is "WideString" and > same _does_ use surrogate pairs. This asks for even more unexpected > behavior than with F

Re: [fpc-devel] Re: Unicode support (again)

2008-11-11 Thread Michael Schnell
See (including comments) http://www.jacobthurman.com/?p=30 So it seems that the Type "String" in D2009 in fact is "WideString" and same _does_ use surrogate pairs. This asks for even more unexpected behavior than with FPC, with String seemingly still being ANSIString ;). -Michael

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Graeme Geldenhuys
On Tue, Nov 11, 2008 at 10:27 AM, Michael Schnell <[EMAIL PROTECTED]> wrote: > needs to do something other than just "length()" to find out the count of > characters in a string. From memory, the Delphi and FPC documentation says that Length() returns the number of bytes, NOT the number of charac
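The difference between a byte count and a character count for UTF-8 data can be sketched as below. This is Python for illustration only and says nothing about how Length() is implemented in FPC; utf8_code_points is a hypothetical helper that counts every byte that is not a UTF-8 continuation byte.

```python
def utf8_code_points(data: bytes) -> int:
    """Count code points in valid UTF-8 by skipping continuation bytes (10xxxxxx)."""
    return sum(1 for b in data if (b & 0xC0) != 0x80)

s = "hallo äöü"
data = s.encode("utf-8")

print(len(data))               # size in bytes
print(utf8_code_points(data))  # size in code points
```

The sample string from the thread has 9 code points but takes 12 bytes in UTF-8, because each of the three umlauts needs two bytes.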

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Graeme Geldenhuys
2008/11/11 Michael Schnell <[EMAIL PROTECTED]>: > >> Edit1.Caption := UTF8Encode('hallo äöü'); > > Grrr, how ugly ! > > No "old school" Delphi user will understand/accept that you can't just do > "Edit1.Caption := 'hallo äöü';" I agree... When I think Unicode support I think the following shou

Re: [fpc-devel] Re: Unicode support (again)

2008-11-11 Thread Michael Schnell
OK, so here goes again yet another discussion... :-) No wonder, as the current state is working, but rather disappointing :). (No idea if D2009 is different / better: this seems to be the cause of the new thread.) -Michael

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Michael Schnell
Lazarus has a set of utf-8 ready routines, using utf-8 inside of an ansistring. I see. But it's really ugly that you need to use those instead of just writing clean old school code and have the compiler care for the nasty details. -Michael

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Michael Schnell
Edit1.Caption := UTF8Encode('hallo äöü'); Grrr, how ugly ! No "old school" Delphi user will understand/accept that you can't just do "Edit1.Caption := 'hallo äöü';" -Michael

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Michael Schnell
IMHO, this is working fine. (Thanks for pointing out that there in fact _are_ (slow) functions that can be manually called to do this with UTF8Strings.) I don't doubt that it is working fine, but the usual Pascal programmer does not "expect" that (s)he manually needs to call a special functi

Re: [fpc-devel] Unicode support (again)

2008-11-11 Thread Michael Schnell
Which option? I don't remember right now. It's the default option in Lazarus ;) (This has already been discussed in another thread.). Lazarus seems to need to set this option because the LCL API is strictly UTF8 (and the UTF8String [ =ANSIString] ) type used otherwise would not get correct