Re: RE : [fpc-devel] Unicode support (yet again)

2011-09-19 Thread Michael Schnell
On 09/18/2011 06:49 PM, DaWorm wrote: But isn't it O(n^2) only when actually using unicode strings? Allowing the compiler or library to decide _if_ this is a Unicode string would require either dedicated string types for each encoding or "New Strings" with programmable encoding. -Michael
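For readers following the O(n^2) argument, here is a minimal sketch of where the cost comes from, assuming the string is held as UTF-8 bytes in a plain AnsiString. The helper name BytePosOfCodePoint is made up for this illustration and is not an FPC RTL routine. Finding the N-th code point means scanning from the start, so calling it once per index inside a loop rescans the buffer every time.

```pascal
program IndexCost;
{$mode objfpc}{$H+}

// Hypothetical helper (not part of the RTL): byte position of the N-th
// code point (1-based) in a UTF-8 buffer, or 0 if there is no such code
// point. It has to scan from the first byte, so one call is O(n).
function BytePosOfCodePoint(const S: AnsiString; N: SizeInt): SizeInt;
var
  i, Count: SizeInt;
begin
  Result := 0;
  Count := 0;
  for i := 1 to Length(S) do
    // Bytes of the form 10xxxxxx are continuation bytes, not lead bytes.
    if (Ord(S[i]) and $C0) <> $80 then
    begin
      Inc(Count);
      if Count = N then
      begin
        Result := i;
        Exit;
      end;
    end;
end;

var
  S: AnsiString;
  k: SizeInt;
begin
  // "Grüße" written as explicit UTF-8 bytes (ü = C3 BC, ß = C3 9F),
  // so the encoding of this source file does not matter.
  S := 'Gr'#$C3#$BC#$C3#$9F'e';
  // The sample holds 5 code points. Each lookup restarts at byte 1,
  // so n lookups of O(n) each add up to O(n^2) for the whole loop.
  for k := 1 to 5 do
    WriteLn('code point ', k, ' starts at byte ', BytePosOfCodePoint(S, k));
end.
```

For this input the lead bytes sit at positions 1, 2, 3, 5 and 7. A single left-to-right pass that remembers its byte position visits the same code points in O(n) overall, which is why sequential iteration is usually suggested instead of indexing.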

Re: RE : [fpc-devel] Unicode support (yet again)

2011-09-19 Thread Michael Schnell
On 09/18/2011 05:52 PM, Marco van de Voort wrote: And of course, finally, there is the matter with Delphi compatibility. This can't even be discussed regarding Unicode programming as long as FPC does not have "new Strings". (AFAIK there even are or have been discussions about not doing new s

Re: RE : [fpc-devel] Unicode support (yet again)

2011-09-19 Thread Michael Schnell
On 09/19/2011 11:13 AM, Marco van de Voort wrote: No. IMHO the point has always been to find a sweet spot. Delphi is not Visual Basic. Delphi is native and fast. Isn't this nicely provided by "new Strings" ? If you are naive and just use them as you have been acquainted to at ANSI times, your

Re: RE : [fpc-devel] Unicode support (yet again)

2011-09-19 Thread Hans-Peter Diettrich
Flávio Etrusco schrieb: IMHO you are seeking the problems in the tools, while the problem is PEBKAC I partly agree it's PEBKAC, but why make it easy to get wrong when you can avoid it? Isn't that the point of Pascal? Many people think that Pascal is an educational (toy) language, and ineffi

Re: RE : [fpc-devel] Unicode support (yet again)

2011-09-19 Thread Marco van de Voort
In our previous episode, Flávio Etrusco said: > > IMHO you are seeking the problems in the tools, while the problem is PEBKAC > > I partly agree it's PEBKAC, but why make it easy to get wrong when you > can avoid it? The point is you can't. You only keep the illusion you can marginally longer at

Re: RE : [fpc-devel] Unicode support (yet again)

2011-09-19 Thread Jonas Maebe
On 19 Sep 2011, at 10:27, Flávio Etrusco wrote: > I partly agree it's PEBKAC, but why make it easy to get wrong when you > can avoid it? Isn't that the point of Pascal? Isn't that the point of > AnsiStrings? Isn't that the point of strong typed languages in > general? Yes, but supporting unicode

Re: RE : [fpc-devel] Unicode support (yet again)

2011-09-19 Thread Flávio Etrusco
On Mon, Sep 19, 2011 at 4:36 AM, Marco van de Voort wrote: > In our previous episode, Flávio Etrusco said: >> compatibility feature, and as such should care more about correctness >> and ease-of-use rather than performance. I thought the endless bugs >> WRT to char vs codepoint indexes, even in Ja

Re: RE : [fpc-devel] Unicode support (yet again)

2011-09-19 Thread Jonas Maebe
On 19 Sep 2011, at 09:36, Marco van de Voort wrote: > I don't like the Java/C# way that you have to manually allocate extra > objects (stringbuilders etc) to get(performant) access to the characters > though. In Java that's only the case for changing characters. Reading characters happens via

Re: RE : [fpc-devel] Unicode support (yet again)

2011-09-19 Thread Marco van de Voort
In our previous episode, Flávio Etrusco said: > compatibility feature, and as such should care more about correctness > and ease-of-use rather than performance. I thought the endless bugs > WRT to char vs codepoint indexes, even in Java-developed software, > would buy my argument... IMHO you are s

Re: RE : [fpc-devel] Unicode support (yet again)

2011-09-18 Thread Flávio Etrusco
On Sun, Sep 18, 2011 at 11:45 AM, Jonas Maebe wrote: > > On 18 Sep 2011, at 13:57, Flávio Etrusco wrote: > >> One obvious way to mitigate this would be to store the last >> CodePoint->Char in the string record, so that at least the most common >> case is covered. > > ... and so that the common cas

Re: [fpc-devel] Unicode support (yet again)

2011-09-18 Thread Hans-Peter Diettrich
Luiz Americo Pereira Camara schrieb: Thanks, but that's nothing new to me in general, and the RawByteString handling doesn't work as documented. procedure ShowCodePage(const S: RawByteString); begin Form1.Caption := IntToStr(StringCodePage(S)); end; Strange What value you get passing and

Re: [fpc-devel] Unicode support (yet again)

2011-09-18 Thread Luiz Americo Pereira Camara
On 18/9/2011 17:04, Hans-Peter Diettrich wrote: Luiz Americo Pereira Camara schrieb: Can you give me a link? I checked the XE documentation and RTL, and could not find that RawByteString can hold UTF-16, and my test confirms that: http://edn.embarcadero.com/article/38980 You may read also

Re: [fpc-devel] Unicode support (yet again)

2011-09-18 Thread Hans-Peter Diettrich
Luiz Americo Pereira Camara schrieb: Can you give me a link? I checked the XE documentation and RTL, and could not find that RawByteString can hold UTF-16, and my test confirms that: http://edn.embarcadero.com/article/38980 You may read also: http://www.micro-isv.asia/2008/08/using-rawbyt

Re: RE : [fpc-devel] Unicode support (yet again)

2011-09-18 Thread Hans-Peter Diettrich
DaWorm schrieb: On Sun, Sep 18, 2011 at 12:01 PM, Sven Barth wrote: On 18.09.2011 17:48, DaWorm wrote: But isn't it O(n^2) only when actually using unicode strings? All MBCS encodings, with no fixed character size, suffer from that problem. Wouldn't you also be able to do something like S

Re: RE : [fpc-devel] Unicode support (yet again)

2011-09-18 Thread cobines
2011/9/18 Marco van de Voort : >  The trouble is that it is not that easy, consider the first thing a > long time pascal user will do is fix his existing code which has many > constructs that loop over a string: > > setlength(s2,s1); > for i:=1 to length(s1) do >  s2[i]:=s1[i]; > > Now, to return c

Re: RE : [fpc-devel] Unicode support (yet again)

2011-09-18 Thread Marco van de Voort
In our previous episode, DaWorm said: > But isn't it O(n^2) only when actually using unicode strings? > Wouldn't you also be able to do something like String.Encoding := Ansi > and then all String[i] accesses would then be o(n) + x (where x is the > overhead of run time checking that it is safe to

Re: RE : [fpc-devel] Unicode support (yet again)

2011-09-18 Thread DaWorm
On Sun, Sep 18, 2011 at 12:01 PM, Sven Barth wrote: > On 18.09.2011 17:48, DaWorm wrote: But isn't it O(n^2) only when actually using unicode strings? Wouldn't you also be able to do something like String.Encoding := Ansi and then all String[i] accesses would then be o(n) + x (where x is the over

Re: RE : [fpc-devel] Unicode support (yet again)

2011-09-18 Thread Sven Barth
On 18.09.2011 17:48, DaWorm wrote: On Sep 18, 2011 5:50 AM, "Marco van de Voort" wrote: > > The trouble is that it is not that easy, consider the first thing a > long time pascal user will do is fix his existing code which has many > constructs that loop over a stri

Re: RE : [fpc-devel] Unicode support (yet again)

2011-09-18 Thread Marco van de Voort
In our previous episode, DaWorm said: > > > > So instead of O(n) this loop suddenly becomes O(n^2) > > Sure it does. So what? So much! > The point is, it will do what the user expects. No it doesn't. The user has no clue, and will just stumble on the next detail (like codepoints not being

Re: RE : [fpc-devel] Unicode support (yet again)

2011-09-18 Thread DaWorm
On Sep 18, 2011 5:50 AM, "Marco van de Voort" wrote: > > The trouble is that it is not that easy, consider the first thing a > long time pascal user will do is fix his existing code which has many > constructs that loop over a string: > > setlength(s2,s1); > for i:=1 to length(s1) do > s2[i]:=s1

Re: RE : [fpc-devel] Unicode support (yet again)

2011-09-18 Thread Jonas Maebe
On 18 Sep 2011, at 13:57, Flávio Etrusco wrote: > One obvious way to mitigate this would be to store the last > CodePoint->Char in the string record, so that at least the most common > case is covered. ... and so that the common case is broken in multithreaded environments. Directly indexing a

Re: [fpc-devel] Unicode support (yet again)

2011-09-18 Thread Luiz Americo Pereira Camara
On 18/9/2011 10:07, Hans-Peter Diettrich wrote: Luiz Americo Pereira Camara schrieb: On 17/9/2011 11:46, Hans-Peter Diettrich wrote: Luiz Americo Pereira Camara schrieb: The codepage of a RawByteString at runtime will keep the previous CodePage (65001 for UTF8, 1200 for UTF16) as opposed to c

Re: [fpc-devel] Unicode support (yet again)

2011-09-18 Thread Hans-Peter Diettrich
Luiz Americo Pereira Camara schrieb: On 17/9/2011 11:46, Hans-Peter Diettrich wrote: Luiz Americo Pereira Camara schrieb: The codepage of a RawByteString at runtime will keep the previous CodePage (65001 for UTF8, 1200 for UTF16) as opposed to change to the RawbyteString CodePage (65535) as a

Re: RE : [fpc-devel] Unicode support (yet again)

2011-09-18 Thread Flávio Etrusco
On Sun, Sep 18, 2011 at 6:50 AM, Marco van de Voort wrote: > In our previous episode, Flávio Etrusco said: >> >> That's somewhat what I was thinking. Actually something like >> >>   UnicodeString = object >> (...) > Such ability is not unique for an object. One can also do something like > that

Re: [fpc-devel] Unicode support (yet again)

2011-09-18 Thread Sven Barth
On 18.09.2011 13:20, Jonas Maebe wrote: On 18 Sep 2011, at 13:16, Graeme Geldenhuys wrote: And it boggles the mind why something so broken / incomplete was merged into Trunk in the first place? Yes, we suck from time to time (in this case: testsuite runs were performed, but the sync and mer

Re: [fpc-devel] Unicode support (yet again)

2011-09-18 Thread Jonas Maebe
On 18 Sep 2011, at 13:16, Graeme Geldenhuys wrote: > And it boggles the mind why something so broken / incomplete was > merged into Trunk in the first place? Yes, we suck from time to time (in this case: testsuite runs were performed, but the sync and merge were done by a person who only had ac

Re: [fpc-devel] Unicode support (yet again)

2011-09-18 Thread Graeme Geldenhuys
On 18/09/2011, Sven Barth wrote: > > Currently the POSIX-based systems seem to be broken (the Windows ones > work). That is already known. The other devs are working on that. And it boggles the mind why something so broken / incomplete was merged into Trunk in the first place? Isn't that the whol

Re: [fpc-devel] Unicode support (yet again)

2011-09-18 Thread Martin Schreiber
On Sunday 18 September 2011 12.44:26 Jonas Maebe wrote: > On 18 Sep 2011, at 12:26, Sven Barth wrote: > > For now you can apply the following patch as a workaround. The compiler > > (and fpmake) will depend on the C-library then (which should not be the > > case in the final solution). > > Not onl

Re: [fpc-devel] Unicode support (yet again)

2011-09-18 Thread Jonas Maebe
On 18 Sep 2011, at 12:26, Sven Barth wrote: > For now you can apply the following patch as a workaround. The compiler (and > fpmake) will depend on the C-library then (which should not be the case in > the final solution). Not only that: even with cwstring (and under Windows) the result is wro

Re: [fpc-devel] Unicode support (yet again)

2011-09-18 Thread Sven Barth
On 18.09.2011 11:27, Martin Schreiber wrote: On Sunday 18 September 2011 10.50:26 Sven Barth wrote: Well... you can now take a look at trunk as well, because the changes from cpstrnew have been merged yesterday. [...] make[7]: Entering directory `/home/mse/packs/standard/svn/fp/trunk/rtl/linu

Re: RE : [fpc-devel] Unicode support (yet again)

2011-09-18 Thread Marco van de Voort
In our previous episode, Flávio Etrusco said: > > That's somewhat what I was thinking. Actually something like > > UnicodeString = object > strict private > FEncoding: Integer; > FBuffer: AnsiString; > function GetCodePointAt(AIndex: SizeInt): Integer; > procedure SetCodePoint

Re: [fpc-devel] Unicode support (yet again)

2011-09-18 Thread Sven Barth
On 18.09.2011 11:27, Martin Schreiber wrote: On Sunday 18 September 2011 10.50:26 Sven Barth wrote: Well... you can now take a look at trunk as well, because the changes from cpstrnew have been merged yesterday. [...] make[7]: Entering directory `/home/mse/packs/standard/svn/fp/trunk/rtl/linu

Re: [fpc-devel] Unicode support (yet again)

2011-09-18 Thread Martin Schreiber
On Sunday 18 September 2011 10.50:26 Sven Barth wrote: > > Well... you can now take a look at trunk as well, because the changes > from cpstrnew have been merged yesterday. > [...] make[7]: Entering directory `/home/mse/packs/standard/svn/fp/trunk/rtl/linux' /home/mse/packs/standard/svn/fp/trunk/

Re: RE : [fpc-devel] Unicode support (yet again)

2011-09-18 Thread Sven Barth
On 18.09.2011 02:22, Flávio Etrusco wrote: On Sat, Sep 17, 2011 at 10:59 AM, DaWorm wrote: This might be total crap, so bear with me a moment, In an object like a Stringlist, there is a default property such as Strings, such that List.Strings[1] is equivalent to List[1], is there not? If, as

Re: RE : [fpc-devel] Unicode support (yet again)

2011-09-17 Thread Flávio Etrusco
On Sat, Sep 17, 2011 at 10:59 AM, DaWorm wrote: > This might be total crap, so bear with me a moment,  In an object like > a Stringlist, there is a default property such as Strings, such that > List.Strings[1] is equivalent to List[1], is there not?  If, as in > .NET or Java, all strings become ob

Re: [fpc-devel] Unicode support (yet again)

2011-09-17 Thread Luiz Americo Pereira Camara
On 17/9/2011 11:46, Hans-Peter Diettrich wrote: Luiz Americo Pereira Camara schrieb: The codepage of a RawByteString at runtime will keep the previous CodePage (65001 for UTF8, 1200 for UTF16) as opposed to change to the RawbyteString CodePage (65535) as a though previously Delphi defines Ra

Re: [fpc-devel] Unicode support (yet again)

2011-09-17 Thread Hans-Peter Diettrich
Luiz Americo Pereira Camara schrieb: The codepage of a RawByteString at runtime will keep the previous CodePage (65001 for UTF8, 1200 for UTF16) as opposed to change to the RawbyteString CodePage (65535) as a though previously Delphi defines RawByteString=AnsiString, so there is no room for U

Re: RE : [fpc-devel] Unicode support (yet again)

2011-09-17 Thread DaWorm
This might be total crap, so bear with me a moment, In an object like a Stringlist, there is a default property such as Strings, such that List.Strings[1] is equivalent to List[1], is there not? If, as in .NET or Java, all strings become objects, then you could have a String object whose default

RE : [fpc-devel] Unicode support (yet again)

2011-09-17 Thread Ludo Brands
> > >>Having UTF-16 RTL might help them in a sense they they will never > >>have to learn, until they deal with characters > >>outside of the BMP. > > > >moew old school stuff here... a BMP is a windows style graphic... > >what are you guys calling a BMP??? > > ROTFLMAO!!! > LMGTFY: "Basic Multil

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Ralf A. Quint
At 06:10 PM 9/16/2011, waldo kitty wrote: Having UTF-16 RTL might help them in a sense that they will never have to learn, until they deal with characters outside of the BMP. more old school stuff here... a BMP is a windows style graphic... what are you guys calling a BMP??? ROTFLMAO!!! LM
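Since the question of what "outside of the BMP" means comes up here: the BMP (Basic Multilingual Plane) covers U+0000..U+FFFF, and anything above it needs two 16-bit units (a surrogate pair) in UTF-16. The sketch below shows the standard arithmetic. The procedure name ToSurrogatePair is only an illustrative choice, not an RTL function.

```pascal
program Surrogates;
{$mode objfpc}{$H+}

// Illustrative helper (not an RTL routine): splits a code point in the
// range U+10000..U+10FFFF into its UTF-16 high and low surrogate.
// The caller is assumed to pass a value inside that range.
procedure ToSurrogatePair(CodePoint: Cardinal; out HighUnit, LowUnit: Word);
var
  V: Cardinal;
begin
  V := CodePoint - $10000;                // 20 significant bits remain
  HighUnit := Word($D800 + (V shr 10));   // upper 10 bits
  LowUnit  := Word($DC00 + (V and $3FF)); // lower 10 bits
end;

var
  H, L: Word;
begin
  // U+1D11E (musical symbol G clef) lies outside the BMP.
  ToSurrogatePair($1D11E, H, L);
  WriteLn(HexStr(LongInt(H), 4), ' ', HexStr(LongInt(L), 4));  // D834 DD1E
end.
```

Code that treats every 16-bit element as one character keeps working until such a pair shows up, which is the point being made about a UTF-16 RTL.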

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread waldo kitty
On 9/15/2011 19:03, cobines wrote: 2011/9/15 Hans-Peter Diettrich: cobines schrieb: When doing: MyChar := MyString[1] appropriate function retrieves first unicode character, regardless of encoding. This is just wrong :-( MyString[1] accesses the first element of the *physical* character arr

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Luiz Americo Pereira Camara
On 16/9/2011 14:24, Luiz Americo Pereira Camara wrote: On 16/9/2011 14:03, Luiz Americo Pereira Camara wrote: On 16/9/2011 09:36, Marco van de Voort wrote: the UTF8 -> UTF16 conversion is done All the routines you name (fileexists, filegetattr etc) will become rawbytestring and accept both utf8

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Florian Klämpfl
Am 16.09.2011 19:24, schrieb Luiz Americo Pereira Camara: > with RawByteString (need the conversion - but how ?): > > function FileGetAttr(const FileName: RawByteString): Longint; > begin > // how to convert? > // UnicodeString(FileName) -> will not work because dont know if is > was a UTF

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Luiz Americo Pereira Camara
On 16/9/2011 07:38, Marco van de Voort wrote: Most simple RTL routines that accept a string, but are not string type specific (think fileopen createdir etc) accept rawbytestring, a type that accepts all ansistring types and unicodestring. IOW you can also pass an UTF8 to it, even in the UTF16 rtl

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Luiz Americo Pereira Camara
On 16/9/2011 14:03, Luiz Americo Pereira Camara wrote: On 16/9/2011 09:36, Marco van de Voort wrote: In our previous episode, Luiz Americo Pereira Camara said: Take the example of FileExists: The current LCL implementation - the UTF8 -> UTF16 conversion is done with the need of auxiliary code

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Luiz Americo Pereira Camara
On 16/9/2011 09:36, Marco van de Voort wrote: In our previous episode, Luiz Americo Pereira Camara said: Take the example of FileExists: The current LCL implementation - the UTF8 -> UTF16 conversion is done with the need of auxiliary code: All the routines you name (fileexists, filegetattr et

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Hans-Peter Diettrich
Martin schrieb: On 16/09/2011 02:49, Hans-Peter Diettrich wrote: Martin schrieb: On 15/09/2011 19:52, Hans-Peter Diettrich wrote: Graeme Geldenhuys schrieb: On 14/09/2011 19:17, Hans-Peter Diettrich wrote: How many users will have to deal with chars outside the Unicode BMP? Any app that l

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Hans-Peter Diettrich
Michael Schnell schrieb: On 09/15/2011 09:01 PM, Hans-Peter Diettrich wrote: FPC also allows to use Complex values - but nobody is forced to use such numbers German (and French end, ...) Lazarus programmers are Forced to deal with Unicode if the accept user input. (Newer versions of) Lazarus

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Hans-Peter Diettrich
Graeme Geldenhuys schrieb: On 16/09/2011 11:48, Felipe Monteiro de Carvalho wrote: What about stuff like this in classes: TReader = class(TFiler) function ReadString: string; function ReadWideString: WideString; function ReadUnicodeString: UnicodeString; I'm clearly not underst

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Hans-Peter Diettrich
Michael Schnell schrieb: Is migrating to multiple string types (each denoting a certain encoding) and migrating to cpstrnew (a single string type with dynamical encoding) a contradiction or can it be consolidated ? What is supposed to happen to the nasty legacy types called "String" and "Cha

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Hans-Peter Diettrich
Michael Schnell schrieb: On 09/15/2011 07:39 PM, Hans-Peter Diettrich wrote: Only when an application must *interpret* strings in foreign languages, With UTF-8 German is such a foreign language :( That's why European users will be happier with UTF-16 (meaning UCS2). An UCS2String type could

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Sven Barth
Am 16.09.2011 17:19, schrieb Tomas Hajny: Was your point about "string", or "RTLString"? I'm thinking about "string", but that is more directed towards the OOP parts, which assume a objfpc{$h+} or Delphi mode. So the base RTL functions like fileopen will be rawbytestring that accepts _all_ enc

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Tomas Hajny
On Fri, September 16, 2011 14:03, Marco van de Voort wrote: > In our previous episode, Tomas Hajny said: >> . >> > In the UTF8 RTL, all "string"s _ARE_ utf8, unless specified otherwise >> (by >> > naming them unicodestring or ansistring(..encoding) or shortstrings). >> > >> > So the same virtual m

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Marco van de Voort
In our previous episode, Luiz Americo Pereira Camara said: > > Take the example of FileExists: > > The current LCL implementation - the UTF8 -> UTF16 conversion is done > with the need of auxiliary code: All the routines you name (fileexists, filegetattr etc) will become rawbytestring and accep

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Marco van de Voort
In our previous episode, Tomas Hajny said: > . > > In the UTF8 RTL, all "string"s _ARE_ utf8, unless specified otherwise (by > > naming them unicodestring or ansistring(..encoding) or shortstrings). > > > > So the same virtual method with a STRING parameter will be TUnicodestring > > in the UTF16

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Felipe Monteiro de Carvalho
On Fri, Sep 16, 2011 at 12:38 PM, Marco van de Voort wrote: > In the UTF8 RTL, all "string"s _ARE_ utf8, unless specified otherwise (by > naming them unicodestring or ansistring(..encoding) or shortstrings). This is somewhat interesting, but then Lazarus and fpvectorial would only work in the UTF

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Marco van de Voort
In our previous episode, Jonas Maebe said: > > disaster. I don't want to create and maintain UTF8 versions of > > nearly every > > class, even when the class doesn't actually do anything UTF8 specific. > > If we support an UTF-8 version of the RTL, then either the code must > work both for UTF

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Tomas Hajny
On Fri, September 16, 2011 12:38, Marco van de Voort wrote: > In our previous episode, Felipe Monteiro de Carvalho said: . . > In the UTF8 RTL, all "string"s _ARE_ utf8, unless specified otherwise (by > naming them unicodestring or ansistring(..encoding) or shortstrings). > > So the same virtual

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Jonas Maebe
On 16 Sep 2011, at 12:38, Marco van de Voort wrote: What do you think about adding TStringsUTF8/TStringListUTF8 to classes.pas? I think this is a slippery slope. These kinds of hacks are slipped in one by one, and each one is only a small concession, but in the end it is a disaster. I don

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Graeme Geldenhuys
I can understand the confusion. There are lots of old and outdated information regarding UTF-8 on the net. Regards, - Graeme - -- fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal http://fpgui.sourceforge.net/ ___ fpc-devel maillist -

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Marco van de Voort
In our previous episode, Felipe Monteiro de Carvalho said: Note that this is all my, not necessarily core's opinion. > On Thu, Sep 15, 2011 at 9:14 PM, Marco van de Voort wrote: > > The assignfile() etc routines are actually not the problem. The classes in > > the classes unit are. > > Ok, I ma

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Dimitri Smits
- "Graeme Geldenhuys" schreef: > On 16/09/2011 00:01, Dimitri Smits wrote: > > > > errrm, utf-8 can have 6 octets representing one character, > > Last time I checked, that was only in the very early stages of > developing the utf-8 specification. Since then, the maximums size of > a > utf-

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Luiz Americo Pereira Camara
On 16/9/2011 02:36, cobines wrote: 2011/9/16 Luiz Americo Pereira Camara: Lazarus can continue to use UTF-8. Just there will be an implicit conversion when using those functions. The overhead is minimum. Currently UTF8String is just an alias for AnsiString, i.e., the implicit conversion UTF8St

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Graeme Geldenhuys
On 16/09/2011 11:48, Felipe Monteiro de Carvalho wrote: > > What about stuff like this in classes: > > TReader = class(TFiler) > function ReadString: string; > function ReadWideString: WideString; > function ReadUnicodeString: UnicodeString; I'm clearly not understanding your (and

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Martin
On 16/09/2011 02:49, Hans-Peter Diettrich wrote: Martin schrieb: On 15/09/2011 19:52, Hans-Peter Diettrich wrote: Graeme Geldenhuys schrieb: On 14/09/2011 19:17, Hans-Peter Diettrich wrote: How many users will have to deal with chars outside the Unicode BMP? Any app that loads a text from

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Graeme Geldenhuys
On 16/09/2011 00:01, Dimitri Smits wrote: > > errrm, utf-8 can have 6 octets representing one character, Last time I checked, that was only in the very early stages of developing the utf-8 specification. Since then, the maximum size of a utf-8 code point is 4 bytes. If you know otherwise, pleas
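To make the 4-byte limit concrete: the current UTF-8 definition (RFC 3629) stops at U+10FFFF, so the encoded length follows directly from the code point value. This is a small sketch; the function name Utf8EncodedLength is chosen for this example and is not taken from the RTL.

```pascal
program Utf8Len;
{$mode objfpc}{$H+}

// Illustrative helper: number of bytes UTF-8 (as limited by RFC 3629)
// needs for one code point. Returns 0 for values above U+10FFFF.
// Surrogate values (U+D800..U+DFFF) are not filtered out here.
function Utf8EncodedLength(CodePoint: Cardinal): Integer;
begin
  if CodePoint < $80 then
    Result := 1
  else if CodePoint < $800 then
    Result := 2
  else if CodePoint < $10000 then
    Result := 3
  else if CodePoint <= $10FFFF then
    Result := 4
  else
    Result := 0;
end;

begin
  WriteLn(Utf8EncodedLength($41));     // 'A'
  WriteLn(Utf8EncodedLength($FC));     // u with diaeresis
  WriteLn(Utf8EncodedLength($20AC));   // euro sign
  WriteLn(Utf8EncodedLength($1D11E));  // G clef, outside the BMP
end.
```

The 5- and 6-byte forms existed only in the older definition, which allowed code points far beyond U+10FFFF.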

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Felipe Monteiro de Carvalho
On Thu, Sep 15, 2011 at 9:14 PM, Marco van de Voort wrote: > The assignfile() etc routines are actually not the problem. The classes in > the classes unit are. Ok, I may have exaggerated about the problems, but I still don't understand 100% your position. Where exactly is the frontier of how much

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Graeme Geldenhuys
On 16/09/2011 03:49, Hans-Peter Diettrich wrote: > How many users will have to deal with chars outside the Unicode BMP? >> Any app that loads a text from disk. > > Again: please answer my question first: How many *users*? Just counted 2,345,237,925 ;-) Regards, - Graeme -

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Michael Schnell
On 09/15/2011 09:07 PM, Felipe Monteiro de Carvalho wrote: Well, I think the RTL should introduce a TStringsUTF8 at the very least. and/or (better ?!? ) introduce a basic string type name TStringUTF8. I understand that cpstrnew is at least considered on the long run. Is migrating to multiple s

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Michael Schnell
On 09/15/2011 09:01 PM, Hans-Peter Diettrich wrote: FPC also allows to use Complex values - but nobody is forced to use such numbers German (and French end, ...) Lazarus programmers are forced to deal with Unicode if they accept user input. (Newer versions of) Lazarus can't be set to work in

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Michael Schnell
On 09/16/2011 07:36 AM, cobines wrote: Currently UTF8String is just an alias for AnsiString, Which obviously is bound to produce a lot of confusion: UTF-8 code in a thing explicitly called "ANSIString" ??? -Michael

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Michael Schnell
On 09/15/2011 09:07 PM, Felipe Monteiro de Carvalho wrote: Well, I think the RTL should introduce a TStringsUTF8 at the very least. and/or (better ?!? ) introduce a basic string type name TStringUTF8. I understand that cpstrnew is at least considered on the long run. Is migrating to multiple s

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Sven Barth
Am 16.09.2011 01:19, schrieb Flávio Etrusco: Who will be the first to write a UnicodeString object that uses an AnsiString as buffer so we can start doing some tests? What is in the cpstrnew and other unicode branches of FPC? (sorry, I'm using a 3G limited connection and FPC doesn't have a viewvc

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Tomas Hajny
On Fri, September 16, 2011 01:19, Flávio Etrusco wrote: > Who will be the first to write a UnicodeString object that uses an > AnsiString as buffer so we can start doing some tests? > What is in the cpstrnew and other unicode branches of FPC? (sorry, I'm > using a 3G limited connection and FPC does

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Michael Schnell
On 09/15/2011 07:39 PM, Hans-Peter Diettrich wrote: Only when an application must *interpret* strings in foreign languages, With UTF-8 German is such a foreign language :( -Michael
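A small sketch of why German already counts as "foreign" under UTF-8, assuming the text sits in a byte-oriented AnsiString: umlauts and ß take two bytes each, so Length and S[i] no longer line up with what the user sees. CodePointCount is a helper written for this example only.

```pascal
program GermanUtf8;
{$mode objfpc}{$H+}

// Example-only helper: counts code points by skipping UTF-8
// continuation bytes, which have the bit pattern 10xxxxxx.
function CodePointCount(const S: AnsiString): SizeInt;
var
  i: SizeInt;
begin
  Result := 0;
  for i := 1 to Length(S) do
    if (Ord(S[i]) and $C0) <> $80 then
      Inc(Result);
end;

var
  S: AnsiString;
begin
  // "Grüße" as explicit UTF-8 bytes: ü = C3 BC, ß = C3 9F.
  S := 'Gr'#$C3#$BC#$C3#$9F'e';
  WriteLn('bytes:       ', Length(S));                 // 7
  WriteLn('code points: ', CodePointCount(S));         // 5
  // Indexing returns a single byte, here the lead byte of "ü".
  WriteLn('S[3] is byte $', HexStr(Ord(S[3]), 2));     // C3
end.
```

With a 16-bit encoding the same word is five elements, which is the convenience being argued for elsewhere in the thread, at least for text inside the BMP.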

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Michael Schnell
On 09/16/2011 07:33 AM, cobines wrote: I understand that argument is not "easier to learn" but "easier to transition to from Ansi if you don't care to learn". ANSI means: each element you get is a character. With Unicode this is only (close to) true when using a 32 Bit encoding. When using 8

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread cobines
2011/9/16 Luiz Americo Pereira Camara : > Lazarus can continue to use UTF-8. > > Just there will be an implicit conversion when using those functions. The > overhead is minimum. Currently UTF8String is just an alias for AnsiString, i.e., the implicit conversion UTF8String -> WideString does ansi -

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread cobines
2011/9/16 Martin : > On 16/09/2011 00:03, cobines wrote: >> >> 2011/9/15 Hans-Peter Diettrich: >>> >>> cobines schrieb: When doing: MyChar := MyString[1] appropriate function retrieves first unicode character, regardless of encoding. >>> >>> This is just wrong :-( >>>

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Luiz Americo Pereira Camara
On 15/9/2011 23:11, Luiz Americo Pereira Camara wrote: On 15/9/2011 12:21, Felipe Monteiro de Carvalho wrote: Lazarus is literally being forced to implement it's own RTL... With the currently planned Unicode RTL it will just get worse, we will then need to either migrate to UnicodeString No.

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Hans-Peter Diettrich
Graeme Geldenhuys schrieb: On 15 September 2011 20:52, Hans-Peter Diettrich wrote: When I want a program for German or French users, I'll hire an coder with experience in those *languages*, not with experience only in Unicode. Why? We simply hired average people (not programmers) to translate

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Hans-Peter Diettrich
Graeme Geldenhuys schrieb: RTL is a mere *display* feature, the chars still are stored from first to My problem is not the implementation, but the fact that I can't read or understand any right-to-left languages. :) Don't worry, a compiler also doesn't understand any but his own language, o

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Luiz Americo Pereira Camara
On 15/9/2011 12:21, Felipe Monteiro de Carvalho wrote: Lazarus is literally being forced to implement its own RTL... With the currently planned Unicode RTL it will just get worse, we will then need to either migrate to UnicodeString No. Lazarus can continue to use UTF-8. Just there will be

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Hans-Peter Diettrich
Marco van de Voort schrieb: FPC also allows to use Complex values - but nobody is forced to use such numbers without any good reason and technical (mathematical) background. The same for the use of astral Unicode characters, IMO. IMHO you are right, but the big difference is of course that th

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Hans-Peter Diettrich
Martin schrieb: On 15/09/2011 19:52, Hans-Peter Diettrich wrote: Graeme Geldenhuys schrieb: On 14/09/2011 19:17, Hans-Peter Diettrich wrote: How many users will have to deal with chars outside the Unicode BMP? Any app that loads a text from disk. Again: please answer my question first: Ho

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Flávio Etrusco
Who will be the first to write a UnicodeString object that uses an AnsiString as buffer so we can start doing some tests? What is in the cpstrnew and other unicode branches of FPC? (sorry, I'm using a 3G limited connection and FPC doesn't have a viewvc...) Can we start putting well thought-out spec

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Martin
On 16/09/2011 00:03, cobines wrote: 2011/9/15 Hans-Peter Diettrich: cobines schrieb: When doing: MyChar := MyString[1] appropriate function retrieves first unicode character, regardless of encoding. This is just wrong :-( MyString[1] accesses the first element of the *physical* character arr

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread cobines
2011/9/15 Hans-Peter Diettrich: > cobines wrote: >> When doing: >> MyChar := MyString[1] >> >> appropriate function retrieves first unicode character, regardless of >> encoding. > > This is just wrong :-( > > MyString[1] accesses the first element of the *physical* character array, > regardless
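
The disagreement above, whether MyString[1] yields the first logical character or the first element of the physical array, can be illustrated with a small sketch. It assumes the string holds UTF-8 bytes in a plain AnsiString; the helper name Utf8SeqLen is hypothetical and not an RTL routine, and no validation of the byte sequence is attempted.

```pascal
program FirstElementDemo;
{$mode objfpc}{$H+}

// Hypothetical helper: length in bytes of the UTF-8 sequence that starts
// with lead byte B. Invalid lead bytes are reported as length 1 so that a
// caller always advances.
function Utf8SeqLen(B: Byte): Integer;
begin
  if B < $80 then
    Result := 1
  else if (B and $E0) = $C0 then
    Result := 2
  else if (B and $F0) = $E0 then
    Result := 3
  else if (B and $F8) = $F0 then
    Result := 4
  else
    Result := 1;
end;

var
  S, FirstChar: AnsiString;
begin
  // U+00E4 (a with diaeresis) is the two bytes C3 A4 in UTF-8; 'b' follows.
  SetLength(S, 3);
  S[1] := Chr($C3);
  S[2] := Chr($A4);
  S[3] := 'b';

  // S[1] is the first *byte* of the physical array: 195 ($C3).
  WriteLn('Ord(S[1]) = ', Ord(S[1]));

  // The first logical character has to be taken as a substring (2 bytes here).
  FirstChar := Copy(S, 1, Utf8SeqLen(Ord(S[1])));
  WriteLn('Bytes in first character: ', Length(FirstChar));
end.
```

With a fixed-width element type the index and the character coincide only as long as every character fits in one element, which is the assumption the two posters disagree about.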

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Dimitri Smits
"Graeme Geldenhuys" wrote: > Why? We simply hired average people (not programmers) to translate > your English resource strings into German, Portuguese, French etc.. > No > other modifications were required. No code or programs had to be > recompiled. Our resource strings are stored in

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Dimitri Smits
"Graeme Geldenhuys" wrote: > On 15 September 2011 19:09, Hans-Peter Diettrich wrote: > > > > What data type would you use, to store a UTF-8 character? > > And how to access the n-th character in a UTF-8 string? > > I already showed how in a previous post. For more details on how > fp

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Graeme Geldenhuys
On 15 September 2011 20:52, Hans-Peter Diettrich wrote: > When I want a program for German or French users, I'll hire a coder with > experience in those *languages*, not with experience only in Unicode. Why? We simply hired average people (not programmers) to translate your English resource strin

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Mattias Gaertner
On Thu, 15 Sep 2011 22:51:36 +0200 Graeme Geldenhuys wrote: > On 15 September 2011 18:43, Hans-Peter Diettrich wrote: > > > > UTF-8 is much more complicated to handle by the user, than e.g. UTF-16. > > > I don't see this. Please give an example? Please don't. This was discussed a trillion tim

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Graeme Geldenhuys
On 15 September 2011 19:09, Hans-Peter Diettrich wrote: > > What data type would you use, to store a UTF-8 character? > And how to access the n-th character in a UTF-8 string? I already showed how in a previous post. For more details on how fpGUI does this, have a look at the fpg_base.pas and f
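
For readers following the question of how to reach the n-th character of a UTF-8 string, here is a minimal sketch of a linear scan. It is not the fpGUI code referred to above; the names Utf8SeqLen and Utf8CharAt are hypothetical, a "character" is taken to mean one code point, and the input is assumed to be well-formed UTF-8 held in an AnsiString.

```pascal
program NthUtf8Char;
{$mode objfpc}{$H+}

// Hypothetical helper: byte length of the UTF-8 sequence starting at lead
// byte B. Invalid lead bytes count as length 1 so the scan always advances.
function Utf8SeqLen(B: Byte): Integer;
begin
  if B < $80 then
    Result := 1
  else if (B and $E0) = $C0 then
    Result := 2
  else if (B and $F0) = $E0 then
    Result := 3
  else if (B and $F8) = $F0 then
    Result := 4
  else
    Result := 1;
end;

// Returns the N-th (1-based) code point of S as a substring,
// or '' when S contains fewer than N code points.
function Utf8CharAt(const S: AnsiString; N: Integer): AnsiString;
var
  BytePos, CharIdx, Len: Integer;
begin
  Result := '';
  if N < 1 then
    Exit;
  BytePos := 1;
  CharIdx := 1;
  while BytePos <= Length(S) do
  begin
    Len := Utf8SeqLen(Ord(S[BytePos]));
    if CharIdx = N then
    begin
      Result := Copy(S, BytePos, Len);
      Exit;
    end;
    Inc(BytePos, Len);
    Inc(CharIdx);
  end;
end;

var
  S: AnsiString;
begin
  // 'a', then U+00E4 as the bytes C3 A4, then 'b'.
  SetLength(S, 4);
  S[1] := 'a';
  S[2] := Chr($C3);
  S[3] := Chr($A4);
  S[4] := 'b';

  WriteLn(Length(Utf8CharAt(S, 2)));  // second character occupies 2 bytes
  WriteLn(Utf8CharAt(S, 3));          // third character is 'b'
end.
```

Each lookup walks the string from the start, so calling it inside a loop over all indices costs quadratic time; iterating once with a running byte position avoids that.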

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Graeme Geldenhuys
On 15 September 2011 17:21, Felipe Monteiro de Carvalho wrote: > > // file operations > function FileExistsUTF8(const Filename: string): boolean; > function FileAgeUTF8(const FileName: string): Longint; fpGUI has similar, for file handling functions. > With the currently planned Unicode RTL

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Graeme Geldenhuys
On 15 September 2011 18:43, Hans-Peter Diettrich wrote: > > UTF-8 is much more complicated to handle by the user, than e.g. UTF-16. I don't see this. Please give an example? -- Regards,   - Graeme - ___ fpGUI - a cross-platform Free Pascal GUI to

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Marco van de Voort
In our previous episode, Hans-Peter Diettrich said: > >> Unicode users have no use for a char type, instead they have to use > >> substrings for every logical character. A Unicode BMP user could be happy > >> with a 2-byte char, of course, at his own (low) risk. > > > > Probably. But while a goo

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Marco van de Voort
In our previous episode, Felipe Monteiro de Carvalho said: > And I say more, two RTLs will immediately cause problems in all kinds > of libraries. Why? > Will the FCL work with the Ansi RTL, with the Unicode > RTL, with both? Generally both, and problematic packages are not coded in "string" bu

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Felipe Monteiro de Carvalho
On Thu, Sep 15, 2011 at 8:16 PM, Mattias Gaertner wrote: > Yes. > Or 3 - migrate LCL to UTF8String Indeed. And we can additionally provide UTF-8 versions of routines like we do now for people to use, or alternatively people can also use Unicode RTL routines if the extra conversion is not a problem

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Martin
On 15/09/2011 19:52, Hans-Peter Diettrich wrote: Graeme Geldenhuys wrote: On 14/09/2011 19:17, Hans-Peter Diettrich wrote: How many users will have to deal with chars outside the Unicode BMP? Any app that loads a text from disk. It can never know what the text contains.
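
As background to the BMP question: in UTF-16 a code point above U+FFFF does not fit in one 2-byte element and is stored as a surrogate pair. The sketch below shows the standard split; the procedure name ToSurrogates is hypothetical and the caller is assumed to pass a code point in the range U+10000..U+10FFFF.

```pascal
program SurrogateDemo;
{$mode objfpc}{$H+}

// Hypothetical helper: splits a code point above U+FFFF into its UTF-16
// high and low surrogate. No range check is performed on CodePoint.
procedure ToSurrogates(CodePoint: Cardinal; out HighSur, LowSur: Word);
var
  V: Cardinal;
begin
  V := CodePoint - $10000;                 // 20 significant bits remain
  HighSur := Word($D800 + (V shr 10));     // upper 10 bits
  LowSur  := Word($DC00 + (V and $3FF));   // lower 10 bits
end;

var
  HighSur, LowSur: Word;
begin
  // U+1D11E (MUSICAL SYMBOL G CLEF) lies outside the BMP;
  // its surrogate pair is $D834 $DD1E.
  ToSurrogates($1D11E, HighSur, LowSur);
  WriteLn(HexStr(HighSur, 4), ' ', HexStr(LowSur, 4));
end.
```

Text read from a file may contain such pairs whether or not the program expects them, so code that treats one 2-byte element as one character only holds for BMP-only input.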

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Martin
On 15/09/2011 19:36, Hans-Peter Diettrich wrote: Martin wrote: On 15/09/2011 10:38, Michael Schnell wrote: On 09/15/2011 11:06 AM, Graeme Geldenhuys wrote: and to show you AGAIN how flawed your "direct index access to a character" example is. It's not "my" intent to use it. I'll never use i

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Hans-Peter Diettrich
Marco van de Voort wrote: Unicode users have no use for a char type, instead they have to use substrings for every logical character. A Unicode BMP user could be happy with a 2-byte char, of course, at his own (low) risk. Probably. But while a good point for an application builder based in
