Re: RE : [fpc-devel] Unicode support (yet again)

2011-09-19 Thread Marco van de Voort
In our previous episode, Flávio Etrusco said: compatibility feature, and as such should care more about correctness and ease-of-use rather than performance. I thought the endless bugs WRT to char vs codepoint indexes, even in Java-developed software, would buy my argument... IMHO you are

Re: RE : [fpc-devel] Unicode support (yet again)

2011-09-19 Thread Jonas Maebe
On 19 Sep 2011, at 09:36, Marco van de Voort wrote: I don't like the Java/C# way that you have to manually allocate extra objects (stringbuilders etc) to get(performant) access to the characters though. In Java that's only the case for changing characters. Reading characters happens via a

Re: RE : [fpc-devel] Unicode support (yet again)

2011-09-19 Thread Jonas Maebe
On 19 Sep 2011, at 10:27, Flávio Etrusco wrote: I partly agree it's PEBKAC, but why make it easy to get wrong when you can avoid it? Isn't that the point of Pascal? Isn't that the point of AnsiStrings? Isn't that the point of strong typed languages in general? Yes, but supporting unicode

Re: RE : [fpc-devel] Unicode support (yet again)

2011-09-19 Thread Marco van de Voort
In our previous episode, Flávio Etrusco said: IMHO you are seeking the problems in the tools, while the problem is PEBKAC I partly agree it's PEBKAC, but why make it easy to get wrong when you can avoid it? The point is you can't. You only keep the illusion you can marginally longer at a

Re: RE : [fpc-devel] Unicode support (yet again)

2011-09-19 Thread Hans-Peter Diettrich
Flávio Etrusco schrieb: IMHO you are seeking the problems in the tools, while the problem is PEBKAC I partly agree it's PEBKAC, but why make it easy to get wrong when you can avoid it? Isn't that the point of Pascal? Many people think that Pascal is an educational (toy) language, and

Re: RE : [fpc-devel] Unicode support (yet again)

2011-09-19 Thread Michael Schnell
On 09/18/2011 05:52 PM, Marco van de Voort wrote: And of course, finally, there is the matter with Delphi compatibility. This can't even be discussed regarding Unicode programming as long as FPC does not have new Strings. (AFAIK there even are or have been discussions about not doing new

Re: RE : [fpc-devel] Unicode support (yet again)

2011-09-19 Thread Michael Schnell
On 09/18/2011 06:49 PM, DaWorm wrote: But isn't it O(n^2) only when actually using unicode strings? Allowing the compiler or library to decide _if_ this is a Unicode string would require either dedicated string types for each encoding or New Strings with programmable encoding. -Michael

Re: RE : [fpc-devel] Unicode support (yet again)

2011-09-18 Thread Sven Barth
On 18.09.2011 02:22, Flávio Etrusco wrote: On Sat, Sep 17, 2011 at 10:59 AM, DaWorm daw...@gmail.com wrote: This might be total crap, so bear with me a moment, In an object like a Stringlist, there is a default property such as Strings, such that List.Strings[1] is equivalent to List[1], is

Re: [fpc-devel] Unicode support (yet again)

2011-09-18 Thread Martin Schreiber
On Sunday 18 September 2011 10.50:26 Sven Barth wrote: Well... you can now take a look at trunk as well, because the changes from cpstrnew have been merged yesterday. [...] make[7]: Entering directory `/home/mse/packs/standard/svn/fp/trunk/rtl/linux'

Re: [fpc-devel] Unicode support (yet again)

2011-09-18 Thread Sven Barth
On 18.09.2011 11:27, Martin Schreiber wrote: On Sunday 18 September 2011 10.50:26 Sven Barth wrote: Well... you can now take a look at trunk as well, because the changes from cpstrnew have been merged yesterday. [...] make[7]: Entering directory

Re: RE : [fpc-devel] Unicode support (yet again)

2011-09-18 Thread Marco van de Voort
In our previous episode, Flávio Etrusco said: That's somewhat what I was thinking. Actually something like UnicodeString = object strict private FEncoding: Integer; FBuffer: AnsiString; function GetCodePointAt(AIndex: SizeInt): Integer; procedure SetCodePoint(AIndex:

Re: [fpc-devel] Unicode support (yet again)

2011-09-18 Thread Sven Barth
On 18.09.2011 11:27, Martin Schreiber wrote: On Sunday 18 September 2011 10.50:26 Sven Barth wrote: Well... you can now take a look at trunk as well, because the changes from cpstrnew have been merged yesterday. [...] make[7]: Entering directory

Re: [fpc-devel] Unicode support (yet again)

2011-09-18 Thread Jonas Maebe
On 18 Sep 2011, at 12:26, Sven Barth wrote: For now you can apply the following patch as a workaround. The compiler (and fpmake) will depend on the C-library then (which should not be the case in the final solution). Not only that: even with cwstring (and under Windows) the result is

Re: [fpc-devel] Unicode support (yet again)

2011-09-18 Thread Martin Schreiber
On Sunday 18 September 2011 12.44:26 Jonas Maebe wrote: On 18 Sep 2011, at 12:26, Sven Barth wrote: For now you can apply the following patch as a workaround. The compiler (and fpmake) will depend on the C-library then (which should not be the case in the final solution). Not only that:

Re: [fpc-devel] Unicode support (yet again)

2011-09-18 Thread Graeme Geldenhuys
On 18/09/2011, Sven Barth wrote: Currently the POSIX-based systems seem to be broken (the Windows ones work). That is already known. The other devs are working on that. And it boggles the mind why something so broken / incomplete was merged into Trunk in the first place? Isn't that the whole

Re: [fpc-devel] Unicode support (yet again)

2011-09-18 Thread Jonas Maebe
On 18 Sep 2011, at 13:16, Graeme Geldenhuys wrote: And it boggles the mind why something so broken / incomplete was merged into Trunk in the first place? Yes, we suck from time to time (in this case: testsuite runs were performed, but the sync and merge were done by a person who only had

Re: [fpc-devel] Unicode support (yet again)

2011-09-18 Thread Sven Barth
On 18.09.2011 13:20, Jonas Maebe wrote: On 18 Sep 2011, at 13:16, Graeme Geldenhuys wrote: And it boggles the mind why something so broken / incomplete was merged into Trunk in the first place? Yes, we suck from time to time (in this case: testsuite runs were performed, but the sync and

Re: RE : [fpc-devel] Unicode support (yet again)

2011-09-18 Thread Flávio Etrusco
On Sun, Sep 18, 2011 at 6:50 AM, Marco van de Voort mar...@stack.nl wrote: In our previous episode, Flávio Etrusco said: That's somewhat what I was thinking. Actually something like UnicodeString = object (...) Such ability is not unique for an object. One can also do something like

Re: [fpc-devel] Unicode support (yet again)

2011-09-18 Thread Hans-Peter Diettrich
Luiz Americo Pereira Camara schrieb: On 17/9/2011 11:46, Hans-Peter Diettrich wrote: Luiz Americo Pereira Camara schrieb: The codepage of a RawByteString at runtime will keep the previous CodePage (65001 for UTF8, 1200 for UTF16) as opposed to change to the RawbyteString CodePage (65535) as

Re: RE : [fpc-devel] Unicode support (yet again)

2011-09-18 Thread Jonas Maebe
On 18 Sep 2011, at 13:57, Flávio Etrusco wrote: One obvious way to mitigate this would be to store the last CodePoint-Char in the string record, so that at least the most common case is covered. ... and so that the common case is broken in multithreaded environments. Directly indexing a

Re: RE : [fpc-devel] Unicode support (yet again)

2011-09-18 Thread DaWorm
On Sep 18, 2011 5:50 AM, Marco van de Voort mar...@stack.nl wrote: The trouble is that it is not that easy, consider the first thing a long time pascal user will do is fix his existing code which has many constructs that loop over a string: setlength(s2,s1); for i:=1 to length(s1) do

Re: RE : [fpc-devel] Unicode support (yet again)

2011-09-18 Thread Marco van de Voort
In our previous episode, DaWorm said: So instead of O(n) this loop suddenly becomes O(n^2) Sure it does. So what? So much! The point is, it will do what the user expects. No it doesn't. The user has no clue, and will just stumble on the next detail (like codepoints not being
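To make the O(n) versus O(n^2) point concrete, here is a minimal sketch, assuming the string holds well-formed UTF-8 in an AnsiString. The names Utf8SeqLen and Utf8ByteOffset are hypothetical helpers written for this illustration, not RTL or LCL routines. An indexed accessor has to rescan from the start on every call, while a loop that carries the byte position forward stays linear.

```pascal
program Utf8Iterate;
{$mode objfpc}{$H+}

{ Byte length of the UTF-8 sequence that starts with lead byte B.
  Anything that is not a valid lead byte is treated as length 1,
  so the loops below always advance. }
function Utf8SeqLen(B: Byte): Integer;
begin
  if B < $80 then Result := 1
  else if (B and $E0) = $C0 then Result := 2
  else if (B and $F0) = $E0 then Result := 3
  else if (B and $F8) = $F0 then Result := 4
  else Result := 1;
end;

{ Hypothetical indexed accessor: byte offset of the N-th code point
  (1-based), or 0 if N is out of range. Every call rescans from the
  start, so one call is O(n) and calling it for each index is O(n^2). }
function Utf8ByteOffset(const S: AnsiString; N: SizeInt): SizeInt;
var
  P, Count: SizeInt;
begin
  Result := 0;
  if N < 1 then
    Exit;
  P := 1;
  Count := 1;
  while (P <= Length(S)) and (Count < N) do
  begin
    Inc(P, Utf8SeqLen(Ord(S[P])));
    Inc(Count);
  end;
  if P <= Length(S) then
    Result := P;
end;

var
  S: AnsiString;
  P, Len: SizeInt;
begin
  S := 'a'#$C3#$96'b';  // 'a', U+00D6 as two UTF-8 bytes, 'b'

  // Linear walk: the byte position is carried from one step to the next.
  P := 1;
  while P <= Length(S) do
  begin
    Len := Utf8SeqLen(Ord(S[P]));
    WriteLn('code point at byte ', P, ', ', Len, ' byte(s)');
    Inc(P, Len);
  end;

  // Indexed lookup: the third code point starts at byte 4.
  WriteLn(Utf8ByteOffset(S, 3));
end.
```

The linear walk is the shape that existing "for i := 1 to length(s)" loops would need to be rewritten into; keeping the old indexing syntax and hiding the rescan behind it is what produces the quadratic cost discussed here.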

Re: RE : [fpc-devel] Unicode support (yet again)

2011-09-18 Thread Sven Barth
On 18.09.2011 17:48, DaWorm wrote: On Sep 18, 2011 5:50 AM, Marco van de Voort mar...@stack.nl wrote: The trouble is that it is not that easy, consider the first thing a long time pascal user will do is fix his existing code which has many constructs that loop

Re: RE : [fpc-devel] Unicode support (yet again)

2011-09-18 Thread DaWorm
On Sun, Sep 18, 2011 at 12:01 PM, Sven Barth pascaldra...@googlemail.com wrote: On 18.09.2011 17:48, DaWorm wrote: But isn't it O(n^2) only when actually using unicode strings? Wouldn't you also be able to do something like String.Encoding := Ansi and then all String[i] accesses would then be

Re: RE : [fpc-devel] Unicode support (yet again)

2011-09-18 Thread Marco van de Voort
In our previous episode, DaWorm said: But isn't it O(n^2) only when actually using unicode strings? Wouldn't you also be able to do something like String.Encoding := Ansi and then all String[i] accesses would then be o(n) + x (where x is the overhead of run time checking that it is safe to

Re: RE : [fpc-devel] Unicode support (yet again)

2011-09-18 Thread cobines
2011/9/18 Marco van de Voort mar...@stack.nl:  The trouble is that it is not that easy, consider the first thing a long time pascal user will do is fix his existing code which has many constructs that loop over a string: setlength(s2,s1); for i:=1 to length(s1) do  s2[i]:=s1[i]; Now, to

Re: RE : [fpc-devel] Unicode support (yet again)

2011-09-18 Thread Hans-Peter Diettrich
DaWorm schrieb: On Sun, Sep 18, 2011 at 12:01 PM, Sven Barth pascaldra...@googlemail.com wrote: On 18.09.2011 17:48, DaWorm wrote: But isn't it O(n^2) only when actually using unicode strings? All MBCS encodings, with no fixed character size, suffer from that problem. Wouldn't you also be

Re: [fpc-devel] Unicode support (yet again)

2011-09-18 Thread Hans-Peter Diettrich
Luiz Americo Pereira Camara schrieb: Can you give me a link? I checked the XE documentation and RTL, and could not find that RawByteString can hold UTF-16, and my test confirms that: http://edn.embarcadero.com/article/38980 You may read also:

Re: [fpc-devel] Unicode support (yet again)

2011-09-18 Thread Hans-Peter Diettrich
Luiz Americo Pereira Camara schrieb: Thanks, but that's nothing new to me in general, and the RawByteString handling doesn't work as documented. procedure ShowCodePage(const S: RawByteString); begin Form1.Caption := IntToStr(StringCodePage(S)); end; Strange What value you get passing and
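For anyone who wants to repeat the ShowCodePage experiment outside a Delphi form, here is a console sketch. It assumes a compiler with code-page-aware strings (the cpstrnew work, so FPC 3.0 or later) where StringCodePage is available from the System unit. What it prints depends on the compiler version and the system code page, so no expected values are given.

```pascal
program CodePageDemo;
{$mode objfpc}{$H+}

{ Console variant of the ShowCodePage procedure quoted in this message:
  prints the code page the string carries at run time. }
procedure ShowCodePage(const S: RawByteString);
begin
  WriteLn(StringCodePage(S));
end;

var
  A: AnsiString;
  U: UTF8String;
begin
  A := 'abc';
  U := 'abc';
  ShowCodePage(A);  // code page carried by a plain AnsiString
  ShowCodePage(U);  // code page carried by a UTF8String
end.
```

If the two calls print different numbers, the RawByteString parameter has kept the code page of its argument rather than replacing it with 65535, which is the behaviour under dispute in this sub-thread.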

Re: RE : [fpc-devel] Unicode support (yet again)

2011-09-18 Thread Flávio Etrusco
On Sun, Sep 18, 2011 at 11:45 AM, Jonas Maebe jonas.ma...@elis.ugent.be wrote: On 18 Sep 2011, at 13:57, Flávio Etrusco wrote: One obvious way to mitigate this would be to store the last CodePoint-Char in the string record, so that at least the most common case is covered. ... and so that

RE : [fpc-devel] Unicode support (yet again)

2011-09-17 Thread Ludo Brands
Having UTF-16 RTL might help them in a sense they they will never have to learn, until they deal with characters outside of the BMP. moew old school stuff here... a BMP is a windows style graphic... what are you guys calling a BMP??? ROTFLMAO!!! LMGTFY: Basic Multilingual Plane... :-D

Re: RE : [fpc-devel] Unicode support (yet again)

2011-09-17 Thread DaWorm
This might be total crap, so bear with me a moment, In an object like a Stringlist, there is a default property such as Strings, such that List.Strings[1] is equivalent to List[1], is there not? If, as in .NET or Java, all strings become objects, then you could have a String object whose default

Re: [fpc-devel] Unicode support (yet again)

2011-09-17 Thread Hans-Peter Diettrich
Luiz Americo Pereira Camara schrieb: The codepage of a RawByteString at runtime will keep the previous CodePage (65001 for UTF8, 1200 for UTF16) as opposed to change to the RawbyteString CodePage (65535) as a though previously Delphi defines RawByteString=AnsiString, so there is no room for

Re: [fpc-devel] Unicode support (yet again)

2011-09-17 Thread Luiz Americo Pereira Camara
On 17/9/2011 11:46, Hans-Peter Diettrich wrote: Luiz Americo Pereira Camara schrieb: The codepage of a RawByteString at runtime will keep the previous CodePage (65001 for UTF8, 1200 for UTF16) as opposed to change to the RawbyteString CodePage (65535) as a though previously Delphi defines

Re: RE : [fpc-devel] Unicode support (yet again)

2011-09-17 Thread Flávio Etrusco
On Sat, Sep 17, 2011 at 10:59 AM, DaWorm daw...@gmail.com wrote: This might be total crap, so bear with me a moment,  In an object like a Stringlist, there is a default property such as Strings, such that List.Strings[1] is equivalent to List[1], is there not?  If, as in .NET or Java, all

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Michael Schnell
On 09/16/2011 07:33 AM, cobines wrote: I understand that argument is not easier to learn but easier to transition to from Ansi if you don't care to learn. ANSI means: each element you get is a character. With Unicode this is only (close to) true when using a 32 Bit encoding. When using 8 or

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Michael Schnell
On 09/15/2011 07:39 PM, Hans-Peter Diettrich wrote: Only when an application must *interpret* strings in foreign languages, With UTF-8 German is such a foreign language :( -Michael ___ fpc-devel maillist - fpc-devel@lists.freepascal.org

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Tomas Hajny
On Fri, September 16, 2011 01:19, Flávio Etrusco wrote: Who will be the first to write a UnicodeString object that uses an AnsiString as buffer so we can start doing some tests? What is in the cpstrnew and other unicode branches of FPC? (sorry, I'm using a 3G limited connection and FPC doesn't

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Michael Schnell
On 09/15/2011 09:07 PM, Felipe Monteiro de Carvalho wrote: Well, I think the RTL should introduce a TStringsUTF8 at the very least. and/or (better ?!? ) introduce a basic string type name TStringUTF8. I understand that cpstrnew is at least considered on the long run. Is migrating to multiple

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Michael Schnell
On 09/16/2011 07:36 AM, cobines wrote: Currently UTF8String is just an alias for AnsiString, Which obviously is bound to produce a lot of confusion UTF-8 code in a thing explicitly called ANSIString ??? -Michael

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Michael Schnell
On 09/15/2011 09:01 PM, Hans-Peter Diettrich wrote: FPC also allows to use Complex values - but nobody is forced to use such numbers German (and French end, ...) Lazarus programmers are Forced to deal with Unicode if they accept user input. (Newer versions of) Lazarus can't be set to work in

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Michael Schnell
On 09/15/2011 09:07 PM, Felipe Monteiro de Carvalho wrote: Well, I think the RTL should introduce a TStringsUTF8 at the very least. and/or (better ?!? ) introduce a basic string type name TStringUTF8. I understand that cpstrnew is at least considered on the long run. Is migrating to multiple

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Graeme Geldenhuys
On 16/09/2011 03:49, Hans-Peter Diettrich wrote: How many users will have to deal with chars outside the Unicode BMP? Any app that loads a text from disk. Again: please answer my question first: How many *users*? Just counted 2,345,237,925 ;-) Regards, - Graeme - -- fpGUI

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Felipe Monteiro de Carvalho
On Thu, Sep 15, 2011 at 9:14 PM, Marco van de Voort mar...@stack.nl wrote: The assignfile() etc routines are actually not the problem. The classes in the classes unit are. Ok, I may have exaggerated about the problems, but I still don't understand 100% your position. Where exactly is the

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Graeme Geldenhuys
On 16/09/2011 00:01, Dimitri Smits wrote: errrm, utf-8 can have 6 octets representing one character, Last time I checked, that was only in the very early stages of developing the utf-8 specification. Since then, the maximums size of a utf-8 code point is 4 bytes. If you know otherwise, please
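For reference on the 4-versus-6 octet question: RFC 3629 restricts UTF-8 to the range U+0000..U+10FFFF, which caps a sequence at 4 bytes, while the older RFC 2279 definition allowed up to 6. A small sketch of the per-code-point length under the current rules follows; Utf8EncodedLength is a hypothetical name used only for this illustration.

```pascal
program Utf8Len;
{$mode objfpc}{$H+}

{ Number of bytes UTF-8 needs for one code point under RFC 3629.
  Returns 0 for surrogates and for values above U+10FFFF, which
  are not valid Unicode scalar values. }
function Utf8EncodedLength(CodePoint: Cardinal): Integer;
begin
  if CodePoint <= $7F then
    Result := 1
  else if CodePoint <= $7FF then
    Result := 2
  else if (CodePoint >= $D800) and (CodePoint <= $DFFF) then
    Result := 0
  else if CodePoint <= $FFFF then
    Result := 3
  else if CodePoint <= $10FFFF then
    Result := 4
  else
    Result := 0;
end;

begin
  WriteLn(Utf8EncodedLength($41));     // LATIN CAPITAL LETTER A: 1
  WriteLn(Utf8EncodedLength($D6));     // LATIN CAPITAL LETTER O WITH DIAERESIS: 2
  WriteLn(Utf8EncodedLength($20AC));   // EURO SIGN: 3
  WriteLn(Utf8EncodedLength($1D11E));  // MUSICAL SYMBOL G CLEF, outside the BMP: 4
end.
```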

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Martin
On 16/09/2011 02:49, Hans-Peter Diettrich wrote: Martin schrieb: On 15/09/2011 19:52, Hans-Peter Diettrich wrote: Graeme Geldenhuys schrieb: On 14/09/2011 19:17, Hans-Peter Diettrich wrote: How many users will have to deal with chars outside the Unicode BMP? Any app that loads a text from

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Graeme Geldenhuys
On 16/09/2011 11:48, Felipe Monteiro de Carvalho wrote: What about stuff like this in classes: TReader = class(TFiler) function ReadString: string; function ReadWideString: WideString; function ReadUnicodeString: UnicodeString; I'm clearly not understanding your (and a few

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Luiz Americo Pereira Camara
On 16/9/2011 02:36, cobines wrote: 2011/9/16 Luiz Americo Pereira Camara luiz...@oi.com.br: Lazarus can continue to use UTF-8. Just there will be an implicit conversion when using those functions. The overhead is minimum. Currently UTF8String is just an alias for AnsiString, i.e., the implicit

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Dimitri Smits
- Graeme Geldenhuys graemeg.li...@gmail.com schreef: On 16/09/2011 00:01, Dimitri Smits wrote: errrm, utf-8 can have 6 octets representing one character, Last time I checked, that was only in the very early stages of developing the utf-8 specification. Since then, the maximums size

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Marco van de Voort
In our previous episode, Felipe Monteiro de Carvalho said: Note that this is all my, not necessarily core's opinion. On Thu, Sep 15, 2011 at 9:14 PM, Marco van de Voort mar...@stack.nl wrote: The assignfile() etc routines are actually not the problem. The classes in the classes unit are.

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Graeme Geldenhuys
I can understand the confusion. There are lots of old and outdated information regarding UTF-8 on the net. Regards, - Graeme - -- fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal http://fpgui.sourceforge.net/

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Jonas Maebe
On 16 Sep 2011, at 12:38, Marco van de Voort wrote: What do you think about adding TStringsUTF8/TStringListUTF8 to classes.pas? I think this is a slippery slope. These kinds of hacks are slipped in one by one, and each one is only a small concession, but in the end it is a disaster. I

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Tomas Hajny
On Fri, September 16, 2011 12:38, Marco van de Voort wrote: In our previous episode, Felipe Monteiro de Carvalho said: . . In the UTF8 RTL, all strings _ARE_ utf8, unless specified otherwise (by naming them unicodestring or ansistring(..encoding) or shortstrings). So the same virtual method

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Marco van de Voort
In our previous episode, Jonas Maebe said: disaster. I don't want to create and maintain UTF8 versions of nearly every class, even when the class doesn't actually do anything UTF8 specific. If we support an UTF-8 version of the RTL, then either the code must work both for UTF-16 and

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Felipe Monteiro de Carvalho
On Fri, Sep 16, 2011 at 12:38 PM, Marco van de Voort mar...@stack.nl wrote: In the UTF8 RTL, all strings _ARE_ utf8, unless specified otherwise (by naming them unicodestring or ansistring(..encoding) or shortstrings). This is somewhat interesting, but then Lazarus and fpvectorial would only

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Marco van de Voort
In our previous episode, Tomas Hajny said: . In the UTF8 RTL, all strings _ARE_ utf8, unless specified otherwise (by naming them unicodestring or ansistring(..encoding) or shortstrings). So the same virtual method with a STRING parameter will be TUnicodestring in the UTF16 rtl and

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Marco van de Voort
In our previous episode, Luiz Americo Pereira Camara said: Take the example of FileExists: The current LCL implementation - the UTF8 - UTF16 conversion is done with the need of auxiliary code: All the routines you name (fileexists, filegetattr etc) will become rawbytestring and accept

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Tomas Hajny
On Fri, September 16, 2011 14:03, Marco van de Voort wrote: In our previous episode, Tomas Hajny said: . In the UTF8 RTL, all strings _ARE_ utf8, unless specified otherwise (by naming them unicodestring or ansistring(..encoding) or shortstrings). So the same virtual method with a

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Sven Barth
Am 16.09.2011 17:19, schrieb Tomas Hajny: Was your point about string, or RTLString? I'm thinking about string, but that is more directed towards the OOP parts, which assume a objfpc{$h+} or Delphi mode. So the base RTL functions like fileopen will be rawbytestring that accepts _all_

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Hans-Peter Diettrich
Michael Schnell schrieb: On 09/15/2011 07:39 PM, Hans-Peter Diettrich wrote: Only when an application must *interpret* strings in foreign languages, With UTF-8 German is such a foreign language :( That's why European users will be happier with UTF-16 (meaning UCS2). An UCS2String type

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Hans-Peter Diettrich
Michael Schnell schrieb: Is migrating to multiple string types (each denoting a certain encoding) and migrating to cpstrnew (a single string type with dynamical encoding) a contradiction or can it be consolidated ? What is supposed to happen to the nasty legacy types called String and Char

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Hans-Peter Diettrich
Graeme Geldenhuys schrieb: On 16/09/2011 11:48, Felipe Monteiro de Carvalho wrote: What about stuff like this in classes: TReader = class(TFiler) function ReadString: string; function ReadWideString: WideString; function ReadUnicodeString: UnicodeString; I'm clearly not

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Hans-Peter Diettrich
Michael Schnell schrieb: On 09/15/2011 09:01 PM, Hans-Peter Diettrich wrote: FPC also allows to use Complex values - but nobody is forced to use such numbers German (and French end, ...) Lazarus programmers are Forced to deal with Unicode if they accept user input. (Newer versions of)

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Hans-Peter Diettrich
Martin schrieb: On 16/09/2011 02:49, Hans-Peter Diettrich wrote: Martin schrieb: On 15/09/2011 19:52, Hans-Peter Diettrich wrote: Graeme Geldenhuys schrieb: On 14/09/2011 19:17, Hans-Peter Diettrich wrote: How many users will have to deal with chars outside the Unicode BMP? Any app that

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Luiz Americo Pereira Camara
On 16/9/2011 09:36, Marco van de Voort wrote: In our previous episode, Luiz Americo Pereira Camara said: Take the example of FileExists: The current LCL implementation - the UTF8 - UTF16 conversion is done with the need of auxiliary code: All the routines you name (fileexists, filegetattr

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Luiz Americo Pereira Camara
On 16/9/2011 14:03, Luiz Americo Pereira Camara wrote: On 16/9/2011 09:36, Marco van de Voort wrote: In our previous episode, Luiz Americo Pereira Camara said: Take the example of FileExists: The current LCL implementation - the UTF8 - UTF16 conversion is done with the need of auxiliary

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Luiz Americo Pereira Camara
On 16/9/2011 07:38, Marco van de Voort wrote: Most simple RTL routines that accept a string, but are not string type specific (think fileopen createdir etc) accept rawbytestring, a type that accepts all ansistring types and unicodestring. IOW you can also pass an UTF8 to it, even in the UTF16

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Florian Klämpfl
Am 16.09.2011 19:24, schrieb Luiz Americo Pereira Camara: with RawByteString (need the conversion - but how ?): function FileGetAttr(const FileName: RawByteString): Longint; begin // how to convert? // UnicodeString(FileName) - will not work because dont know if is was a UTF8 or

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Luiz Americo Pereira Camara
On 16/9/2011 14:24, Luiz Americo Pereira Camara wrote: On 16/9/2011 14:03, Luiz Americo Pereira Camara wrote: On 16/9/2011 09:36, Marco van de Voort wrote: the UTF8 - UTF16 conversion is done All the routines you name (fileexists, filegetattr etc) will become rawbytestring and accept both utf8

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread waldo kitty
On 9/15/2011 19:03, cobines wrote: 2011/9/15 Hans-Peter Diettrich drdiettri...@aol.com: cobines schrieb: When doing: MyChar := MyString[1] appropriate function retrieves first unicode character, regardless of encoding. This is just wrong :-( MyString[1] accesses the first element of the

Re: [fpc-devel] Unicode support (yet again)

2011-09-16 Thread Ralf A. Quint
At 06:10 PM 9/16/2011, waldo kitty wrote: Having UTF-16 RTL might help them in a sense they they will never have to learn, until they deal with characters outside of the BMP. moew old school stuff here... a BMP is a windows style graphic... what are you guys calling a BMP??? ROTFLMAO!!!

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Michael Schnell
On 09/14/2011 05:02 PM, Hans-Peter Diettrich wrote: The NT WinAPI (not 9x) *implements* everything in the Wide (UTF-16) routines without reference-counting: Yak ! -Michael

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Michael Schnell
On 09/14/2011 05:19 PM, Hans-Peter Diettrich wrote: Can you specify, *which* strings ever *require* platform specific encoding? If not strings, Chars do: MyString := 'Öse'; MyChar := MyString[1]; -Michael

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Michael Schnell
On 09/14/2011 07:24 PM, Hans-Peter Diettrich wrote: Unicode users have no use for an char type, instead they have to use substrings for every logical character. Yep. The problem is that a normal programmer (especially a beginner) does not know (and does not want to know, and IMHO should not

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Graeme Geldenhuys
On 14/09/2011 17:02, Hans-Peter Diettrich wrote: Many users still want simple string handling, with direct mapping between logical and physical chars (SBCS). This is not possible at all with UTF-8, while UTF-16 works fine with the BMP, at least. What rubbish! The only utf-8 limit is that

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Graeme Geldenhuys
On 15/09/2011 06:19, Martin Schreiber wrote: Agreed. And so it is made in MSEgui: Yeah, and everything you said applies to fpGUI, except I use TfpgString, TfpgChar and the UTF-8 encoding. Though I would prefer having the native encoding on each platform - thus less conversions and

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Graeme Geldenhuys
On 15/09/2011 09:43, Michael Schnell wrote: there are lots of file systems in Linux. They can work differently. i.e. FAT works case insensitive while ext* works cases sensitive. IMHO ext completely ignores character coding and just works on byte arrays. You also forgot to mention that most

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Felipe Monteiro de Carvalho
On Thu, Sep 15, 2011 at 1:50 AM, Luiz Americo Pereira Camara luiz...@oi.com.br wrote: OK. The drawback is increasing file size of executables (that are already big). And disk storages are getting each time bigger. Any modern smartphone comes with at least 8GB of storage ... -- Felipe Monteiro

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Marco van de Voort
In our previous episode, Hans-Peter Diettrich said: Lazarus was forced to make out of the identity of ANSIString and UTF8String seemingly forced by FPC. e.g.: Old programs assuming local ANSI 8 bit code retrieved from LCL GUI components, compiled with the new version don't work (e.g.

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Graeme Geldenhuys
On 14/09/2011 19:17, Hans-Peter Diettrich wrote: How many users will have to deal with chars outside the Unicode BMP? You are very narrow minded! It depends on the application you are developing. Lets take a Science application as an example. Many scientific symbols fall
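As a concrete case of a character outside the BMP, take U+1D400 MATHEMATICAL BOLD CAPITAL A from the Mathematical Alphanumeric Symbols block. The sketch below shows why such a code point costs two UTF-16 code units. ToSurrogatePair and Utf16UnitCount are hypothetical helpers written for this illustration, and ToSurrogatePair assumes its input is above U+FFFF.

```pascal
program Utf16Surrogates;
{$mode objfpc}{$H+}

uses
  SysUtils;

{ Splits a code point in the range U+10000..U+10FFFF into its UTF-16
  surrogate pair. The caller must not pass BMP code points. }
procedure ToSurrogatePair(CodePoint: Cardinal; out HighUnit, LowUnit: Word);
var
  V: Cardinal;
begin
  V := CodePoint - $10000;
  HighUnit := Word($D800 + (V shr 10));
  LowUnit := Word($DC00 + (V and $3FF));
end;

{ Number of 16-bit code units UTF-16 needs for one code point. }
function Utf16UnitCount(CodePoint: Cardinal): Integer;
begin
  if CodePoint <= $FFFF then
    Result := 1
  else
    Result := 2;
end;

var
  H, L: Word;
begin
  WriteLn(Utf16UnitCount($D6));     // inside the BMP: 1
  WriteLn(Utf16UnitCount($1D400));  // outside the BMP: 2
  ToSurrogatePair($1D400, H, L);
  WriteLn(IntToHex(LongInt(H), 4), ' ', IntToHex(LongInt(L), 4));  // D835 DC00
end.
```

So code that treats s[i] of a UTF-16 string as "one character" works only while every character stays inside the BMP.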

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Graeme Geldenhuys
On 15/09/2011 10:16, Michael Schnell wrote: In fact users want to deal with decently coded characters and not with cryptic bytes some of which together are representing a character. (e.g. when doing MyChar := MyString[1]; ) None of our company's users using our products would even

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread cobines
2011/9/15 Michael Schnell mschn...@lumino.de: In fact users want to deal with decently coded characters and not with cryptic bytes some of which together are representing a character. (e.g. when doing MyChar := MyString[1]; ) I think of Unicode text as a stream of Unicode characters in some

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Martin Schreiber
On Thursday 15 September 2011 10:27:28 Graeme Geldenhuys wrote: And considering the amount of text processing apps I have written (plenty of them), indexed character access is really not a top priority or a often used feature. Graeme, please have look into:

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Felipe Monteiro de Carvalho
On Thu, Sep 15, 2011 at 10:59 AM, Martin Schreiber mse00...@gmail.com wrote: There are plenty of user problems with utf-8 character access and string length. I assume 100% of them would be solved with utf-16. It would solve for those writing buggy software which ignores part of the Unicode

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Graeme Geldenhuys
On 15/09/2011 09:53, Michael Schnell wrote: If not strings, Chars do: MyString := 'Öse'; MyChar := MyString[1]; and to show you AGAIN how flawed your direct index access to a character example is. How is that 'Öse' entered into the system. Is the Ö a U+00D6 LATIN CAPITAL LETTER O WITH
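The two spellings of the first letter in the example can be compared directly. This sketch assumes UTF-8 bytes in an AnsiString and no {$codepage} directive, so the byte escapes are stored as written.

```pascal
program ComposedVsDecomposed;
{$mode objfpc}{$H+}

var
  Precomposed, Decomposed: AnsiString;
begin
  // U+00D6 LATIN CAPITAL LETTER O WITH DIAERESIS, then "se"
  Precomposed := #$C3#$96'se';
  // U+004F LATIN CAPITAL LETTER O + U+0308 COMBINING DIAERESIS, then "se"
  Decomposed := 'O'#$CC#$88'se';

  WriteLn(Length(Precomposed));                // 4 bytes
  WriteLn(Length(Decomposed));                 // 5 bytes
  WriteLn(Precomposed = Decomposed);           // FALSE: the byte sequences differ
  WriteLn(Precomposed[1] = Decomposed[1]);     // FALSE: #$C3 versus 'O'
end.
```

Both strings render as the same word, yet MyString[1] yields a different value for each, and neither value is the whole letter. Getting equal results requires normalizing first, which no index-based access does on its own.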

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Graeme Geldenhuys
On 15/09/2011 11:06, Felipe Monteiro de Carvalho wrote: It would solve for those writing buggy software which ignores part of the Unicode characters. On the other hand 100% of them would be solved correctly, for all Unicode characters by using LCLProc.UTF8CharacterLength +1 on both

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Graeme Geldenhuys
On 15/09/2011 10:59, Martin Schreiber wrote: There are plenty of user problems with utf-8 Then they are not well versed in Unicode are they... character access in fpGUI: UTF8Copy(...) UTF8CharAtByte(...) and string length. in fpGUI: Length(...) result is in bytes
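The difference between Length() in bytes and the number of code points can be shown without any toolkit. The sketch below is not the fpGUI or LCL implementation; Utf8CodePointCount is a hypothetical helper that assumes well-formed UTF-8 and simply skips continuation bytes.

```pascal
program Utf8Count;
{$mode objfpc}{$H+}

{ Counts code points by counting every byte that is not a UTF-8
  continuation byte (bit pattern 10xxxxxx). Assumes well-formed UTF-8. }
function Utf8CodePointCount(const S: AnsiString): SizeInt;
var
  i: SizeInt;
begin
  Result := 0;
  for i := 1 to Length(S) do
    if (Ord(S[i]) and $C0) <> $80 then
      Inc(Result);
end;

var
  S: AnsiString;
begin
  S := #$C3#$96'se';              // "Öse" with the first letter as U+00D6
  WriteLn(Length(S));             // size in bytes: 4
  WriteLn(Utf8CodePointCount(S)); // number of code points: 3
end.
```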

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Michael Schnell
On 09/15/2011 10:39 AM, Graeme Geldenhuys wrote: MyChar := UTF8Copy(MyString, 1, 1); The above example is safe, Of course. But generations of Pascal programmers have been trained to do MyChar := MyString[1]; So it would at least be candid to abolish the String[i] notation as a syntax

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Michael Schnell
On 09/15/2011 10:43 AM, cobines wrote: MyChar := MyString[1] appropriate function retrieves first unicode character, regardless of encoding. MyChar is an 8 bit thingy and thus is not even able to hold a Unicode 'ä' (in whatever UTF). -Michael

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Michael Schnell
On 09/15/2011 11:08 AM, Graeme Geldenhuys wrote: +1 on both counts. Hoping for complex things to automatically be solved by just ignoring the complexity usually leads into hell. -Michael ___ fpc-devel maillist - fpc-devel@lists.freepascal.org

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Michael Schnell
On 09/15/2011 11:06 AM, Graeme Geldenhuys wrote: and to show you AGAIN how flawed your direct index access to a character example is. It's not my intent to use it. I'll never use it as I do know that it is bound to create problems. But it is what generations of Pascal programmers are trained

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Martin Schreiber
On Thursday 15 September 2011 11:06:00 Felipe Monteiro de Carvalho wrote: On Thu, Sep 15, 2011 at 10:59 AM, Martin Schreiber mse00...@gmail.com wrote: There are plenty of user problems with utf-8 character access and string length. I assume 100% of them would be solved with utf-16. It

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Michael Schnell
On 09/14/2011 07:24 PM, Hans-Peter Diettrich wrote: Unicode users have no use for a char type, instead they have to use substrings for every logical character. Unicode is 32 bits, allowing for (nearly) any supported character. So a 32-bit UnicodeCharacter in fact is very viable. -Michael

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Martin Schreiber
On Thursday 15 September 2011 11:15:22 Graeme Geldenhuys wrote: And now there should be an even more complex string type implemented? UTF-8 is not more complex at all. A new encoding aware FPC string type is more complex. Martin

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread cobines
2011/9/15 Michael Schnell mschn...@lumino.de: Of course. But generations of Pascal programmers have been trained to do MyChar := MyString[1]; Such people should retrain if they want to switch to Unicode using some instructions how to convert your application. If they do not want, they should

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Martin
On 15/09/2011 10:38, Michael Schnell wrote: On 09/15/2011 11:06 AM, Graeme Geldenhuys wrote: and to show you AGAIN how flawed your direct index access to a character example is. It's not my intent to use it. I'll never use it as I do know that it is bound to create problems. But it is what

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Michael Schnell
On 09/15/2011 11:40 AM, cobines wrote: Such people should retrain if they want to switch to Unicode using some instructions how to convert your application. In fact they don't know about Unicode and thus don't even know that they need training :-) . If they do not want, they should stay with

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Michael Schnell
On 09/15/2011 11:43 AM, Martin wrote: Which imho makes utf8 far more preferable than utf16 in UTF8 the error is bound to happen ROFL. -Michael

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread Graeme Geldenhuys
On 15/09/2011 11:38, Michael Schnell wrote: is bound to create problems. But it is what generations of Pascal programmers are trained to do. They all need to be re-trained. Just like all Delphi developers had to since Delphi 2009+. Obviously, the RTL could cater for it, or implement more

Re: [fpc-devel] Unicode support (yet again)

2011-09-15 Thread cobines
2011/9/15 Michael Schnell mschn...@lumino.de: If they use Lazarus they are forced to use Unicode unless they want to stick with a very old version. Then this is a problem of Lazarus. They want to make applications with Lazarus (which is always UTF-8?) and they are unaware of using Unicode where
