Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Luca Olivetti
Martin Schreiber wrote: I'd say to take a look at how python managed to integrate unicode support: http://www.google.com/search?domains=www.python.org&sitesearch=www.python.org&sourceid=google-search&q=unicode&submit=search They have a UTF-16/UCS-2 internal representation, same as

Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Marco van de Voort
On Mon, Jun 30, 2008 at 11:35 AM, Marco van de Voort [EMAIL PROTECTED] wrote: borders? Gtk can load XML files, somewhat equivalent to our LFMs. They use UTF-8 everywhere. GTK is unix centric on other systems. They don't have a firm leg in both the Unix as the Windows world as we do. I

Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Marco van de Voort
They have a UTF-16/UCS-2 internal representation, same as MSEgui which works very well and is fast and handy BTW. And len, slicing, etc. work as expected. Note that if you need characters beyond $ you have to compile it with wide unicode support, and in that case every character

Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Luca Olivetti
Marco van de Voort wrote: They have a UTF-16/UCS-2 internal representation, same as MSEgui which works very well and is fast and handy BTW. And len, slicing, etc. work as expected. Note that if you need characters beyond $ you have to compile it with wide unicode support, and in

Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Mattias Gaertner
On Tue, 01 Jul 2008 09:35:35 +0200 Luca Olivetti [EMAIL PROTECTED] wrote: En/na Marco van de Voort ha escrit: They have a UTF-16/UCS-2 internal representation, same as MSEgui which works very well and is fast and handy BTW. And len, slicing, etc. work as expected. Note that if you need

Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Marco van de Voort
Marco van de Voort wrote: with wide unicode support, and in that case every character will use 4 bytes. That's IMHO a faulty system. It requires you to choose between an incomplete solution or making strings a horrible memory hog. OTOH using variable length characters will

Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Mattias Gaertner
On Tue, 1 Jul 2008 09:23:52 +0200 (CEST) [EMAIL PROTECTED] (Marco van de Voort) wrote: [...] multiple encodings: Are we talking about one encoding per platform or two encodings for all platforms? Under Unix the encoding preference is clear: UTF-8. Under Windows there are a lot of current code

Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Martin Schreiber
On Tuesday 01 July 2008 09.56:29 Mattias Gaertner wrote: On Tue, 01 Jul 2008 09:35:35 +0200 Luca Olivetti [EMAIL PROTECTED] wrote: OTOH using variable length characters will make string operations expensive (since you can't just multiply the index by 2 or 4 but you have to examine the

Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Marco van de Voort
On Tue, 1 Jul 2008 09:23:52 +0200 (CEST) (note that this is all IMHO, not necessarily core viewpoint) Are we talking about one encoding per platform or two encodings for all platforms? My proposition was: Two encodings, two stringtypes for all. Florian's stand was thinking about one

Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Mattias Gaertner
On Tue, 1 Jul 2008 10:23:32 +0200 Martin Schreiber [EMAIL PROTECTED] wrote: On Tuesday 01 July 2008 09.56:29 Mattias Gaertner wrote: On Tue, 01 Jul 2008 09:35:35 +0200 Luca Olivetti [EMAIL PROTECTED] wrote: OTOH using variable length characters will make string operations expensive

Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Mattias Gaertner
On Tue, 1 Jul 2008 10:33:28 +0200 (CEST) [EMAIL PROTECTED] (Marco van de Voort) wrote: On Tue, 1 Jul 2008 09:23:52 +0200 (CEST) (note that this is all IMHO, not necessarily core viewpoint) Same for me: mine are not lazarus core. Are we talking about one encoding per platform or two

Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Martin Schreiber
On Tuesday 01 July 2008 10.35:00 Mattias Gaertner wrote: A good example is text layout calculation where it is necessary to iterate over characters (glyphs) over and over again. Text layout nowadays needs to consider font widths and unicode specials. Iterating from character to character

Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Marco van de Voort
On Tue, 1 Jul 2008 10:33:28 +0200 (CEST) all platforms? My proposition was: Two encodings, two stringtypes for all. Both at the same time? Yes, utf8string and utf16string. Whatever Tiburon introduces aliased to utf16string, so that will be compat on non-windows too. And the utf16

Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Mattias Gärtner
Quoting Martin Schreiber [EMAIL PROTECTED]: On Tuesday 01 July 2008 10.35:00 Mattias Gaertner wrote: A good example is text layout calculation where it is necessary to iterate over characters (glyphs) over and over again. Text layout nowadays needs to consider font widths and unicode

Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Martin Schreiber
On Tuesday 01 July 2008 12.19:26 Mattias Gärtner wrote: Quoting Martin Schreiber [EMAIL PROTECTED]: I did it with utf-8 and UCS-2, believe me, it was not negligible. Where is the code in msegui? (the code that was formerly UTF-8, not the old UTF-8 code)

Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Mattias Gärtner
Quoting Martin Schreiber [EMAIL PROTECTED]: On Tuesday 01 July 2008 12.19:26 Mattias Gärtner wrote: Quoting Martin Schreiber [EMAIL PROTECTED]: I did it with utf-8 and UCS-2, believe me, it was not negligible. Where is the code in msegui? (the code that was formerly UTF-8, not

Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Florian Klaempfl
I read most of the discussion and I think there is no way around a string type containing an encoding field. First, it also allows supporting non-UTF encodings or the UTF-32 encoding. Having the encoding field does not mean that all targets support all encodings. In case an encoding is not supported,

Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Martin Schreiber
On Tuesday 01 July 2008 13.13:19 Mattias Gärtner wrote: Quoting Martin Schreiber [EMAIL PROTECTED]: On Tuesday 01 July 2008 12.19:26 Mattias Gärtner wrote: Quoting Martin Schreiber [EMAIL PROTECTED]: I did it with utf-8 and UCS-2, believe me, it was not negligible. Where is

Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Felipe Monteiro de Carvalho
On Tue, Jul 1, 2008 at 4:23 AM, Marco van de Voort [EMAIL PROTECTED] wrote: Certainly. Can you imagine loading a non trivial file in a tstringlist and saving it again and the heaps of conversions? And how do you know that the file to be loaded will be in the system encoding? We should simply

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Michael Van Canneyt
On Tue, 1 Jul 2008, Florian Klaempfl wrote: I read most of the discussion and I think there is no way around a string type containing an encoding field. [cut] I know this approach contains some hacks and requires some work but I think this is the only way to solve things for once and

Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Marco van de Voort
On Tue, Jul 1, 2008 at 4:23 AM, Marco van de Voort [EMAIL PROTECTED] wrote: Certainly. Can you imagine loading a non trivial file in a tstringlist and saving it again and the heaps of conversions? And how do you know that the file to be loaded will be in the system encoding? Not at all.

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Marco van de Voort
On Tue, 1 Jul 2008, Florian Klaempfl wrote: I read most of the discussion and I think there is no way around a string type containing an encoding field. [cut] I know this approach contains some hacks and requires some work but I think this is the only way to solve things for once

Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Felipe Monteiro de Carvalho
A string type whose encoding you don't know is very inconvenient, because you need to convert it to something else any time you wish to run any routine which requires knowing the encoding. How will Pos be implemented? And UpperCase? Any cross-platform string manipulation routine will

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Florian Klaempfl
Marco van de Voort wrote: On Tue, 1 Jul 2008, Florian Klaempfl wrote: I read most of the discussion and I think there is no way around a string type containing an encoding field. [cut] I know this approach contains some hacks and requires some work but I think this is the only way to

Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Felipe Monteiro de Carvalho
On Tue, Jul 1, 2008 at 9:02 AM, Marco van de Voort [EMAIL PROTECTED] wrote: A solution for unicode should be for everything, not just for UIs and filenames. I should be able to carry data within it also, because otherwise we are having this discussion next week again if Joost needs unicode for

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Michael Van Canneyt
On Tue, 1 Jul 2008, Florian Klaempfl wrote: Marco van de Voort wrote: On Tue, 1 Jul 2008, Florian Klaempfl wrote: I read most of the discussion and I think there is no way around a string type containing an encoding field. [cut] I know this approach contains some hacks and

Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Florian Klaempfl
Felipe Monteiro de Carvalho wrote: ansistrings don't mean everything. They mean either ISO or utf-8. This assumption is wrong. ansistring means the system encoding which uses 8 bit chars. ___ fpc-pascal maillist - fpc-pascal@lists.freepascal.org

Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Vincent Snijders
Florian Klaempfl wrote: Felipe Monteiro de Carvalho wrote: ansistrings don't mean everything. They mean either ISO or utf-8. This assumption is wrong. ansistring means the system encoding which uses 8 bit chars. Even if the system encoding is UTF8? Vincent

Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Graeme Geldenhuys
2008/7/1 Felipe Monteiro de Carvalho [EMAIL PROTECTED]: In my system I propose that simply a TWideStringList be implemented, so both ways of storing data are available everywhere. I have a TWideStringList implementation if you are interested. I got the code somewhere and kept it for a rainy

Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Marco van de Voort
On Tue, Jul 1, 2008 at 9:02 AM, Marco van de Voort [EMAIL PROTECTED] wrote: A solution for unicode should be for everything, not just for UIs and filenames. I should be able to carry data within it also, because otherwise we are having this discussion next week again if Joost needs unicode

Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Martin Schreiber
On Tuesday 01 July 2008 14.03:19 Felipe Monteiro de Carvalho wrote: About UCS-2 this is absurd. We certainly cannot have half the Chinese characters ignored in the Free Pascal RTL. ??? Where did you get the information that half of the Chinese characters won't fit in base plane? And utf-16

Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Florian Klaempfl
Vincent Snijders wrote: Florian Klaempfl wrote: Felipe Monteiro de Carvalho wrote: ansistrings don't mean everything. They mean either ISO or utf-8. This assumption is wrong. ansistring means the system encoding which uses 8 bit chars. Even if the system encoding is UTF8? Then it

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Felipe Monteiro de Carvalho
Why not just introduce a set of utf-16 routines with utf16string type like the new Delphi? This proposal is at least better than the one from Marco as we at least can get the encoding somehow, but is still inconvenient for cross-platform software. -- Felipe Monteiro de Carvalho

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Florian Klaempfl
Felipe Monteiro de Carvalho wrote: Why not just introduce a set of utf-16 routines with utf16string type like the new Delphi? Because it's not cross platform.

Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Felipe Monteiro de Carvalho
On Tue, Jul 1, 2008 at 9:28 AM, Martin Schreiber [EMAIL PROTECTED] wrote: Where did you get the information that half of the Chinese characters won't fit in base plane? http://unicode.org/roadmaps/sip/index.html CJK means Chinese Japanese Korean -- Felipe Monteiro de Carvalho

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Marco van de Voort
Why not just introduce a set of utf-16 routines with utf16string type like the new Delphi? This proposal is at least better than the one from Marco Mine is having both a UTF8string and a UTF16string, on all platforms that support unicode. So I don't get this remark. It is just that on unix,

Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Felipe Monteiro de Carvalho
On Tue, Jul 1, 2008 at 9:21 AM, Marco van de Voort [EMAIL PROTECTED] wrote: Well, euh, the main reason is that euh, most programs and data on the system uses the system encoding? So you are saying that FPC should privilege platform-specific software development to cross-platform software

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Felipe Monteiro de Carvalho
On Tue, Jul 1, 2008 at 9:24 AM, Florian Klaempfl [EMAIL PROTECTED] wrote: Why not just introduce a set of utf-16 routines with utf16string type like the new Delphi? Because it's not cross platform. Why isn't it cross-platform? -- Felipe Monteiro de Carvalho

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Felipe Monteiro de Carvalho
On Tue, Jul 1, 2008 at 9:42 AM, Marco van de Voort [EMAIL PROTECTED] wrote: I don't like the runtime nature. At all. I want to be able to say hey look, I've a bunch of units here, and they only accept utf16, (e.g. because they were ported Tiburon code). Convert if necessary Tiburon code will

Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Marco van de Voort
On Tue, Jul 1, 2008 at 9:21 AM, Marco van de Voort [EMAIL PROTECTED] wrote: Well, euh, the main reason is that euh, most programs and data on the system uses the system encoding? So you are saying that FPC should privilege platform-specific software development to cross-platform

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Felipe Monteiro de Carvalho
On Tue, Jul 1, 2008 at 9:30 AM, Marco van de Voort [EMAIL PROTECTED] wrote: Mine is having both a UTF8string and a UTF16string, on all platforms that support unicode. So I don't get this remark. Unless I understood your proposal wrong it involves a TMarcoString which will be declared like

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Marco van de Voort
On Tue, Jul 1, 2008 at 9:42 AM, Marco van de Voort [EMAIL PROTECTED] wrote: I don't like the runtime nature. At all. I want to be able to say hey look, I've a bunch of units here, and they only accept utf16, (e.g. because they were ported Tiburon code). Convert if necessary Tiburon

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Florian Klaempfl
Felipe Monteiro de Carvalho wrote: On Tue, Jul 1, 2008 at 9:24 AM, Florian Klaempfl [EMAIL PROTECTED] wrote: Why not just introduce a set of utf-16 routines with utf16string type like the new Delphi? Because it's not cross platform. Why isn't it cross-platform? Because using utf-16 on

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Marco van de Voort
On Tue, Jul 1, 2008 at 9:30 AM, Marco van de Voort [EMAIL PROTECTED] wrote: Mine is having both a UTF8string and a UTF16string, on all platforms that support unicode. So I don't get this remark. Unless I understood your proposal wrong it

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Felipe Monteiro de Carvalho
On Tue, Jul 1, 2008 at 9:56 AM, Florian Klaempfl [EMAIL PROTECTED] wrote: Because using utf-16 on linux is very unnatural, same for utf-8 on windows. Platforms like go32 don't even have any unicode. Coding platform independent but fast applications is really ugly having fixed types. Well,

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Felipe Monteiro de Carvalho
It is just that on unix, the fileroutines will be defined as utf8string So you are going to convert in non utf8 unix? Maybe I should have said in the native encoding then. So if it's a utf-16 unix it will be utf-16. In principle at least. We will have to see how this fares with the

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Marco van de Voort
On Tue, Jul 1, 2008 at 9:56 AM, Florian Klaempfl [EMAIL PROTECTED] wrote: platform independent but fast applications is really ugly having fixed types. Well, then you mean that it requires conversion in some platforms rather than it not being cross-platform. What I am trying to say is

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Michael Van Canneyt
On Tue, 1 Jul 2008, Felipe Monteiro de Carvalho wrote: On Tue, Jul 1, 2008 at 9:56 AM, Florian Klaempfl [EMAIL PROTECTED] wrote: Because using utf-16 on linux is very unnatural, same for utf-8 on windows. Platforms like go32 don't even have any unicode. Coding platform independent but

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Marco van de Voort
It is just that on unix, the fileroutines will be defined as utf8string So you are going to convert in non utf8 unix? Maybe I should have said in the native encoding then. So if it's a utf-16 unix it will be utf-16. In principle at least. We will have to see how this fares with

[fpc-pascal] Tiburon and unicode rtl

2008-07-01 Thread Felipe Monteiro de Carvalho
Hello, Ok, so far we have a couple of unicode rtl proposals. Will FPC remain compatible with code from Tiburon? If yes, then we need a set of utf-16 RTL routines, even if another solution is chosen. Yes, surely the utf-16 routines could just call the other routines and do a string conversion, if

Re: [fpc-pascal] Tiburon and unicode rtl

2008-07-01 Thread Marco van de Voort
Ok, so far we have a couple of unicode rtl proposals. Will FPC remain compatible with code from Tiburon? If yes, then we need a set of utf-16 RTL routines, even if another solution is chosen. Yes, surely the utf-16 routines could just call the other routines and do a string conversion, if

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Marco van de Voort
On Tue, 1 Jul 2008, Felipe Monteiro de Carvalho wrote: I don't see what is difficult about Florian's proposition. On the contrary, it is the simplest possible solution, and quite elegant in my eyes. To be honest, I'm flabbergasted that the two of you agreed on such a runtime construct. It goes

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Felipe Monteiro de Carvalho
On Tue, Jul 1, 2008 at 10:28 AM, Marco van de Voort [EMAIL PROTECTED] wrote: C/C++ support the native encoding on all platforms. I did some googling and they don't support unicode filenames. So we are back to zero systems using this method again =)

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Felipe Monteiro de Carvalho
On Tue, Jul 1, 2008 at 10:50 AM, Felipe Monteiro de Carvalho I did some googling and they don't support unicode filenames. So we are back to zero systems using this method again =) Actually I think that Carbon uses a system very similar to the one proposed by Florian. The string is an opaque

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Michael Van Canneyt
On Tue, 1 Jul 2008, Marco van de Voort wrote: On Tue, 1 Jul 2008, Felipe Monteiro de Carvalho wrote: I don't see what is difficult about Florian's proposition. On the contrary, it is the simplest possible solution, and quite elegant in my eyes. To be honest, I'm flabbergasted that the

Re: [fpc-pascal] Tiburon and unicode rtl

2008-07-01 Thread Felipe Monteiro de Carvalho
On Tue, Jul 1, 2008 at 10:35 AM, Marco van de Voort [EMAIL PROTECTED] wrote: See my earlier reply. A few var parameters need to be overridden, that's it. Ok, this is very good. So it doesn't matter which system will be chosen I can still assume that the rtl uses a single unicode encoding: utf-16

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Jeff Wormsley
Marco van de Voort wrote: I don't understand how this can work, how can I have a compiletime solution for a runtime problem? procedure mystringproc (s:FlorianUnicodeString); begin if encodingof(s)=utf-16 then begin // utf-16 code here with shiftsize 2 [] needed end else

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Michael Van Canneyt
On Tue, 1 Jul 2008, Jeff Wormsley wrote: Marco van de Voort wrote: I don't understand how this can work, how can I have a compiletime solution for a runtime problem? procedure mystringproc (s:FlorianUnicodeString); begin if encodingof(s)=utf-16 then begin //

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Marco van de Voort
On Tue, 1 Jul 2008, Marco van de Voort wrote: On Tue, 1 Jul 2008, Felipe Monteiro de Carvalho wrote: I don't see what is difficult about Florian's proposition. On the contrary, it is the simplest possible solution, and quite elegant in my eyes. To be honest, I'm flabbergasted that

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Marco van de Voort
Marco van de Voort wrote: If compiler magic is at work, wouldn't all this reduce to s[1] giving the first char no matter the char size? Where does the magic get its information is my point.

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Marco van de Voort
On Tue, 1 Jul 2008, Jeff Wormsley wrote: is defined as char, it gets converted to a standard 0-255 value, but c could be defined as FlorianChar and be the native char size. Or am I smoking crack? No, you understand it correctly. Obviously, with Florian's type, simple low-level access

Re: [fpc-pascal] Tiburon and unicode rtl

2008-07-01 Thread Marco van de Voort
On Tue, Jul 1, 2008 at 10:35 AM, Marco van de Voort [EMAIL PROTECTED] wrote: See my earlier reply. A few var parameters need to be overridden, that's it. Ok, this is very good. So it doesn't matter which system will be chosen I can still assume that the rtl uses a single unicode encoding:

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Michael Van Canneyt
On Tue, 1 Jul 2008, Marco van de Voort wrote: On Tue, 1 Jul 2008, Jeff Wormsley wrote: is defined as char, it gets converted to a standard 0-255 value, but c could be defined as FlorianChar and be the native char size. Or am I smoking crack? No, you understand it correctly.

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Florian Klaempfl
Marco van de Voort wrote: Marco van de Voort wrote: If compiler magic is at work, wouldn't all this reduce to s[1] giving the first char no matter the char size? Where does the magic get its information is my point. I described this already in detail in my first mail: just in one of

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Paul Ishenin
Michael Van Canneyt wrote: You can still do C:=S[i]. What you cannot do is P:=PChar(S); While (P^<>#0) do SomeByteSizedOperation; Why you cannot? PChar(S) should represent S as raw bytes. If you know what you are doing - it will not harm. In other case, if you corrupt the string then

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Michael Van Canneyt
On Tue, 1 Jul 2008, Paul Ishenin wrote: Michael Van Canneyt wrote: You can still do C:=S[i]. What you cannot do is P:=PChar(S); While (P^<>#0) do SomeByteSizedOperation; Why you cannot? PChar(S) should represent S as raw bytes. If you know what you are doing - it will not

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Florian Klaempfl
Michael Van Canneyt wrote: On Tue, 1 Jul 2008, Paul Ishenin wrote: Michael Van Canneyt wrote: You can still do C:=S[i]. What you cannot do is P:=PChar(S); While (P^<>#0) do SomeByteSizedOperation; Why you cannot? PChar(S) should represent S as raw bytes. If you know what you

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Felipe Monteiro de Carvalho
I think you can still do the byte-size operations this way: ForceEncoding(S, iso-) P:=PChar(S); While (P^<>#0) do SomeByteSizedOperation; Similarly for any other code supposing an encoding. -- Felipe Monteiro de Carvalho

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Michael Van Canneyt
On Tue, 1 Jul 2008, Felipe Monteiro de Carvalho wrote: I think you can still do the byte-size operations this way: ForceEncoding(S, iso-) P:=PChar(S); While (P^<>#0) do SomeByteSizedOperation; Similarly for any other code supposing an encoding. Absolutely. Michael.

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Martin Schreiber
On Tuesday 01 July 2008 17.06:34 Florian Klaempfl wrote: Michael Van Canneyt wrote: On Tue, 1 Jul 2008, Paul Ishenin wrote: Michael Van Canneyt wrote: You can still do C:=S[i]. What you cannot do is P:=PChar(S); While (P^<>#0) do SomeByteSizedOperation; Why you cannot?

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Mattias Gärtner
Quoting Florian Klaempfl [EMAIL PROTECTED]: Michael Van Canneyt wrote: On Tue, 1 Jul 2008, Paul Ishenin wrote: Michael Van Canneyt wrote: You can still do C:=S[i]. What you cannot do is P:=PChar(S); While (P^<>#0) do SomeByteSizedOperation; Why you cannot? PChar(S)

Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Mattias Gärtner
Quoting Martin Schreiber [EMAIL PROTECTED]: On Tuesday 01 July 2008 13.13:19 Mattias Gärtner wrote: Quoting Martin Schreiber [EMAIL PROTECTED]: On Tuesday 01 July 2008 12.19:26 Mattias Gärtner wrote: Quoting Martin Schreiber [EMAIL PROTECTED]: I did it with utf-8 and UCS-2,

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Florian Klaempfl
Mattias Gärtner wrote: Quoting Florian Klaempfl [EMAIL PROTECTED]: Michael Van Canneyt wrote: On Tue, 1 Jul 2008, Paul Ishenin wrote: Michael Van Canneyt wrote: You can still do C:=S[i]. What you cannot do is P:=PChar(S); While (P^<>#0) do SomeByteSizedOperation; Why you

Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Martin Schreiber
On Tuesday 01 July 2008 18.32:30 Mattias Gärtner wrote: In these routines length(widestring), widestring[index], pwidechar^, pwidechar[index], pwidechar + offset, pwidechar - pwidechar and inc(pwidechar)/dec(pwidechar) are used often. This can't be done with utf-8 strings. Ehm, do you

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Marco van de Voort
Mattias Gärtner wrote: example you could tell it that all strings should be utf-8 encoded. Of course, you get into trouble if some user plays unfair but you could still protect your code with some EnforceUTF8Encoding. It's exactly the See earlier mail. Tiburon code shouldn't need mods. That

Re: Summary on Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Marc Weustink
Florian Klaempfl wrote: [..some of my thoughts..] this suits a construct I saw somewhere: type SomeString = type String(CP_KOI8); Marc

Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Marc Weustink
Martin Schreiber wrote: On Tuesday 01 July 2008 18.32:30 Mattias Gärtner wrote: In these routines length(widestring), widestring[index], pwidechar^, pwidechar[index], pwidechar + offset, pwidechar - pwidechar and inc(pwidechar)/dec(pwidechar) are used often. This can't be done with utf-8

Re: [fpc-pascal] Tiburon and unicode rtl

2008-07-01 Thread Andrew Haines
Felipe Monteiro de Carvalho wrote: Hello, Ok, so far we have a couple of unicode rtl proposals. Will FPC remain compatible with code from Tiburon? If yes, then we need a set of utf-16 RTL routines, even if another solution is chosen. Yes, surely the utf-16 routines could just call the

Re: [fpc-pascal] Tiburon and unicode rtl

2008-07-01 Thread Jonas Maebe
Message written on 01 Jul 2008, at 22:38, by Andrew Haines: Does gnu pascal have fpc compatible strings? or are they the same as strings in {$mode fpc} GNU Pascal only supports strings defined in terms of a schema: http://www.gnu-pascal.de/gpc/Schema-Types.html (which FPC

Re: [fpc-pascal] Unicode file routines proposal

2008-07-01 Thread Martin Schreiber
On Tuesday 01 July 2008 22.23:12 Marc Weustink wrote: Martin Schreiber wrote: On Tuesday 01 July 2008 18.32:30 Mattias Gärtner wrote: In these routines length(widestring), widestring[index], pwidechar^, pwidechar[index], pwidechar + offset, pwidechar - pwidechar and