Re: [fpc-pascal] Funny things about utf-8 strings on mac

2007-06-13 Thread Jonas Maebe
On 13 jun 2007, at 07:32, Daniël Mantione wrote: On Wed, 13 Jun 2007, Felipe Monteiro de Carvalho wrote: How would I then be sure that my string is never converted (or always converted from utf-8 to utf-8 if preferred), but just passed like I wrote it to the library that I am using? Add

Re: [fpc-pascal] Funny things about utf-8 strings on mac

2007-06-13 Thread Daniël Mantione
On Wed, 13 Jun 2007, Jonas Maebe wrote: On 13 jun 2007, at 07:32, Daniël Mantione wrote: On Wed, 13 Jun 2007, Felipe Monteiro de Carvalho wrote: How would I then be sure that my string is never converted (or always converted from utf-8 to utf-8 if preferred), but just passed

Re: [fpc-pascal] Funny things about utf-8 strings on mac

2007-06-13 Thread Florian Klaempfl
Jonas Maebe wrote: On 13 jun 2007, at 07:32, Daniël Mantione wrote: On Wed, 13 Jun 2007, Felipe Monteiro de Carvalho wrote: How would I then be sure that my string is never converted (or always converted from utf-8 to utf-8 if preferred), but just passed like I wrote it to the library

Re: [fpc-pascal] Funny things about utf-8 strings on mac

2007-06-13 Thread Florian Klaempfl
Jonas Maebe wrote: On 13 jun 2007, at 11:26, Florian Klaempfl wrote: Sorry, but this view is too terminal-centric as far as I am concerned. That's not something you want to tell users of a GUI app. Or even programmers, for that matter. I really don't see a reason why this should not be

Re: [fpc-pascal] Funny things about utf-8 strings on mac

2007-06-13 Thread Felipe Monteiro de Carvalho
On 6/13/07, Florian Klaempfl wrote: Then you have to use the utf8string type. If it doesn't work well enough, we have to fix it. Changing my program from string to utf8string didn't make any difference. The string is ok when I don't have a BOM and is wrong with a BOM program

Re: [fpc-pascal] Funny things about utf-8 strings on mac

2007-06-13 Thread Felipe Monteiro de Carvalho
On 6/13/07, Daniël Mantione wrote: How hard is it to add that widestringmanager? Correct version of this program: This also doesn't change the output of the program, with or without a BOM.

Re: [fpc-pascal] Funny things about utf-8 strings on mac

2007-06-13 Thread Jonas Maebe
On 13 jun 2007, at 13:33, Felipe Monteiro de Carvalho wrote: Changing my program from string to utf8string didn't make any difference. The string is ok when I don't have a BOM and is wrong with a BOM. I've in the meantime discovered that nl_langinfo always returns US-ASCII (or an empty

Re: [fpc-pascal] Funny things about utf-8 strings on mac

2007-06-13 Thread Jeff Wormsley
What happens if you redefine your program as follows?
    program utftestbom;
    {$mode objfpc}{$H+}
    uses SysUtils;
    const MyStr: UTF8String = 'Texto ł ñ ø ß á';
    var i: Integer;
    begin
      WriteLn('Printing string values');
      WriteLn('Length: ', Length(MyStr));
      for i := 1 to Length(MyStr) do

Re: [fpc-pascal] Funny things about utf-8 strings on mac

2007-06-13 Thread Jonas Maebe
On 13 jun 2007, at 14:21, Florian Klaempfl wrote: If Mac OS X always uses utf-8 for 8 bit strings, you can of course hardcode it in cwstring and not use iconv. Well, it's a bit more complicated than that, see http://cvs.gnupg.org/cgi-bin/viewcvs.cgi/trunk/intl/config.charset?rev=4343

Re: [fpc-pascal] Funny things about utf-8 strings on mac

2007-06-13 Thread Florian Klaempfl
Jonas Maebe wrote: On 13 jun 2007, at 14:21, Florian Klaempfl wrote: If Mac OS X always uses utf-8 for 8 bit strings, you can of course hardcode it in cwstring and not use iconv. Well, it's a bit more complicated than that, see

Re: [fpc-pascal] Funny things about utf-8 strings on mac

2007-06-13 Thread Daniël Mantione
On Wed, 13 Jun 2007, Florian Klaempfl wrote: Jonas Maebe wrote: On 13 jun 2007, at 14:21, Florian Klaempfl wrote: If Mac OS X always uses utf-8 for 8 bit strings, you can of course hardcode it in cwstring and not use iconv. Well, it's a bit more complicated than that, see

Re: [fpc-pascal] Funny things about utf-8 strings on mac

2007-06-13 Thread Felipe Monteiro de Carvalho
On 6/13/07, Jeff Wormsley wrote: What happens if you redefine your program as follows? Doesn't change anything. The core problem is (as Jonas said a few posts ago) that we need a way to define the output of the widestring manager. One possible way to do this would, when

Re: [fpc-pascal] Funny things about utf-8 strings on mac

2007-06-13 Thread Jonas Maebe
On 13 jun 2007, at 14:08, Jonas Maebe wrote: I've in the meantime discovered that nl_langinfo always returns US-ASCII (or an empty string) under Darwin, regardless of what your LANG/LC_* settings are. Forcing it to utf-8 fixes some problems, but there is another error in the generic code

Re: [fpc-pascal] Funny things about utf-8 strings on mac

2007-06-12 Thread Jonas Maebe
On 12 jun 2007, at 09:28, Felipe Monteiro de Carvalho wrote: I edited my source code with TextWrangler (a Macintosh text editor), setting the encoding to utf-8, and when I opened it with Lazarus it would show the beginning of the file like this: Ôªøunit mainform; Notice the first 3 funny

Re: [fpc-pascal] Funny things about utf-8 strings on mac

2007-06-12 Thread David Pethes
Felipe Monteiro de Carvalho wrote: Does anyone know what those funny characters are? (I suppose some kind of encoding setting) Your text editor saved those files with a byte order mark (BOM); see http://en.wikipedia.org/wiki/Byte_Order_Mark. Dunno about the second part of your question
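
The three characters described in this thread are consistent with the UTF-8 byte order mark, the byte sequence EF BB BF, being displayed in a legacy single-byte encoding. A minimal sketch of how one might check a source file for that marker before handing it to the compiler is shown here. The program name checkbom and the helper HasUtf8Bom are hypothetical names chosen for illustration; nothing in the thread says the posters used such a tool.

```pascal
program checkbom;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

{ Hypothetical helper: returns True when the file starts with the
  UTF-8 byte order mark (bytes $EF $BB $BF). }
function HasUtf8Bom(const FileName: string): Boolean;
var
  fs: TFileStream;
  buf: array[0..2] of Byte;
begin
  Result := False;
  fs := TFileStream.Create(FileName, fmOpenRead or fmShareDenyNone);
  try
    { Only compare when all three bytes could actually be read. }
    if fs.Read(buf, SizeOf(buf)) = SizeOf(buf) then
      Result := (buf[0] = $EF) and (buf[1] = $BB) and (buf[2] = $BF);
  finally
    fs.Free;
  end;
end;

begin
  if ParamCount < 1 then
  begin
    WriteLn('usage: checkbom <file>');
    Halt(1);
  end;
  if HasUtf8Bom(ParamStr(1)) then
    WriteLn('UTF-8 BOM present')
  else
    WriteLn('no UTF-8 BOM');
end.
```

Run against the same source file saved with and without the marker, a check like this makes it easier to tell which of the two cases discussed in the thread a given build is exercising.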

Re: [fpc-pascal] Funny things about utf-8 strings on mac

2007-06-12 Thread Felipe Monteiro de Carvalho
On 6/12/07, Jonas Maebe wrote: You said things did initially work with the UTF-8 marker in place. I didn't say they worked. I said that the source code compiled =) What happens is that Lazarus can't handle utf-8, so I open TextWrangler, edit the strings, close it, and go

Re: [fpc-pascal] Funny things about utf-8 strings on mac

2007-06-12 Thread Jonas Maebe
On 12 jun 2007, at 10:00, Felipe Monteiro de Carvalho wrote: The default code page used by FPC is 8859-1. However, the scanner detects the UTF-8 marker if present, and when it finds it then it switches the code page to UTF-8. You can also set the code page manually to UTF-8 using {$codepage

Re: [fpc-pascal] Funny things about utf-8 strings on mac

2007-06-12 Thread Felipe Monteiro de Carvalho
On 6/12/07, Jonas Maebe wrote: The compiler internally stores such strings as widestrings. I don't know the details of what the scanner does exactly with utf-8 and why it does so, but there's quite a bit of utf-8 specific code in the scanner. Should I submit a bug report?

Re: [fpc-pascal] Funny things about utf-8 strings on mac

2007-06-12 Thread Jonas Maebe
On 12 jun 2007, at 10:17, Felipe Monteiro de Carvalho wrote: On 6/12/07, Jonas Maebe wrote: The compiler internally stores such strings as widestrings. I don't know the details of what the scanner does exactly with utf-8 and why it does so, but there's quite a bit of utf-8

Re: [fpc-pascal] Funny things about utf-8 strings on mac

2007-06-12 Thread Florian Klaempfl
Felipe Monteiro de Carvalho wrote: The default code page used by FPC is 8859-1. However, the scanner detects the UTF-8 marker if present, and when it finds it then it switches the code page to UTF-8. You can also set the code page manually to UTF-8 using {$codepage utf-8}. Why does the

Re: [fpc-pascal] Funny things about utf-8 strings on mac

2007-06-12 Thread Felipe Monteiro de Carvalho
Daniël Mantione wrote some comments on my bug report: http://www.freepascal.org/mantis/view.php?id=9058 Could someone elaborate on this? I didn't really understand. Looks somewhat illogical to me ... so I write a UTF-8 string and need a widestring manager? But I am not using widestrings ... And

Re: [fpc-pascal] Funny things about utf-8 strings on mac

2007-06-12 Thread Florian Klaempfl
Felipe Monteiro de Carvalho wrote: Daniël Mantione wrote some comments on my bug report: http://www.freepascal.org/mantis/view.php?id=9058 Could someone elaborate on this? I didn't really understand. Looks somewhat illogical to me ... so I write a UTF-8 string and need a widestring

Re: [fpc-pascal] Funny things about utf-8 strings on mac

2007-06-12 Thread Jonas Maebe
On 12 Jun 2007, at 11:41, Florian Klaempfl wrote: Looks somewhat illogical to me ... so I write a UTF-8 string and need a widestring manager? But I am not using widestrings ... You are. String constants containing chars > 127 are obviously stored as widestrings because when you give a code page in

Re: [fpc-pascal] Funny things about utf-8 strings on mac

2007-06-12 Thread Florian Klaempfl
Jonas Maebe wrote: On 12 Jun 2007, at 11:41, Florian Klaempfl wrote: Looks somewhat illogical to me ... so I write a UTF-8 string and need a widestring manager? But I am not using widestrings ... You are. String constants containing chars > 127 are obviously stored as widestrings because when

Re: [fpc-pascal] Funny things about utf-8 strings on mac

2007-06-12 Thread Jonas Maebe
On 12 Jun 2007, at 20:40, Florian Klaempfl wrote: There's utf8encode/decode, but it's quite annoying if you have to replace all widestring-ansistring assignments and parameter passing code with that call (especially since the type conversion from widestring to ansistring is supposed to do

Re: [fpc-pascal] Funny things about utf-8 strings on mac

2007-06-12 Thread Felipe Monteiro de Carvalho
My current understanding of the issue is this: I don't tell fpc how my file is encoded, so it supposes it's Latin ISO. Then fpc detects the encoding of the operating system, and it sees that it's also Latin ISO, so no conversion is necessary. I suppose that if a different operating system

Re: [fpc-pascal] Funny things about utf-8 strings on mac

2007-06-12 Thread Daniël Mantione
On Wed, 13 Jun 2007, Felipe Monteiro de Carvalho wrote: How would I then be sure that my string is never converted (or always converted from utf-8 to utf-8 if preferred), but just passed like I wrote it to the library that I am using? Add the cwstring unit, and run it in a utf-8 terminal.
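
The two suggestions that recur in this thread, declaring the source code page explicitly and adding the cwstring unit, can be combined in a small test program. The following is only a sketch under assumptions: the program name utf8dump is hypothetical, the directive is written as {$codepage utf8}, and the actual conversion behaviour depends on the FPC version and platform (the thread itself reports problems on Darwin at the time). The hex dump is there so the bytes that end up in the string can be inspected without relying on how the terminal renders them.

```pascal
program utf8dump;
{$mode objfpc}{$H+}
{$codepage utf8}  { tell the compiler the source file is UTF-8, BOM or not }

uses
  {$ifdef unix}cwstring,{$endif}  { installs a widestring manager on Unix-like systems }
  SysUtils;

const
  MyStr: UTF8String = 'Texto ł ñ ø ß á';

var
  i: Integer;

begin
  { Length counts bytes here, not characters. }
  WriteLn('Length in bytes: ', Length(MyStr));

  { Dump every byte in hex so conversions become visible. }
  for i := 1 to Length(MyStr) do
    Write(IntToHex(Ord(MyStr[i]), 2), ' ');
  WriteLn;

  { What appears here also depends on the terminal's encoding. }
  WriteLn(MyStr);
end.
```

Comparing the hex dump between builds (with and without the directive, with and without cwstring) is one way to see whether the compiler or the runtime converted the constant, which is the question the posters keep returning to.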