Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Florian Klaempfl
Daniël Mantione wrote:
> The issue might be the UCS-2 encoding of your source; perhaps try feeding the compiler UTF-8. I didn't even know the compiler accepts UCS-2; it may not work correctly.
The compiler definitely does not accept UCS-2 encoded sources.

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Florian Klaempfl
Michael Schnell wrote:
> A decent system should be able to do the necessary conversions automatically:
This is a simplified view that ignores the resources this approach wastes, which are not visible in the academic example below. The conversion between UTF-8 and UTF-16 is a very expensive operation and the

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Michael Schnell
> The compiler definitely does not accept UCS-2 encoded sources.
I checked several times. My source file looks like this when I open it with UltraEdit and tell it to show the file in hex:
FF FE 75 00 6E 00 69 00 74 00 20 00 55 00 6E 00   ..u.n.i.t. .U.n.
Now I created a Delphi program and read the file

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Daniël Mantione
On Thu, 23 Oct 2008, Michael Schnell wrote:
> > The compiler definitely does not accept UCS-2 encoded sources.
> I checked several times. My source file looks like this when I open it with UltraEdit and tell it to show the file in hex:
> FF FE 75 00 6E 00 69 00 74 00 20 00 55 00 6E 00   ..u.n.i.t. .U.n.
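The hex dump quoted above can be checked mechanically. The following is a minimal Python sketch, not anything taken from the thread: it looks only at the leading byte order mark and decodes the remaining bytes. The helper name `sniff_bom` is hypothetical, and the check deliberately ignores UTF-32, whose little-endian BOM also begins with FF FE.

```python
import codecs

# The 16 bytes shown in the hex dump quoted above.
dump = bytes.fromhex("FF FE 75 00 6E 00 69 00 74 00 20 00 55 00 6E 00")

def sniff_bom(data):
    """Guess an encoding from a leading byte order mark (hypothetical helper)."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    return "unknown"

encoding = sniff_bom(dump)

# Skip the 2-byte BOM before decoding the UTF-16 payload.
text = dump[2:].decode(encoding) if encoding.startswith("utf-16") else ""

# The same text stored as UTF-8 with a BOM, for a size comparison.
utf8_form = codecs.BOM_UTF8 + text.encode("utf-8")
```

For these bytes the BOM indicates little-endian UTF-16 and the payload decodes to the start of a Pascal unit header. The UTF-8 form of the same text is shorter (3 BOM bytes plus one byte per ASCII character), so the size of the file on disk is a quick hint as to which encoding was actually saved.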

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Michael Schnell
> As has been said before: the compiler itself simply does not support UCS-2. Regardless of any BOM, compiler setting or Lazarus setting, it will not understand it.
See my other post in this thread: Windows XP seems to play some tricks on us here, so that UltraEdit sees the UCS-2 coded file

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Michael Schnell
> The conversion between UTF-8 and UTF-16 is a very expensive operation, and the compiler has to insert it all over the place, and people would cry about the performance of their programs.
Of course I agree. If you care about performance, you need to know what to do: either use WideString all over

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Vincent Snijders
Michael Schnell wrote:
> > The conversion between UTF-8 and UTF-16 is a very expensive operation, and the compiler has to insert it all over the place, and people would cry about the performance of their programs.
> Of course I agree. If you care about performance, you need to know what to do:

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Florian Klaempfl
Michael Schnell wrote:
> > The conversion between UTF-8 and UTF-16 is a very expensive operation, and the compiler has to insert it all over the place, and people would cry about the performance of their programs.
> Of course I agree. If you care about performance, you need to know what to do:

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Michael Schnell
> UTF-16 application shouldn't do this either: it doesn't handle surrogates properly
Right you are. For me, WideString is UCS-2 and not UTF-16, as I regard it as a sequence of WideChar, so that Unicode user code can be written using WideChar and WideString. WideChar has only 16 bits. So this
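The UCS-2 versus UTF-16 distinction discussed here comes down to surrogate pairs. As a rough Python illustration (the helper `utf16_units` is hypothetical and not part of any FPC or Lazarus API), a character inside the Basic Multilingual Plane occupies one 16-bit unit, while a character above it occupies two:

```python
import struct

def utf16_units(s):
    """Return the 16-bit code units of s in UTF-16 (hypothetical helper)."""
    raw = s.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(raw) // 2), raw))

bmp = "\u00e4"         # a with diaeresis, inside the Basic Multilingual Plane
astral = "\U0001D11E"  # MUSICAL SYMBOL G CLEF, above the BMP

bmp_units = utf16_units(bmp)        # a single 16-bit unit
astral_units = utf16_units(astral)  # a high and a low surrogate
```

Code that treats a 16-bit string as "one unit per character" is therefore correct only as long as no surrogate pairs occur, which is the UCS-2 assumption described in the message above.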

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Marco van de Voort
In our previous episode, Florian Klaempfl said: But if you use UTF8String you need to be aware that you can't do simple and totally normal things like s := copy(s, 3); to get the first three characters of a string. Really finding the first three characters of a string is an interesting and
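The concern about cutting a UTF-8 string by index can be shown with a short Python sketch. It is an analogy only: Python byte strings stand in for a byte-indexed UTF8String, and nothing here reflects how the FPC runtime actually implements `copy`.

```python
s = "\u00e4\u00f6\u00fc\u00df"  # four characters: a/o/u with diaeresis, sharp s
utf8 = s.encode("utf-8")        # eight bytes, two per character

first_three_bytes = utf8[:3]    # byte-wise cut, like a byte-indexed copy
first_three_chars = s[:3]       # cut counted in code points

# The byte-wise cut ends in the middle of the second character,
# so the result is no longer valid UTF-8.
try:
    first_three_bytes.decode("utf-8")
    cut_is_valid = True
except UnicodeDecodeError:
    cut_is_valid = False
```

Counting in code points avoids the broken sequence, but as the surrounding messages point out, even code points are not the same thing as user-visible characters.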

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Michael Schnell
> UltraEdit might fool you here. It edits either ANSI or UCS-2. If you have a UTF-8 encoded file, it will show the contents in hex as being UCS-2.
That might be. But would it even virtually insert a BOM?! Why should it do this when using the hex editor?
-Michael

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Michael Schnell
> More importantly, most such routines will be implicitly tied to a certain language or language group already.
Which kind of UCS-2 based functions do you think are tied to a language (group)?
-Michael
___ fpc-devel maillist -

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Jonas Maebe
On 23 Oct 2008, at 13:41, Michael Schnell wrote:
> > UTF-16 application shouldn't do this either: it doesn't handle surrogates properly
> Right you are. For me, WideString is UCS-2 and not UTF-16, as I regard it as a sequence of WideChar, so that Unicode user code can be written using WideChar and

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Florian Klaempfl
Michael Schnell wrote:
> > More importantly, most such routines will be implicitly tied to a certain language or language group already.
> Which kind of UCS-2 based functions do you think are tied to a language (group)?
Bidi stuff? You are aware of the fact that Unicode strings can

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Martin Schreiber
On Thursday 23 October 2008 13.31:30 Florian Klaempfl wrote:
> This is also a simplified view.
> - Firstly, which real-world (!) task really requires executing an operation like this? Mostly it's something like copy(s,pos(...),...);
> - Secondly, a properly coded UTF-16 application shouldn't do

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Marc Weustink
Michael Schnell wrote:
> > UltraEdit might fool you here. It edits either ANSI or UCS-2. If you have a UTF-8 encoded file, it will show the contents in hex as being UCS-2.
> That might be. But would it even virtually insert a BOM?! Why should it do this when using the hex editor?
Since it

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Michael Schnell
> Bidi stuff? You are aware of the fact that Unicode strings can contain e.g. bidi markers?
Sorry, never heard of bidi :(
-Michael
___ fpc-devel maillist - fpc-devel@lists.freepascal.org http://lists.freepascal.org/mailman/listinfo/fpc-devel

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Florian Klaempfl
Michael Schnell wrote:
> > Bidi stuff? You are aware of the fact that Unicode strings can contain e.g. bidi markers?
> Sorry, never heard of bidi :(
http://www.unicode.org/reports/tr9/

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Michael Schnell
> If you want WideString, then maybe MSEide is a better option for you.
Again, I do know this, and in fact I don't have a project that needs Unicode. But the reason I started this thread is to help make Lazarus / FPC even more useful.
-Michael

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Michael Schnell
> Since it converts the UTF-8 file internally to UCS-2 on read before editing.
Seems really silly to me. But the file length really indicates that it's UTF-8 coded, and when looking at the file with WinCommander's hex viewer it's UTF-8. So I suppose that you are right and the nasty trick is

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Martin Schreiber
On Thursday 23 October 2008 13.58:04 Michael Schnell wrote:
> > Bidi stuff? You are aware of the fact that Unicode strings can contain e.g. bidi markers?
> Sorry, never heard of bidi :(
Bidirectional text. Much more important than the hypothetical code points above the BMP. MSEgui does not

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Michael Schnell
> I doubt that you will never need to support decomposed characters (such as ä being encoded as basically a¨). It's not that uncommon.
This is the nasty old stuff that Unicode should help us get rid of.
-Michael
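The composed and decomposed forms mentioned above can be compared directly. This is a small Python sketch using the standard `unicodedata` module; it assumes the decomposed form uses U+0308 COMBINING DIAERESIS, which is what Unicode normalization produces, rather than the spacing diaeresis typed in the message.

```python
import unicodedata

composed = "\u00e4"     # a with diaeresis as a single code point
decomposed = "a\u0308"  # a followed by COMBINING DIAERESIS

# A plain comparison sees two different strings.
same_without_normalizing = composed == decomposed

# Normalizing both sides to the same form makes them comparable.
nfc = unicodedata.normalize("NFC", decomposed)  # compose
nfd = unicodedata.normalize("NFD", composed)    # decompose
```

Both spellings render identically, so any comparison, search or length calculation that should match what the user sees has to normalize first.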

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Marc Weustink
Michael Schnell wrote:
> > Since it converts the UTF-8 file internally to UCS-2 on read before editing.
> Seems really silly to me.
No, it's not. This way you only have to support two editors internally: one with byte chars and one with word chars (ignoring surrogates and other stuff).
> But the file

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Michael Schnell
> http://www.unicode.org/reports/tr9/
Thanks. I see. (In fact, I even wrote embedded software for a display that can show Hebrew text. But this was with ANSI code.)
-Michael

Re[2]: [fpc-devel] assign constant text to widestring

2008-10-23 Thread JoshyFun
Hello Michael,
Thursday, October 23, 2008, 1:46:48 PM, you wrote:
>> More importantly, most such routines will be implicitly tied to a certain language or language group already.
MS> Which kind of UCS-2 based functions do you think are tied to a language (group)?
UpperCase, LowerCase,

Re[2]: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Daniël Mantione
On Thu, 23 Oct 2008, JoshyFun wrote:
> Hello Michael,
> Thursday, October 23, 2008, 1:46:48 PM, you wrote:
> >> More importantly, most such routines will be implicitly tied to a certain language or language group already.
> MS> Which kind of UCS-2 based functions do you think are tied to a

Re[3]: [fpc-devel] assign constant text to widestring

2008-10-23 Thread JoshyFun
Hello Daniël,
Thursday, October 23, 2008, 5:34:59 PM, you wrote:
DM> Don't exaggerate; this is true with plain ASCII as well. Non-English software has existed for over 5 decades, and nothing has stopped us from writing code that performs the functions you name.
I'm not exaggerating,

Re[3]: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Daniël Mantione
On Thu, 23 Oct 2008, JoshyFun wrote:
> Hello Daniël,
> Thursday, October 23, 2008, 5:34:59 PM, you wrote:
> DM> Don't exaggerate; this is true with plain ASCII as well. Non-English software has existed for over 5 decades, and nothing has stopped us from writing code that performs the

Re: [fpc-devel] FPC_HAS_FEATURE_SUPPORT

2008-10-23 Thread Mattias Gaertner
On Thu, 23 Oct 2008 08:53:27 +0200 (CEST) Peter Vreman wrote:
> On Wed, 22 Oct 2008 10:32:36 +0200 (CEST) Peter Vreman wrote:
> > As of version 2.3.1, the compiler itself indicates all the features it supports with FPC_HAS_FEATURE_XXX defines.

Re: [fpc-devel] FPC_HAS_FEATURE_SUPPORT

2008-10-23 Thread Michael Van Canneyt
On Thu, 23 Oct 2008, Mattias Gaertner wrote:
> On Thu, 23 Oct 2008 08:53:27 +0200 (CEST) Peter Vreman wrote:
> > On Wed, 22 Oct 2008 10:32:36 +0200 (CEST) Peter Vreman wrote:
> > > As of version 2.3.1, the compiler itself indicates all the various

Re: [fpc-devel] FPC_HAS_FEATURE_SUPPORT

2008-10-23 Thread Vincent Snijders
Michael Van Canneyt wrote:
> And did you fix the 'TObject not found' with a short-term solution? :-)
Maybe svn up -r11887 (in fpc/trunk)
Vincent

Re: [fpc-devel] FPC_HAS_FEATURE_SUPPORT

2008-10-23 Thread Michael Van Canneyt
On Thu, 23 Oct 2008, Vincent Snijders wrote:
> Michael Van Canneyt wrote:
> > And did you fix the 'TObject not found' with a short-term solution? :-)
> Maybe svn up -r11887 (in fpc/trunk)
home: svn log -r 11887 .

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread listmember
DM> Example: In Dutch, uppercase characters generally do not get tremas: Daniël becomes DANIEL. Should an uppercase routine worry? No, this is a spelling convention; the correct uppercase of ë is Ë. We should not confuse spelling with uppercasing.
No. This is not a spelling convention. It is

Re: [fpc-devel] FPC_HAS_FEATURE_SUPPORT

2008-10-23 Thread Vincent Snijders
Michael Van Canneyt wrote:
> On Thu, 23 Oct 2008, Vincent Snijders wrote:
> > Michael Van Canneyt wrote:
> > > And did you fix the 'TObject not found' with a short-term solution? :-)
> > Maybe svn up -r11887 (in fpc/trunk)
> home: svn log -r 11887 .

Re[2]: [fpc-devel] assign constant text to widestring

2008-10-23 Thread JoshyFun
Hello listmember,
Thursday, October 23, 2008, 11:58:51 PM, you wrote:
l> Yes, it is imperative that we know the language the word is in, so that
l> UpperCase(sólo, langSpanish) --> SÓLO
l> UpperCase(solo, langSpanish) --> SOLO
l> Otherwise, we may end up altering the meaning of the text.

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread Felipe Monteiro de Carvalho
I agree with Daniël on this one. Simplify. ë --> Ë always. If you need something that takes the language into consideration, then build another routine with more parameters.
--
Felipe Monteiro de Carvalho
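The two positions in this sub-thread can be put side by side in a short Python sketch. Python's `str.upper` applies the language-independent Unicode case mapping, which is the "ë --> Ë always" behaviour. The function `upper_for_language` and its `lang` codes are hypothetical, added only to illustrate the "another routine with more parameters" idea; it special-cases nothing but the Turkish dotted and dotless i.

```python
# Language-independent mapping: the same result regardless of language.
simple = "\u00eb".upper()        # e with diaeresis maps to its capital form
name = "Dani\u00ebl".upper()     # the diaeresis is kept when uppercasing
sharp_s = "stra\u00dfe".upper()  # sharp s expands to "SS", so the length changes

def upper_for_language(text, lang):
    """Hypothetical language-aware wrapper around the default mapping."""
    if lang == "tr":
        # Turkish: dotted i uppercases to U+0130, dotless U+0131 to plain I.
        text = text.replace("i", "\u0130").replace("\u0131", "I")
    return text.upper()

default_result = upper_for_language("izmir", "en")
turkish_result = upper_for_language("izmir", "tr")
```

The default routine stays simple and predictable, while callers that know the language of their text opt in to the extra rules explicitly.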

Re: [fpc-devel] assign constant text to widestring

2008-10-23 Thread listmember
On 2008-10-24 02:46, Felipe Monteiro de Carvalho wrote:
> I agree with Daniël on this one. Simplify. ë --> Ë always. If you need something that takes the language into consideration, then build another routine with more parameters.
It's not that simple. How would you uppercase this piece of