
Re: [Lazarus] UTF8String and UTF8Delete

2015-12-16 Thread Bart

On 12/10/15, Jürgen Hestermann wrote:
> But now I have a problem with UTF8Strings:
> With this declaration
>
> var S : UTF8String;
>
> I want to delete a character
>
> UTF8Delete(S,1,1);
>
> but I get an error that the (var) parameter mismatches.

Fixed in r50850.
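
For readers on a LazUtils revision older than r50850, the mismatch can be sidestepped roughly as sketched here. This is a workaround sketch, not the change committed in r50850. It assumes the LazUTF8 unit is on the unit path and that its UTF-8 RTL mode is active, so that assigning a UTF8String to a String does not convert the content. The program name and the variable Tmp are illustrative only.

```pascal
program Utf8DeleteSketch;  // hypothetical program name
{$mode objfpc}{$H+}
uses
  LazUTF8;  // provides UTF8Delete(var s: String; StartCharIndex, CharCount: PtrInt)
var
  S: UTF8String;
  Tmp: String;  // illustrative temporary with the static code page UTF8Delete expects
begin
  S := 'abc';             // ASCII only, so no literal-encoding questions arise
  Tmp := S;               // assumption: no conversion, both sides hold UTF-8
  UTF8Delete(Tmp, 1, 1);  // compiles: the var parameter is a String
  S := Tmp;
  WriteLn(S);
end.
```

The extra assignments copy the string reference back and forth, so this is meant for code that cannot yet move to a newer LazUtils, not as a general pattern.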

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-16 Thread Sven Barth

On 16.12.2015 18:12, Bart wrote: On 12/10/15, Jürgen Hestermann wrote: But now I have a problem with UTF8Strings: With this declaration var S : UTF8String; I want to delete a character UTF8Delete(S,1,1); but I get an error that the (var) parameter mismatches.

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-16 Thread Bart

On 12/16/15, Sven Barth wrote: > >> The code that was committed for UTF8Delete(utf8string) in >> http://svn.freepascal.org/cgi-bin/viewvc.cgi/trunk/components/lazutils/lazutf8.pas?root=lazarus=50850=50849=50850 >> makes no sense: Can you continue this discussion on

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-14 Thread Graeme Geldenhuys

On 2015-12-14 19:23, wkitt...@windstream.net wrote: > > yup, definitely not... there aren't as many as there used to be, wow, very interesting. > more... the big thing, today, is that we've been able to use virtual modems > which speak old style serial comms on the one side and telnet, vmodem

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-14 Thread Mattias Gaertner

On Tue, 15 Dec 2015 00:29:23 + Graeme Geldenhuys wrote: > On 2015-12-14 19:23, wkitt...@windstream.net wrote: > > > > yup, definitely not... there aren't as many as there used to be, > > wow, very interesting. .., but off-topic. Please continue such a topic

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-14 Thread Michael Schnell

On 12/12/2015 07:21 PM, Jürgen Hestermann wrote: As said: The docu in the wikis is very confusing and contradicting, fully understandable only for those who already know the details. This is obvious by the always repeating and long winding discussions on that issue. It supposedly can't be

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-14 Thread Juha Manninen

On Mon, Dec 14, 2015 at 7:43 PM, wrote: > we're talking about text editors on traditional old school BBSes and offline > readers... the old qedit is one that comes to mind for use with offline > readers though some real masochists used edlin ;) I thought BBS was made

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-14 Thread wkitty42

On 12/14/2015 01:09 PM, Juha Manninen wrote: On Mon, Dec 14, 2015 at 7:43 PM, wrote: we're talking about text editors on traditional old school BBSes and offline readers... the old qedit is one that comes to mind for use with offline readers though some real

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-14 Thread wkitty42

On 12/14/2015 04:06 AM, Graeme Geldenhuys wrote: On 2015-12-14 03:40, wkitt...@windstream.net wrote: You mean you convert from Unicode to CP437 system codepage? yes... plain text readers and editors cannot handle the fancy mess of today's world... they only know CP437 or maybe CP850... Wow,

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-13 Thread Jürgen Hestermann

Am 2015-12-12 um 19:34 schrieb Bart: > There is no need for such a tone, please! That's what I thought too as I read Juha's answer telling me that I was just too "dummy" to have used UTF8Strings. > Grasping the concepts of the new CP aware strings and all its > implications is not that easy.

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-13 Thread Bart

On 12/13/15, Mattias Gaertner wrote: >> If you write a program in Lazarus which uses the LCL (which in turn >> uses LazUtils) then this is done for you by the LazUtils package at >> initialization of the fpcadds unit. > > To be exact: It is the unit LazUTF8 of package

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-13 Thread Jürgen Hestermann

Am 2015-12-12 um 23:28 schrieb Juha Manninen: > On Sat, Dec 12, 2015 at 8:38 PM, wrote: >> especially those readers from the ancient past (TP/BP days) who are trying >> to catch up to the modern future... > LCL has supported UTF-8 for > 10 years. As a long time Lazarus

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-13 Thread Juha Manninen

On Sun, Dec 13, 2015 at 12:37 PM, Jürgen Hestermann wrote: > Then I would suggest to remove this type. FPC and Lazarus are 2 different projects. Please make such requests in FPC's lists if you must. Repeating the same whining again and again about Lazarus Unicode

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-13 Thread Bart

On 12/13/15, Jürgen Hestermann wrote: > > And hence, at that point in time, there would (for programmers using > > Lazarus) be no need to use the type Utf8String at all. > > Then I would suggest to remove this type. No, it is there for Delphi compatibility, so it

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-13 Thread Mattias Gaertner

On Sun, 13 Dec 2015 12:46:40 +0100 Bart wrote: > On 12/13/15, Mattias Gaertner wrote: > > >> If you write a program in Lazarus which uses the LCL (which in turn > >> uses LazUtils) then this is done for you by the LazUtils package at > >>

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-13 Thread Sven Barth

On 13.12.2015 11:37, Jürgen Hestermann wrote: > And hence, at that point in time, there would (for programmers using > Lazarus) be no need to use the type Utf8String at all. Then I would suggest to remove this type. Utf8String is part of the FPC RTL and is just a shorthand for

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-13 Thread wkitty42

On 12/12/2015 05:28 PM, Juha Manninen wrote: On Sat, Dec 12, 2015 at 8:38 PM, wrote: especially those readers from the ancient past (TP/BP days) who are trying to catch up to the modern future... LCL has supported UTF-8 for > 10 years. As a long time Lazarus user

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-13 Thread Sven Barth

On 12.12.2015 18:40, Jürgen Hestermann wrote: Am 2015-12-12 um 18:20 schrieb Sven Barth: Yes, internally Windows uses UTF-16, but if you set your Windows Ansi code page or at least the current thread's locale to UTF-8 (indirectly by choosing a locale that has UTF-8 as code page, I don't know

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-13 Thread Sven Barth

On 13.12.2015 10:55, Jürgen Hestermann wrote: > For most cases things did not change very much. Earlier you had to use > the explicit UTF8...() functions, now you don't need them. What does this mean? If I use DELETE on a String type (that is an UTF-8 string) does it now use UTF8Delete

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-13 Thread Juha Manninen

On Sun, Dec 13, 2015 at 11:18 PM, wrote: > i don't because i'm just barely dipping my toes into the UTF-8 pool... one > of my first tasks was to convert today's mess back to CP437 for posting in > pure text environments... it wasn't too hard but it was tedious with

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-13 Thread wkitty42

On 12/13/2015 05:43 PM, Juha Manninen wrote: On Sun, Dec 13, 2015 at 11:18 PM, wrote: i don't because i'm just barely dipping my toes into the UTF-8 pool... one of my first tasks was to convert today's mess back to CP437 for posting in pure text environments... it

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-12 Thread Jürgen Hestermann

Am 2015-12-11 um 19:14 schrieb Sven Barth: > Windows uses multi byte strings (one byte per character or more) > and UTF-16 (which is mostly 2 Byte and 4 for surrogate pairs). > The functions WideCharToMultiByte and MultiByteToWideChar which > are also used inside FPC for string conversions both

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-12 Thread wkitty42

On 12/12/2015 10:47 AM, Bart wrote: Anyhow, as stated before, there should be no need to use the type Utf8String in Lazarus programs. i've been trying to follow along and keep up with this but this statement confuses me... how do you designate that a string is utf8 if you don't use that type?

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-12 Thread Jürgen Hestermann

Am 2015-12-12 um 17:16 schrieb wkitt...@windstream.net: On 12/12/2015 10:47 AM, Bart wrote: Anyhow, as stated before, there should be no need to use the type Utf8String in Lazarus programs. i've been trying to follow along and keep up with this but this statement confuses me... how do you

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-12 Thread Juha Manninen

On Sat, Dec 12, 2015 at 6:48 PM, Jürgen Hestermann wrote: > The problem is the "proper data type". It is UnicodeString (or maybe WideString) with Windows API 'W' functions. Everywhere else it is just String. > In the past I used UTF8String for all my strings > and

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-12 Thread Sven Barth

On 12.12.2015 12:46, Jürgen Hestermann wrote: Am 2015-12-11 um 19:14 schrieb Sven Barth: > Windows uses multi byte strings (one byte per character or more) > and UTF-16 (which is mostly 2 Byte and 4 for surrogate pairs). > The functions WideCharToMultiByte and MultiByteToWideChar which > are

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-12 Thread Jürgen Hestermann

Am 2015-12-12 um 18:20 schrieb Sven Barth: Yes, internally Windows uses UTF-16, but if you set your Windows Ansi code page or at least the current thread's locale to UTF-8 (indirectly by choosing a locale that has UTF-8 as code page, I don't know one right now though) then the *A functions

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-12 Thread Jürgen Hestermann

Am 2015-12-12 um 18:02 schrieb Sven Barth: On 12.12.2015 17:37, Jürgen Hestermann wrote: Is it correct that now every ansistring has a static code page and a dynamic code page (as mentioned in http://wiki.freepascal.org/FPC_Unicode_support)? Yes. Is it correct that each ansistring type can

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-12 Thread Jürgen Hestermann

Am 2015-12-12 um 18:10 schrieb Juha Manninen: > That was kind of dummy thing to do because UTF8String was an alias for > AnsiString then. > You could have used "String" always. > Now UTF8String is no more an alias. What an arrogant answer! I read it like: "You should have known that UTF8String

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-12 Thread wkitty42

On 12/12/2015 11:21 AM, Juha Manninen wrote: On Sat, Dec 12, 2015 at 6:16 PM, wrote: i mean, i'm really old school... i normally use just string or string[xx]... sometimes ansistring... it is all so damned confusing now :( It is no more confusing than in Delphi

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-12 Thread Jürgen Hestermann

Am 2015-12-12 um 18:04 schrieb Bart: On 12/12/15, Jürgen Hestermann wrote: "Since FPC 2.7.1 the default system codepage of the RTL can be changed to UTF-8 (CP_UTF8). So Windows users can now use UTF-8 strings in the RTL. " It *can* be changed (but how?).

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-12 Thread Bart

On 12/11/15, Sven Barth wrote: > Not necessarily. You can use SetCodePage() to change the code page of > the string without triggering a codepage conversion by using the third > parameter which is a Boolean that tells the function to either do a > conversion (True;
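
The behaviour Sven describes can be sketched as follows. SetCodePage and StringCodePage are routines of the FPC 3.0 system unit; the program name and the variable R are illustrative, and this is a sketch rather than code from the thread.

```pascal
program SetCodePageSketch;  // hypothetical program name
{$mode objfpc}{$H+}
var
  R: RawByteString;  // illustrative variable
begin
  R := 'abc';
  // Third parameter False: only the code page tag stored with the string
  // is changed; the bytes themselves are left untouched.
  SetCodePage(R, CP_UTF8, False);
  // Report the dynamic code page now attached to R.
  WriteLn(StringCodePage(R));
end.
```

Passing True instead asks the RTL to convert the content to the new code page, which is the case where data can be altered.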

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-12 Thread Juha Manninen

On Sat, Dec 12, 2015 at 5:47 PM, Bart wrote: > Move(S[1], Temp[1], Length(S)); > ... > Move(Temp[1], S[1], Length(Temp)); No good. Moving the data in memory *twice* kills performance completely. Current Utf8Delete calls Delete which also moves data but only once. >

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-12 Thread Bart

On 12/12/15, Jürgen Hestermann wrote:
> "Since FPC 2.7.1 the default system codepage of the RTL can be changed to
> UTF-8 (CP_UTF8). So Windows users can now use UTF-8 strings in the RTL. "
>
> It *can* be changed (but how?).

DefaultSystemCodePage := CP_UTF8;

Bart
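
Bart's one-line answer, placed in a complete program as a minimal sketch. The direct assignment follows his reply; the FPC RTL also offers SetMultiByteConversionCodePage for the same purpose. The program name is hypothetical. According to other messages in this thread, programs that use the LCL do not need this line because LazUtils sets the code page during initialization.

```pascal
program DefaultCodePageSketch;  // hypothetical program name
{$mode objfpc}{$H+}
begin
  // Code page the RTL currently assumes for String / AnsiString (CP_ACP).
  WriteLn('before: ', DefaultSystemCodePage);
  // Switch the RTL default to UTF-8, as suggested in the reply above.
  DefaultSystemCodePage := CP_UTF8;
  WriteLn('after:  ', DefaultSystemCodePage);
end.
```

Doing this once, early in the main program, is enough since the variable is global to the RTL; it does not have to be repeated per unit.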

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-12 Thread Jürgen Hestermann

Am 2015-12-12 um 18:21 schrieb Juha Manninen: On Sat, Dec 12, 2015 at 6:54 PM, Jürgen Hestermann wrote: Am 2015-12-10 um 18:22 schrieb Juha Manninen: http://wiki.freepascal.org/Better_Unicode_Support_in_Lazarus "String" type is UTF-8 and it works now (almost)

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-12 Thread Sven Barth

On 12.12.2015 16:47, Bart wrote: On 12/11/15, Sven Barth wrote: Not necessarily. You can use SetCodePage() to change the code page of the string without triggering a codepage conversion by using the third parameter which is a Boolean that tells the function to

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-12 Thread Juha Manninen

On Sat, Dec 12, 2015 at 6:16 PM, wrote: > i mean, i'm really old school... i normally use just string or string[xx]... > sometimes ansistring... it is all so damned confusing now :( It is no more confusing than in Delphi where the default String has UTF-16 encoding. In

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-12 Thread Juha Manninen

On Sat, Dec 12, 2015 at 1:46 PM, Jürgen Hestermann wrote: > Otherwise we would not have this problem and could use UTF-8 as > a standard for everything. What is the problem exactly? Always call the Windows API 'W'-functions and use proper data types for them. The

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-12 Thread Jürgen Hestermann

Am 2015-12-12 um 17:25 schrieb Juha Manninen: On Sat, Dec 12, 2015 at 1:46 PM, Jürgen Hestermann wrote: Otherwise we would not have this problem and could use UTF-8 as a standard for everything. What is the problem exactly? Always call the Windows API 'W'-functions

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-12 Thread Jürgen Hestermann

Am 2015-12-10 um 18:22 schrieb Juha Manninen: On Thu, Dec 10, 2015 at 6:49 PM, Jürgen Hestermann wrote: How can I use UTF8Delete on an UTF8string? You can't. Please read this : http://wiki.freepascal.org/Better_Unicode_Support_in_Lazarus "String" type is UTF-8 and

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-12 Thread Sven Barth

On 12.12.2015 17:37, Jürgen Hestermann wrote: Is it correct that now every ansistring has a static code page and a dynamic code page (as mentioned in http://wiki.freepascal.org/FPC_Unicode_support)? Yes. Is it correct that each ansistring type can store strings with any encoding (dynamic

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-12 Thread Bart

On 12/12/15, Sven Barth wrote: > Jonas has given me the following as a possible solution: > > === code begin === > > procedure UTF8Delete(var s: UTF8String; StartCharIndex, CharCount: PtrInt); >begin > ... >end; > > > procedure UTF8Delete(var s: String;

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-12 Thread Juha Manninen

On Sat, Dec 12, 2015 at 6:54 PM, Jürgen Hestermann wrote: > Am 2015-12-10 um 18:22 schrieb Juha Manninen: >> http://wiki.freepascal.org/Better_Unicode_Support_in_Lazarus >> "String" type is UTF-8 and it works now (almost) transparently without >> explicit conversions.

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-12 Thread Bart

On 12/12/15, Jürgen Hestermann wrote: >>> It *can* be changed (but how?). >> DefaultSystemCodePage := CP_UTF8; >> > So I need to add this to my program(s) now? > Where? > In each unit? > In the main program? If you write a program in Lazarus which uses the LCL (which

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-12 Thread wkitty42

On 12/12/2015 12:47 PM, Jürgen Hestermann wrote: Am 2015-12-12 um 18:21 schrieb Juha Manninen: [...] Then it explains the technical details of how it was implemented. This may be true for those who coded the new string mode or those who wrote the wiki but not for the readers who try to find

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-12 Thread Jürgen Hestermann

Am 2015-12-12 um 19:00 schrieb Bart: >> Then why does it say: >> "Since FPC 2.7.1 the default system codepage of the RTL can be changed to UTF-8 >> (CP_UTF8)" >> It should say: >> "Since FPC 2.7.1 the default system codepage of the RTL *is* UTF-8 (CP_UTF8)" > Why in the world would you think so?

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-12 Thread Bart

On 12/12/15, Jürgen Hestermann wrote: > Again a very arrogant attitude. > I read it dozens of times but it is totally confusing and contradicting. There is no need for such a tone, please! "We" try to explain things as best as we can, but somehow fail. That's

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-12 Thread Mattias Gaertner

On Sat, 12 Dec 2015 16:58:38 +0100 Sven Barth wrote: >[...] > procedure UTF8Delete(var s: UTF8String; StartCharIndex, CharCount: PtrInt); >begin > ... >end; > > > procedure UTF8Delete(var s: String; StartCharIndex, CharCount: PtrInt); >var >

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-12 Thread Mattias Gaertner

On Sat, 12 Dec 2015 19:00:07 +0100 Bart wrote: >[...] > If you write a program in Lazarus which uses the LCL (which in turn > uses LazUtils) then this is done for you by the LazUtils package at > initialization of the fpcadds unit. To be exact: It is the unit LazUTF8 of

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-12 Thread Juha Manninen

On Sat, Dec 12, 2015 at 8:38 PM, wrote: > especially those readers from the ancient past (TP/BP days) who are trying > to catch up to the modern future... LCL has supported UTF-8 for > 10 years. As a long time Lazarus user you should know it. For most cases things did

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-12 Thread Juha Manninen

On Sat, Dec 12, 2015 at 8:21 PM, Jürgen Hestermann wrote: > But if it not only can be changed but already *is* changed > then the wiki text needs a change too. I improved the text in wiki. > As said: The docu in the wikis is very confusing and contradicting, > fully

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-12 Thread Mattias Gaertner

On Sat, 12 Dec 2015 12:43:57 -0500 wkitt...@windstream.net wrote: >[...] > > In Lazarus it is now UTF-8. Besides, it is amazingly compatible with > > Delphi at source level. Just to clarify: It is amazing how small the percentage of most program sources is that handles non English characters.

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-11 Thread Ondrej Pokorny

On 11.12.2015 09:10, "Jürgen Hestermann" wrote: UTF8Delete probably takes an AnsiString (or String) as var parameter and for var parameters the static codepages have to match exactly (String has CP_ACP while Utf8String has CP_UTF8). Just please help me understand this: The LazUTF8 unit

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-11 Thread Jürgen Hestermann

>> For what else should I use UTF8delete if not for UTF8strings? >For "UTF8 strings". >An "UTF8String" and an "UTF8 String" are two different things for the >compiler. See below. What is the difference? The link does not tell me. As far as I know, there is no (useful) usage of

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-11 Thread Sven Barth

Am 11.12.2015 08:20 schrieb "Martin Schreiber" : > > On Friday 11 December 2015 08:05:12 Sven Barth wrote: > > Am 10.12.2015 23:04 schrieb "Mattias Gaertner" < nc-gaert...@netcologne.de>: > > > > > > What about: > > > > > > UTF8Delete(AnsiString(Pointer(s)),1,1); > > > > While

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-11 Thread Mattias Gaertner

On Fri, 11 Dec 2015 09:10:42 +0100 "Jürgen Hestermann" wrote: > UTF8Delete probably takes an AnsiString (or String) as var parameter and for > var parameters the static codepages have to match exactly (String has CP_ACP > while Utf8String has CP_UTF8). > > Just

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-11 Thread Mattias Gaertner

On Fri, 11 Dec 2015 08:05:12 +0100 Sven Barth wrote: >[...] > > UTF8Delete(AnsiString(Pointer(s)),1,1); > > While the typecast itself would probably work I strongly advise against it > since you're relying on implementation details. True. > Also I doubt that you

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-11 Thread Mattias Gaertner

On Fri, 11 Dec 2015 14:32:14 +0200 Juha Manninen wrote: > On Fri, Dec 11, 2015 at 2:12 PM, Mattias Gaertner > wrote: > > On Fri, 11 Dec 2015 08:05:12 +0100 > > Sven Barth wrote: > >> Also I doubt that you can do

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-11 Thread Juha Manninen

On Fri, Dec 11, 2015 at 2:29 PM, "Jürgen Hestermann" wrote: > What is the difference? The link does not tell me. Did you actually read the wiki-page? The new UTF-8 system is explained there. It is a hack but a clever one. The latest chapter by Mattias explains why

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-11 Thread Jürgen Hestermann

>> >An "UTF8String" and an "UTF8 String" are two different things for the >> >compiler. See below. >> What is the difference? The link does not tell me. >An "UTF8 String" is a String encoded in UTF-8. >String and UTF8String are two different things for the compiler. Of course String and

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-11 Thread Juha Manninen

On Fri, Dec 11, 2015 at 3:59 PM, Mattias Gaertner wrote: > The above literal requires {$codepage UTF8}. Damn right, I forgot to test that. It works. Juha

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-11 Thread Juha Manninen

On Fri, Dec 11, 2015 at 3:45 PM, Mattias Gaertner wrote: > With DisableUTF8RTL passing UTF8String to String changes encoding. Yes and a conversion between UTF-8 and a system codepage can be lossy. > So every function needs an overloaded wrapper function. > No need to

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-11 Thread Juha Manninen

On Fri, Dec 11, 2015 at 2:12 PM, Mattias Gaertner wrote: > On Fri, 11 Dec 2015 08:05:12 +0100 > Sven Barth wrote: >> Also I doubt that you can do this for var parameters... > > FPC 3.0 eats it without a hint. It does not work though.

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-11 Thread Mattias Gaertner

On Fri, 11 Dec 2015 13:29:28 +0100 "Jürgen Hestermann" wrote: >[...] > >An "UTF8String" and an "UTF8 String" are two different things for the > >compiler. See below. > > What is the difference? The link does not tell me. An "UTF8 String" is a String encoded in

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-11 Thread Juha Manninen

On Fri, Dec 11, 2015 at 4:10 PM, Mattias Gaertner wrote: > The job of the wrapper is to convert to type String without > triggering the conversion of the content. Ok, lots of ugly Pointer typecasts. Doable, yes. Juha

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-11 Thread Jürgen Hestermann

Am 2015-12-11 um 16:48 schrieb Graeme Geldenhuys: On 2015-12-11 13:23, Mattias Gaertner wrote: http://wiki.freepascal.org/Character_and_string_types I haven't seen that page yet. That is a brilliant explanation of the different string types. This really should live in the FPC Language Ref

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-11 Thread Mattias Gaertner

On Fri, 11 Dec 2015 15:20:14 +0100 "Jürgen Hestermann" wrote: >[...] > >An "UTF8 String" is a String encoded in UTF-8. >[...] > I know the type "UTF8String" but what is an "UTF8 String" (which you say > differs)? You lost me here. Maybe I can help you if you explain

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-11 Thread Juha Manninen

On Fri, Dec 11, 2015 at 6:26 PM, Jürgen Hestermann wrote: > Are there any other string types with UTF8 encoding in Free Pascal? Yes. The type "String" has a default encoding of UTF-8 when you use the new Unicode support in Lazarus. See the wiki page for technical

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-11 Thread Jürgen Hestermann

Am 2015-12-11 um 16:28 schrieb Juha Manninen: On Fri, Dec 11, 2015 at 4:20 PM, "Jürgen Hestermann" wrote: I know the type "UTF8String" but what is an "UTF8 String" (which you say differs)? "UTF8 String" is a String which has UTF-8 encoding. Our UTF-8 system changes

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-11 Thread Jürgen Hestermann

Am 2015-12-11 um 17:09 schrieb Mattias Gaertner: On Fri, 11 Dec 2015 15:20:14 +0100 "Jürgen Hestermann" wrote: [...] An "UTF8 String" is a String encoded in UTF-8. [...] I know the type "UTF8String" but what is an "UTF8 String" (which you say differs)? You lost

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-11 Thread Graeme Geldenhuys

On 2015-12-11 13:23, Mattias Gaertner wrote: > > http://wiki.freepascal.org/Character_and_string_types I haven't seen that page yet. That is a brilliant explanation of the different string types. This really should live in the FPC Language Ref document too. Thanks for sharing. ;-) Regards, -

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-11 Thread Mattias Gaertner

On Fri, 11 Dec 2015 17:29:18 +0100 Jürgen Hestermann wrote: >[...] > Be aware that this wiki is outdated (not FPC 3.0). > It says: > > Currently, the type *UTF8String* is an alias to the type *AnsiString >

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-11 Thread Mattias Gaertner

On Fri, 11 Dec 2015 17:20:40 +0100 Jürgen Hestermann wrote: >[...] > And I am wondering why these functions in LazUTF8 unit > (which only work with UTF8 strings) do not use UTF8String > as parameter. I'm sorry. This was answered several times. Probably there is some

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-11 Thread Juha Manninen

On Fri, Dec 11, 2015 at 4:20 PM, "Jürgen Hestermann" wrote: > I know the type "UTF8String" but what is an "UTF8 String" (which you say > differs)? "UTF8 String" is a String which has UTF-8 encoding. Our UTF-8 system changes the default codepage of String to UTF-8 and

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-11 Thread Sven Barth

On 11.12.2015 15:14, Juha Manninen wrote: On Fri, Dec 11, 2015 at 4:10 PM, Mattias Gaertner wrote: The job of the wrapper is to convert to type String without triggering the conversion of the content. Ok, lots of ugly Pointer typecasts. Doable, yes. Not

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-11 Thread Sven Barth

On 11.12.2015 15:20, "Jürgen Hestermann" wrote: >My link explains some differences important for Lazarus. But it does not explain the difference between an "UTF8String" and an "UTF8 String" >The encoding hassle is built into Windows. The compiler cannot overcome >it. It can only give you

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-11 Thread Sven Barth

On 11.12.2015 13:12, Mattias Gaertner wrote: On Fri, 11 Dec 2015 08:05:12 +0100 Sven Barth wrote: [...] UTF8Delete(AnsiString(Pointer(s)),1,1); While the typecast itself would probably work I strongly advise against it since you're relying on implementation

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-11 Thread Jürgen Hestermann

UTF8Delete probably takes an AnsiString (or String) as var parameter and for var parameters the static codepages have to match exactly (String has CP_ACP while Utf8String has CP_UTF8). Just please help me understand this: The LazUTF8 unit is for manipulating UTF8 strings only,

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-11 Thread Mattias Gaertner

On Fri, 11 Dec 2015 15:02:45 +0200 Juha Manninen wrote: >[...] > FYI Ondrej, having overloaded versions only for procedures taking var > parameters makes no sense even when using DisableUTF8RTL. Lossy > conversion would happen in every function using "String" type.

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-11 Thread Mattias Gaertner

On Fri, 11 Dec 2015 16:03:03 +0200 Juha Manninen wrote: >[...] > The functions then use String with system codepage. > A wrapper would trigger useless conversions and potentially lose data. > No, UTF8String really must be used in the implementation. The job of the

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-10 Thread Sven Barth

Am 10.12.2015 19:53 schrieb "Juha Manninen" : > > On Thu, Dec 10, 2015 at 8:18 PM, Ondrej Pokorny wrote: > > IMO, there should be overloaded versions for UTF8* functions that > > explicitly accept UTF8String. > > Or am I wrong? > > You are right. I

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-10 Thread Mattias Gaertner

On Thu, 10 Dec 2015 19:59:26 +0100 Sven Barth wrote: > Am 10.12.2015 19:53 schrieb "Juha Manninen" : >[...] > > A typecast can be used as a workaround now. > > UTF8String and AnsiString have the same memory layout so it should work. > >

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-10 Thread Juha Manninen

On Thu, Dec 10, 2015 at 6:49 PM, Jürgen Hestermann wrote: > How can I use UTF8Delete on an UTF8string? You can't. Please read this : http://wiki.freepascal.org/Better_Unicode_Support_in_Lazarus "String" type is UTF-8 and it works now (almost) transparently without

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-10 Thread Ondrej Pokorny

On 10.12.2015 18:41, Ondrej Pokorny wrote: Why do you insist in using UTF8Delete? Use just Delete: var xUF: UTF8String; begin Delete(xUF, 1, 1); end; Sorry, of course you want to specify position/length in real chars. Ondrej

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-10 Thread Jürgen Hestermann

Am 2015-12-10 um 18:22 schrieb Juha Manninen: On Thu, Dec 10, 2015 at 6:49 PM, Jürgen Hestermann wrote: How can I use UTF8Delete on an UTF8string? You can't. Please read this : http://wiki.freepascal.org/Better_Unicode_Support_in_Lazarus "String" type is UTF-8

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-10 Thread Juha Manninen

On Thu, Dec 10, 2015 at 7:29 PM, Jürgen Hestermann wrote: > Then why does the compiler complain when I > feed UTF8Delete with an UTF8String? As it told you. It got UTF8String but expected AnsiString. I think it would work with a typecast but that is quite useless. Just

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-10 Thread Ondrej Pokorny

On 10.12.2015 18:36, Juha Manninen wrote: On Thu, Dec 10, 2015 at 7:29 PM, Jürgen Hestermann wrote: Then why does the compiler complain when I feed UTF8Delete with an UTF8String? As it told you. It got UTF8String but expected AnsiString. I think it would work with a

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-10 Thread Ondrej Pokorny

On 10.12.2015 18:29, Jürgen Hestermann wrote: Am 2015-12-10 um 18:22 schrieb Juha Manninen: On Thu, Dec 10, 2015 at 6:49 PM, Jürgen Hestermann wrote: How can I use UTF8Delete on an UTF8string? You can't. Please read this :

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-10 Thread Juha Manninen

On Thu, Dec 10, 2015 at 8:18 PM, Ondrej Pokorny wrote: > IMO, there should be overloaded versions for UTF8* functions that > explicitly accept UTF8String. > Or am I wrong? You are right. I did not even think so far yet. A typecast can be used as a workaround now. UTF8String

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-10 Thread Sven Barth

Am 10.12.2015 23:04 schrieb "Mattias Gaertner" : > > On Thu, 10 Dec 2015 19:59:26 +0100 > Sven Barth wrote: > > > Am 10.12.2015 19:53 schrieb "Juha Manninen" : > >[...] > > > A typecast can be used as a workaround

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-10 Thread Sven Barth

Am 11.12.2015 06:52 schrieb "Jürgen Hestermann" : > > > > Am 2015-12-10 um 18:22 schrieb Juha Manninen: >> >> On Thu, Dec 10, 2015 at 6:49 PM, Jürgen Hestermann >> wrote: >>> >>> How can I use UTF8Delete on an UTF8string? >> >> You can't.

Re: [Lazarus] UTF8String and UTF8Delete

2015-12-10 Thread Martin Schreiber

On Friday 11 December 2015 08:05:12 Sven Barth wrote: > Am 10.12.2015 23:04 schrieb "Mattias Gaertner" : > > > > What about: > > > > UTF8Delete(AnsiString(Pointer(s)),1,1); > > While the typecast itself would probably work I strongly advise against it > since you're

