On Tue, 28 Feb 2006 09:57:09 +0200 "Panagiotis Sidiropoulos" <[EMAIL PROTECTED]> wrote:
> >so if there is something wrong with the sample I thought it should
> >be gtk2 and the only problem I found was the position returned
> >mismatched visually the substring
>
> I tried to find a relation between the results but there is no pattern
> of any kind. For example, the first character gives 1, the second 3 and
> the 21st gives 41. The visual mismatch is the problem: I need to
> rearrange characters for indexing reasons and can't trace which
> character is which in the conversion table.

Jesus is right. UTF-8 is a multi-byte character encoding: each character
occupies between 1 and 4 bytes, so System.Pos returns a byte position,
not a character position.
To get the character position, count the characters before the match:

  BytePos:=System.Pos(search,text);
  if BytePos>0 then
    // characters before the match, plus 1 for a 1-based position
    CharPos:=UTF8Length(PChar(text),BytePos-1)+1
  else
    CharPos:=0; // not found


Mattias

>
> I will try to update Lazarus and FPC, just to be sure.
>
> Panagiotis
>
> -----Original Message-----
> From: Jesus Reyes [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, February 28, 2006 8:30 AM
> To: [email protected]
> Subject: Re: [lazarus] String functions on non latin text
>
>
> ----- Original Message -----
> From: "Mattias Gaertner" <[EMAIL PROTECTED]>
> To: <[email protected]>
> Sent: Monday, February 27, 2006 2:45 PM
> Subject: Re: [lazarus] String functions on non latin text
>
>
> > On Mon, 27 Feb 2006 13:41:13 -0600 (CST)
> > Jesus Reyes <[EMAIL PROTECTED]> wrote:
> >
> > >
> > > --- Panagiotis Sidiropoulos <[EMAIL PROTECTED]> wrote:
> > >
> > > > Please download sample project at:
> > > > - www.magentadb.gr/ftp/pos-sample.zip
> > > >
> > > > Panagiotis
> > > >
> > >
> > > result := Pos(UTF8Decode(SubStr), UTF8Decode(Str));
> > >
> > > seems to work, I think Pos(UTF8String,UTF8String) is yet to be
> > > implemented.
> >
> > It does not need to be implemented. One nice feature of UTF8 is that
> > you can find the start of a UTF8 character without parsing the
> > whole string. A simple substring search works with UTF8 and is
> > unambiguous.
>
> I guess it would depend on what the pos function's return value is
> needed for. If some feedback should be given to the user about the
> position where the substring matched, then the current pos functions
> do not return a visually right position. I mean, counting characters
> from left to right, the correct position should be 21, not 41.
>
> If the value is to be used with other string functions then the return
> value is right.
>
> If the function is ever implemented I think it should be for something
> like pos(UTFString,UTFString), where UTFString should represent any UTF
> encoding in use. Unlikely? Maybe :D
>
> > On the other hand: UTF8Decode will fail on some character sets, not
> > fitting into 2byte characters.
>
> It seems to have support for at least 3-byte chars. I didn't test, though.
>
> > > My guess, why a simple Pos does not work for Panagiotis, is either
> > > an FPC bug or a gtk1 bug with greek characters.
> >
>
> I compiled the test first for gtk1 and the results looked right to me,
> so if there is something wrong with the sample I thought it should be
> gtk2, and the only problem I found was that the position returned
> visually mismatched the substring.
>
> > Mattias
> >
>
> Jesus Reyes A.
_________________________________________________________________
To unsubscribe: mail [EMAIL PROTECTED] with
"unsubscribe" as the Subject
archives at http://www.lazarus.freepascal.org/mailarchives
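
The byte-versus-character point made in the reply above can be tried without the LCL. What follows is only a minimal sketch under stated assumptions, not the LCL implementation: CountUtf8Chars and Utf8CharPos are hypothetical helper names invented for this illustration, and both strings are assumed to contain valid UTF-8. It counts characters by skipping continuation bytes (those of the form 10xxxxxx), which is what makes it possible to find character starts without decoding the whole string.

```pascal
program Utf8PosSketch;

{$mode objfpc}{$H+}

// Hypothetical helper (not part of FPC or the LCL): counts the UTF-8
// characters in the first ByteCount bytes of S. Every byte that is not
// a continuation byte (10xxxxxx) starts a new character.
function CountUtf8Chars(const S: AnsiString; ByteCount: Integer): Integer;
var
  i: Integer;
begin
  Result := 0;
  if ByteCount > Length(S) then
    ByteCount := Length(S);
  for i := 1 to ByteCount do
    if (Ord(S[i]) and $C0) <> $80 then
      Inc(Result);
end;

// Hypothetical helper: 1-based character position of SubStr in Str,
// or 0 if SubStr does not occur. Assumes both hold valid UTF-8.
function Utf8CharPos(const SubStr, Str: AnsiString): Integer;
var
  BytePos: Integer;
begin
  // Plain byte-wise search; unambiguous for valid UTF-8 input.
  BytePos := Pos(SubStr, Str);
  if BytePos > 0 then
    // Characters before the match, plus 1 for a 1-based position.
    Result := CountUtf8Chars(Str, BytePos - 1) + 1
  else
    Result := 0;
end;

var
  Haystack, Needle: AnsiString;
begin
  // Greek alpha, beta, gamma written as raw UTF-8 bytes (2 bytes each).
  Haystack := #$CE#$B1#$CE#$B2#$CE#$B3;
  // Greek gamma.
  Needle := #$CE#$B3;
  WriteLn('byte position: ', Pos(Needle, Haystack));
  WriteLn('char position: ', Utf8CharPos(Needle, Haystack));
end.
```

For this three-letter Greek string the byte position of the third letter is 5 and the character position is 3, the same 2-bytes-per-character relation as the 41 versus 21 reported in the thread. Use the byte position when the value feeds other string functions such as Copy or Delete, and the character position only when reporting a position to the user.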
