Re: [lazarus] TUTF8Char declaration

Marc Weustink Thu, 24 May 2007 13:52:37 -0700

Mattias Gaertner wrote:

On Thu, 24 May 2007 12:40:39 +0200
"Felipe Monteiro de Carvalho" <[EMAIL PROTECTED]> wrote:

Hi,

TUTF8Char is declared as:

  TUTF8Char = string[8];

Why not string[4]? Shouldn't all utf8 chars fit in 4 bytes?

Just curious, I saw this while implementing the fpgui utf-8 char reception
system and wondered whether I need this or not...


At that point it was not clear whether 4 bytes are really enough, because
some RFCs talk about 6 bytes and some about combined chars.
IMO 4 bytes are enough, and for combined chars we should define
another type (string).
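
For reference, the 4-versus-6 question comes from two revisions of the UTF-8 definition: RFC 2279 allowed sequences of up to 6 bytes, while RFC 3629 restricts UTF-8 to the Unicode range and therefore to at most 4 bytes per code point. A minimal sketch of the RFC 3629 rule follows; the function name UTF8SeqLenFromLead is hypothetical and not part of the LCL.

```pascal
program Utf8LeadByte;
{$mode objfpc}{$H+}

// Hypothetical helper (not an LCL routine): number of bytes in the UTF-8
// sequence introduced by Lead, following RFC 3629. Returns 0 when Lead
// cannot start a sequence (continuation byte, overlong lead $C0/$C1,
// or one of the old 5/6-byte leads from RFC 2279).
function UTF8SeqLenFromLead(Lead: Byte): Integer;
begin
  case Lead of
    $00..$7F: Result := 1;
    $C2..$DF: Result := 2;
    $E0..$EF: Result := 3;
    $F0..$F4: Result := 4;
  else
    Result := 0;
  end;
end;

begin
  WriteLn(UTF8SeqLenFromLead($41)); // 'A', plain ASCII
  WriteLn(UTF8SeqLenFromLead($C3)); // lead byte of U+00E9 (C3 A9)
  WriteLn(UTF8SeqLenFromLead($E2)); // lead byte of U+20AC (E2 82 AC)
  WriteLn(UTF8SeqLenFromLead($F0)); // lead byte of U+1D11E (F0 9D 84 9E)
  WriteLn(UTF8SeqLenFromLead($F8)); // 5-byte lead under RFC 2279, rejected here
end.
```

Under this rule a single code point never needs more than 4 bytes; anything longer is a sequence of several code points.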


But it is still a char in the end?

IIRC, you cannot tell in advance whether one char in UTF-8 is a composite char. It will fit in 8 bytes.
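
To make the size trade-off concrete, here is a small sketch comparing the declared string[8] with a string[4] alternative. TChar4 and the sample sequence are assumptions chosen for illustration; only the TUTF8Char declaration is taken from this thread.

```pascal
program Utf8CharCapacity;
{$mode objfpc}{$H+}

type
  TUTF8Char = string[8]; // declaration quoted in this thread
  TChar4    = string[4]; // hypothetical alternative, for comparison only

var
  Seq: AnsiString;
  C8: TUTF8Char;
  C4: TChar4;
begin
  // 'e' + U+0301 (CC 81) + U+0323 (CC A3):
  // one visible character, three code points, five bytes.
  Seq := 'e'#$CC#$81#$CC#$A3;

  C8 := Seq; // fits: 5 <= 8
  C4 := Seq; // silently truncated to 4 bytes, cutting U+0323 in half

  // A short string stores one length byte plus its payload.
  WriteLn(SizeOf(TUTF8Char), ' ', SizeOf(TChar4)); // 9 5
  WriteLn(Length(Seq), ' ', Length(C8), ' ', Length(C4)); // 5 5 4
end.
```

In this sketch string[4] holds any single code point but not this combined sequence, while string[8] holds it with room to spare. A base character with more combining marks could still exceed 8 bytes, so neither size is a hard guarantee for combined chars.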


Marc
