Re: [Lazarus] GB18030 support in Lazarus

Mattias Gaertner Fri, 16 Oct 2015 07:00:50 -0700

On Fri, 16 Oct 2015 14:33:03 +0100
Martin Frb <[email protected]> wrote:


> On 16/10/2015 10:19, Tony Whyman wrote:
> >
> > In terms of "work", if I use functions such as UTF8Length and 
> > ValidUTF8String on a GB18030 string should they always work, or are 
> > there exceptions?
> 
> IIRC ... UTF8Length counts codepoints, not chars. So if the text you 
> are interested in has chars that need more than one codepoint, then this 
> is not the length in chars.

True.

> This can even happen with some western languages, but it is not likely 
> with them.

Actually decomposed characters are pretty common in western languages,
for example in file names on OS X HFS+. And afaik Chinese in Unicode
usually uses precomposed characters, does it not?
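
To illustrate the difference between codepoints and characters, here is
a small sketch in Python (not Pascal, and not the LazUTF8 code).
code_point_count is a hypothetical helper that counts UTF-8 lead bytes,
which is my assumption of what a codepoint counter boils down to; it is
not the actual UTF8Length implementation.

```python
import unicodedata


def code_point_count(utf8_bytes: bytes) -> int:
    """Count code points by counting every byte that is not a
    UTF-8 continuation byte (bit pattern 10xxxxxx)."""
    return sum(1 for b in utf8_bytes if (b & 0xC0) != 0x80)


# The same visible character "e with acute accent" in two forms.
precomposed = unicodedata.normalize("NFC", "e\u0301")  # U+00E9
decomposed = unicodedata.normalize("NFD", "\u00e9")    # U+0065 U+0301

# A common Chinese ideograph (U+4E2D); it has no canonical decomposition.
han = "\u4e2d"
han_nfd = unicodedata.normalize("NFD", han)

for label, text in [("precomposed", precomposed),
                    ("decomposed", decomposed),
                    ("han", han),
                    ("han after NFD", han_nfd)]:
    data = text.encode("utf-8")
    print(label, len(data), "bytes,", code_point_count(data), "code points")
```

Both forms of the accented letter are one character to the reader, but
the decomposed form counts as two codepoints, while the ideograph stays
at one codepoint even after NFD normalization.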

 
> The same is for char accessing function (NextUtf8CharByteLen or 
> similar). They only get codepoints.

Mattias

--
_______________________________________________
Lazarus mailing list
[email protected]
http://lists.lazarus.freepascal.org/mailman/listinfo/lazarus