On Thu, 13 Aug 2015 13:23:28 +0200
Jürgen Hestermann <[email protected]> wrote:

>[...]
>  >> if ((ord(p^) and %11110000) = %11100000) then
>  >>     begin  // could be 3 byte character
>  >>     if ((ord(p[1]) and %11000000) = %10000000) and
>  >>        ((ord(p[2]) and %11000000) = %10000000) then ...
>  >>     ...
>  >> ------------
>  >> In the above (current) code 3 bytes are accessed which may step behind
>  >> the zero byte.
>  > The "and" operator stops evaluating if left side is already false.
> 
> Only if you have a valid UTF-8 string.

I can't follow you here. If the string is valid UTF-8 then p[1] and p[2]
are not zero.

> I thought we are talking about *invalid* UTF-8 strings where
> it can happen that p[2] is accessed although it is not part
> of the string.

I can't follow you here. If p[2] is not part of the string, then p[1]
must be #0. In that case the test on p[1] fails, and the "and" operator
stops there without ever reading p[2].
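
A minimal sketch of the point, assuming short-circuit boolean evaluation
({$B-}, the Free Pascal default). IsThreeByteSeq and the two test buffers
are hypothetical names made up for this illustration; they are not taken
from the Lazarus sources.

```pascal
program ShortCircuitDemo;
{$mode objfpc}{$H+}
{$B-}  // short-circuit boolean evaluation (assumed; this is the FPC default)

// Hypothetical helper: true if p points at a 3-byte lead byte that is
// followed by two continuation bytes.
function IsThreeByteSeq(p: PChar): Boolean;
begin
  Result := ((Ord(p^)   and %11110000) = %11100000)
        and ((Ord(p[1]) and %11000000) = %10000000)  // fails when p[1] = #0 ...
        and ((Ord(p[2]) and %11000000) = %10000000); // ... so p[2] is not read
end;

const
  // Invalid UTF-8: a lead byte directly followed by the terminating zero.
  // p[2] would lie outside this buffer.
  Truncated: array[0..1] of Char = (#$E2, #0);
  // Valid UTF-8: U+20AC (euro sign) plus the terminating zero.
  Complete: array[0..3] of Char = (#$E2, #$82, #$AC, #0);

begin
  WriteLn(IsThreeByteSeq(@Truncated[0]));  // prints FALSE
  WriteLn(IsThreeByteSeq(@Complete[0]));   // prints TRUE
end.
```

With {$B+} (complete boolean evaluation) all three operands would be
evaluated, and the read of p[2] in the truncated case would go past the
zero byte.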

Mattias

--
_______________________________________________
Lazarus mailing list
[email protected]
http://lists.lazarus.freepascal.org/mailman/listinfo/lazarus
