> > However, I have just about convinced myself that we don't need 
> > IsFirstByte for matching "_" for UTF8, either preceded by "%" or
> > as it should always be true. Can anyone come up with a counter
> You have to be on a first byte before you can meaningfully 
> apply NextChar, and you have to use NextChar or else you 
> don't count characters correctly (eg "__" must match 2 chars 
> not 2 bytes).

Well, for UTF-8, NextChar could advance to the next character even if
the current byte position is in the middle of a multibyte character: it
would simply skip over all continuation bytes (10xxxxxx).

(This assumes UTF-16 surrogate pairs are not encoded as 2 x 3 bytes,
which is not valid UTF-8 anyway.)
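
To make the idea concrete, here is a rough sketch of such a NextChar in
C. It is not the existing macro; the names utf8_next_char,
is_utf8_continuation and utf8_count_chars are made up for illustration,
and it assumes the input is valid UTF-8.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: true if b is a UTF-8 continuation byte (10xxxxxx). */
static int
is_utf8_continuation(unsigned char b)
{
    return (b & 0xC0) == 0x80;
}

/*
 * Hypothetical NextChar for UTF-8 (assumes valid input).
 * Steps off the current byte, then skips any continuation bytes, so it
 * lands on the next first byte even when it starts in the middle of a
 * multibyte character.
 */
static void
utf8_next_char(const unsigned char **p, size_t *len)
{
    assert(p != NULL && len != NULL);

    if (*len == 0)
        return;

    /* Step off the current byte, whatever kind it is. */
    (*p)++;
    (*len)--;

    /* Skip the remaining 10xxxxxx bytes of the current character. */
    while (*len > 0 && is_utf8_continuation(**p))
    {
        (*p)++;
        (*len)--;
    }
}

/* Count characters (not bytes) by repeatedly applying utf8_next_char. */
static size_t
utf8_count_chars(const unsigned char *p, size_t len)
{
    size_t      n = 0;

    while (len > 0)
    {
        utf8_next_char(&p, &len);
        n++;
    }
    return n;
}
```

With this shape, "__" still consumes two characters rather than two
bytes, and a call that starts on a continuation byte simply finishes the
current character instead of miscounting.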

