Are you sure? €, U+20AC, is represented in UTF-8 as 0xE2, 0x82, 0xAC. For the middle byte, (byte & 0xC0) is 0x80, not 0xC0, yet one more byte of the character follows it.
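To make this concrete, here is a small sketch (in Python, just for brevity) that classifies each byte of € by its high bits. The function name utf8_byte_kind and its labels are made up for this illustration. It only looks at bit patterns, so it does not reject overlong forms or out-of-range lead bytes such as 0xC0 or 0xF5; treat it as an illustration, not a validator.

```python
def utf8_byte_kind(byte: int) -> str:
    """Classify one byte by its high bits (hypothetical helper, pattern check only)."""
    if (byte & 0x80) == 0x00:
        return "ascii"          # 0xxxxxxx: a complete one-byte character
    if (byte & 0xC0) == 0x80:
        return "continuation"   # 10xxxxxx: belongs to a character started earlier
    if (byte & 0xE0) == 0xC0:
        return "lead-2"         # 110xxxxx: starts a two-byte character
    if (byte & 0xF0) == 0xE0:
        return "lead-3"         # 1110xxxx: starts a three-byte character
    if (byte & 0xF8) == 0xF0:
        return "lead-4"         # 11110xxx: starts a four-byte character
    return "invalid"            # 11111xxx never appears in UTF-8


euro = "\u20ac".encode("utf-8")  # the three bytes 0xE2, 0x82, 0xAC

for b in euro:
    print(f"0x{b:02X}  & 0xC0 = 0x{b & 0xC0:02X}  {utf8_byte_kind(b)}")
```

Only the first byte, 0xE2, satisfies (byte & 0xC0) == 0xC0. The two bytes after it both give 0x80, so that test cannot mean "another byte follows": it is true for lead bytes only. The length of the character comes from the lead byte (here 1110xxxx, three bytes), and every following byte must match 10xxxxxx.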

On Sun, Jan 18, 2009 at 7:26 PM, Shmuel Fomberg <[email protected]> wrote:
> Hi.
>
> I've been reading a bit about utf8, and I learned that when reading a
> utf8 character, for each byte I need to check:
> (byte & 0xC0 ) == 0xC0
> means that there is another byte for this character. Otherwise, it's the
> last byte of the character.
>
> Shmuel.
> _______________________________________________
> Perl mailing list
> [email protected]
> http://perl.org.il/mailman/listinfo/perl

-- 
Gaal Yahas <[email protected]>
http://gaal.livejournal.com/
