Larry has been consistently using
0xAB op 0xBB
in his messages to represent a (French quote) hyperop.
Those bytes correspond to the Unicode characters U+00AB and U+00BB
as encoded in ISO-8859-1, even though my mailserver or his
mailer insists on labelling those messages as UTF-8.
However, the UTF-8 encoding of those Unicode characters
is actually two bytes each:
0xC2AB op 0xC2BB
As far as I understand it, UTF-8 only allows single-byte
representations of characters that fall in the
0x00 to 0x7F range.
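
For anyone who wants to check the byte values themselves, here is a
minimal sketch. It is written in Python rather than Perl purely for
illustration, and the helper name encoded_hex is made up for this
example. It encodes the two guillemets in both encodings and prints
the resulting bytes in hex.

```python
# Compare how U+00AB and U+00BB are encoded in ISO-8859-1 and in UTF-8.
# encoded_hex is a hypothetical helper, not part of any library.

def encoded_hex(ch, encoding):
    """Return the bytes of ch in the given encoding as uppercase hex."""
    return ch.encode(encoding).hex().upper()

if __name__ == "__main__":
    for ch in ("\u00ab", "\u00bb"):
        print(
            "U+%04X" % ord(ch),
            "iso-8859-1:", encoded_hex(ch, "iso-8859-1"),
            "utf-8:", encoded_hex(ch, "utf-8"),
        )
```

Each character should come out as a single byte in ISO-8859-1 and as
two bytes, the first being C2, in UTF-8.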
So the question is, if I'm writing a program and I actually
want to use one of these ops, do I put
0xAB op 0xBB
or
0xC2AB op 0xC2BB
?
-- Matt,
who'd never thought he'd have to do hex dumps to debug
his Perl programs ;)
--
Matthew Zimmerman
Interdisciplinary Biophysics, University of Virginia
http://www.people.virginia.edu/~mdz4c/