Buck Golemon wrote:

Is it incorrect to say that 0x81 is a non-semantic byte in cp1252, and
to map it to the equally-non-semantic U+81 ?

This would allow systems that follow the html5 standard and use cp1252
in place of latin1 to continue to be binary-faithful and reversible.

This isn't quite as black-and-white as the question about Latin-1. If you are targeting HTML5, you are probably safe in treating an incoming 0x81 (for example) as either U+0081 or U+FFFD, or in throwing some kind of error. HTML5 insists that you treat ISO 8859-1 as if it were CP1252, so it no longer matters what the byte means in 8859-1.
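
To make the three options concrete, here is a minimal Python sketch. It assumes Python 3's built-in "cp1252" codec, which leaves 0x81, 0x8D, 0x8F, 0x90 and 0x9D unmapped; the names decode_cp1252, encode_cp1252 and the "undefined" parameter are made up for illustration and are not taken from HTML5 or from any library.

```python
# Bytes that Python's built-in cp1252 codec leaves unmapped.
UNDEFINED_CP1252 = frozenset({0x81, 0x8D, 0x8F, 0x90, 0x9D})


def decode_cp1252(data: bytes, undefined: str = "c1") -> str:
    """Decode CP1252 bytes with a chosen policy for unmapped bytes.

    undefined="strict"  -> raise UnicodeDecodeError
    undefined="replace" -> substitute U+FFFD
    undefined="c1"      -> map the byte to the same-numbered C1 control
                           (0x81 becomes U+0081), which is reversible
    """
    if undefined == "strict":
        return data.decode("cp1252")
    if undefined == "replace":
        return data.decode("cp1252", errors="replace")
    if undefined != "c1":
        raise ValueError(f"unknown policy: {undefined!r}")
    chars = []
    for byte in data:
        if byte in UNDEFINED_CP1252:
            chars.append(chr(byte))  # pass through as a C1 control
        else:
            chars.append(bytes([byte]).decode("cp1252"))
    return "".join(chars)


def encode_cp1252(text: str) -> bytes:
    """Inverse of decode_cp1252(..., undefined="c1")."""
    out = bytearray()
    for ch in text:
        if ord(ch) in UNDEFINED_CP1252:
            out.append(ord(ch))  # C1 control back to the raw byte
        else:
            out += ch.encode("cp1252")
    return bytes(out)
```

Only the "c1" policy gives the binary-faithful round trip asked about in the question: encode_cp1252(decode_cp1252(data)) returns the original bytes for any input. The "replace" policy loses information, because all five unmapped bytes collapse to U+FFFD.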

--
Doug Ewell | Thornton, Colorado, USA
http://www.ewellic.org | @DougEwell