A UTF-x converter must handle non-characters like U+FFFE, U+FDD0, etc.

Unicode 3.0, chapter 3.8 (Transformations), clause D29 defines this, and the text there and following it spells out that non-characters and similar code points must be converted as well. The change made since 3.0 affects only single-surrogate code points. Non-characters should not be exchanged across system boundaries, but a converter does not necessarily constitute such a boundary.
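To make the distinction concrete, here is a minimal sketch of a UTF-8 encoder in Python. It is not taken from any particular converter; the names `is_noncharacter` and `encode_utf8` are made up for illustration. It assumes the post-3.0 rule that a single surrogate code point is an error, and it passes non-characters through like any other code point.

```python
def is_noncharacter(code_point: int) -> bool:
    """Return True for the 66 non-characters: U+FDD0..U+FDEF and
    the last two code points of every plane (U+xxFFFE, U+xxFFFF)."""
    return 0xFDD0 <= code_point <= 0xFDEF or (code_point & 0xFFFE) == 0xFFFE


def encode_utf8(code_point: int) -> bytes:
    """Encode one code point as UTF-8 (hypothetical helper).

    Non-characters are converted like any other code point.
    Single surrogates are rejected, per the change made after 3.0.
    """
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("code point out of range")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("single surrogate code point cannot be converted")
    if code_point < 0x80:
        return bytes([code_point])
    if code_point < 0x800:
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])
```

With this sketch, `encode_utf8(0xFFFE)` and `encode_utf8(0xFDD0)` both return three-byte sequences, while `encode_utf8(0xD800)` raises `ValueError`. Whether non-characters should then be filtered is a decision for the code at the system boundary, which can use something like `is_noncharacter` for that.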

markus

--
Opinions expressed here may not reflect my company's positions unless otherwise noted.



