I want to make sure that people are not misled by that paper. There is a note below 
that section that reads:

"Note: The italicized names are not yet registered, but are useful for reference."

and "UTF-8N" is italicized. It is not a registered name, and should not be used 
outside of a closed system.

The reason I make that notational distinction in the text is that there is a danger 
with UTF-8 currently: a BOM can be used with it, and some people do use one. Unlike the 
case of UTF-16 / UTF-16BE / UTF-16LE, there is no way to distinguish between 
implementations that allow a BOM and those that don't, so the situation is slightly 
unstable: if you find EF BB BF at the start of a UTF-8 file, you don't know whether to 
delete it or not.
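
To make the ambiguity concrete, here is a minimal Python sketch, not a recommendation 
for any particular policy. The helper name split_utf8_bom is hypothetical. It only 
reports whether the bytes EF BB BF are present and leaves the decision to the caller, 
since the data alone cannot say whether the sender meant them as a signature to strip 
or as content to keep.

```python
import codecs


def split_utf8_bom(data: bytes):
    """Return (has_bom, payload).

    Hypothetical helper: it detects a leading EF BB BF and hands back the
    bytes that follow it, but it cannot know whether removing the BOM is
    what the producer of the data intended.
    """
    if data.startswith(codecs.BOM_UTF8):  # codecs.BOM_UTF8 == b"\xef\xbb\xbf"
        return True, data[len(codecs.BOM_UTF8):]
    return False, data


sample = b"\xef\xbb\xbfabc"

# Detection is easy; interpretation is the unstable part.
has_bom, payload = split_utf8_bom(sample)
print(has_bom, payload)

# Python itself exposes both readings of the same bytes:
# the plain "utf-8" codec keeps U+FEFF as a character,
# while "utf-8-sig" treats it as a signature and drops it.
kept = sample.decode("utf-8")
stripped = sample.decode("utf-8-sig")
print(len(kept), len(stripped))
```

The two decode calls give different strings for identical input, which is the 
situation described above: the label "UTF-8" by itself does not say which reading 
applies.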

In XML, this situation does not arise, since XML specifies the exact usage of the BOM, 
but it can arise in other circumstances.

Mark

Masahiko Maedera wrote:

> I found UTF-8N in the following URL.
>
> www-4.ibm.com/software/developer/library/utfencodingforms/index.html
>
> I have understood the meaning and the format of UTF-8N.
> But I am not sure how it will be treated in the future.
>
> Does anyone have a plan to register a new charset UTF-8N,
> or any other information about it?
>
> Thank you in advance.
>
> --
>   Masahiko Maedera.
