You're right, Ian. This changes my understanding of when the numeric character references must be replaced during conversion of the multi-byte CJK character sets.
And it makes another thing I posted this morning incorrect. I still can't understand why I can't see the numerical references in Henry's page. Is it a setting of some sort?

On Fri, 16 Aug 2002 11:37:32 +0100 Ian Collier <[EMAIL PROTECTED]> wrote
>On Fri, Aug 16, 2002 at 12:36:23AM -0700, Steve White wrote:
>> The SGML '&Alpha;' is not synonymous with the numeric character
>> reference '&#913;', Henry. Generally, the numeric character references
>> such as '&#913;' refer to a character **in the document's character set**.
>
>Sorry, this is not so. HTML 4.01 is quite explicit that:
>
>  The syntax "&#D;", where D is a decimal number, refers to the
>  ISO 10646 decimal character number D.
>
>[http://www.w3.org/TR/html4/charset.html#entities]
>
>Please note that the "document character set" in HTML 4.01 is ISO 10646
>(defined in section 5.1 of the above document). This is distinct from
>each individual document's character encoding (which, to confuse
>matters, is specified using the word "charset" in the Content-Type
>header).
>
>imc
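
A minimal sketch of the rule Ian quotes, in Python with only the standard library. It is an illustration, not how Lynx implements it: the helper name `resolve` and the sample page bytes are made up for the example, and `html.unescape` follows HTML5 rules rather than HTML 4.01, though the two agree for this reference. The point it shows is that "&#913;" names ISO 10646 character number 913 whatever charset the page is labelled with, so the reference has to be expanded after the bytes are decoded, not looked up in the page's own encoding.

```python
import html


def resolve(page_bytes: bytes, charset: str) -> str:
    """Decode the raw bytes with the declared charset, then expand references.

    Hypothetical helper for illustration only.
    """
    return html.unescape(page_bytes.decode(charset))


# The reference itself is plain ASCII, so the same bytes are valid
# under every charset label tried here.
page = b"&#913; &Alpha;"

results = {
    cs: resolve(page, cs)
    for cs in ("iso-8859-1", "shift_jis", "euc_jp", "utf-8")
}

for cs, text in results.items():
    # Print the code points so the result does not depend on the terminal font.
    print(cs, [hex(ord(c)) for c in text])

# Only when the character is written back out does the encoding matter.
alpha = html.unescape("&#913;")
utf8_bytes = alpha.encode("utf-8")
```

Every charset label gives the same code point for both the numeric and the named reference. The charset only decides which bytes represent that character once it is re-encoded for output.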
