On Thu, Apr 08, 2004 at 08:35:21PM -0400, Michael B Allen wrote:
> This probably states the definitive position for text handling:
>
> http://www.w3.org/TR/1999/WD-charmod-19991129/#Encodings
>
> But even though the encoding is not clearly stated as UTF-16, the Document
> Object Model (DOM), which is basically the document tree inside a web
> browser and key to all HTML and XML processing including JavaScript and
> XSLT processing, *requires* the encoding be UTF-16:
>
> http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/core.html#ID-C74D1578

"The UTF-16 encoding was chosen because of its widespread industry practice."

Very funny; it was chosen since it's what Windows is stuck with.

That aside, "all" above is incorrect. You don't have to use DOM to process
HTML and XML. (Ultimately, if one *had* to use UTF-16 to process HTML, then
something along the line is horribly wrong: a language specification can't
legitimately make any requirements about transparent implementation details.)

-- 
Glenn Maynard

--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
