On Thu, Apr 08, 2004 at 08:35:21PM -0400, Michael B Allen wrote:
> This probably states the definitive position for text handling:
> 
> http://www.w3.org/TR/1999/WD-charmod-19991129/#Encodings
> 
> But even though the encoding is not clearly stated as UTF-16 there, the
> Document Object Model (DOM), which is basically the document tree inside
> a web browser and key to all HTML and XML processing, including JavaScript
> and XSLT processing, *requires* the encoding be UTF-16:
> 
> http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/core.html#ID-C74D1578

"The UTF-16 encoding was chosen because of its widespread industry practice."

Very funny; it was chosen since it's what Windows is stuck with.

That aside, "all" above is incorrect.  You don't have to use DOM to process
HTML and XML.  (Ultimately, if one *had* to use UTF-16 to process HTML, then
something along the line would be horribly wrong: a language specification
can't legitimately make any requirements about transparent implementation
details.)
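
To make the "you don't have to use DOM" point concrete, here is a minimal
sketch using Python's standard xml.sax module, which is my choice for
illustration and not something from this thread.  The names TitleCollector
and collect_titles and the sample document are made up.  It reads a UTF-8
encoded document as an event stream; no DOM tree is built and no DOM string
type is involved anywhere.

```python
import xml.sax


class TitleCollector(xml.sax.ContentHandler):
    """Collect the text of every <title> element from a SAX event stream."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._buf = []

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self._buf = []

    def characters(self, content):
        # The parser may deliver text in several chunks, so accumulate them.
        if self._in_title:
            self._buf.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buf))
            self._in_title = False


def collect_titles(xml_bytes):
    """Parse UTF-8 XML bytes with SAX; no document tree is constructed."""
    handler = TitleCollector()
    xml.sax.parseString(xml_bytes, handler)
    return handler.titles


# Hypothetical sample input, encoded as UTF-8 (not UTF-16).
doc = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    "<list><title>caf\u00e9</title><title>\u65e5\u672c\u8a9e</title></list>"
).encode("utf-8")

titles = collect_titles(doc)
```

The handler only ever sees element names and character data as they stream
past, so how the text is encoded on disk or in memory stays an
implementation detail of the parser.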

-- 
Glenn Maynard

--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
