[whatwg] UTF-16 encoding default

Kartikaya Gupta Tue, 23 Jun 2009 18:42:34 -0700

There's a page (specifically, 
http://www.microsoft.com/windowsmobile/mobile/en-us/totalaccess/software/software/eula-sw-netflix.mspx)
that is served with a Content-Type header of "text/html; charset=utf-16" 
and has no BOM. The references I've seen (RFC 2781, as well as 
http://unicode.org/faq/utf_bom.html#gen7) say that this means the content 
should be assumed to be UTF-16BE. The page, however, is actually in UTF-16LE.
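
To make the mismatch concrete, here is a minimal Python sketch. The sample 
string is made up for illustration and is not taken from the page in 
question. It encodes a short piece of markup as BOM-less UTF-16LE and then 
decodes the same bytes both ways.

```python
# Hypothetical sample markup; not the content of the actual page.
text = "<html>"

# Encode as little-endian UTF-16 with no BOM, as the page appears to be.
raw = text.encode("utf-16-le")

# Decode the way RFC 2781 suggests for BOM-less "utf-16" (big-endian).
as_be = raw.decode("utf-16-be")

# Decode with the byte order the page actually uses.
as_le = raw.decode("utf-16-le")

print(repr(as_be))  # unrelated code points, e.g. U+3C00 for "<"
print(repr(as_le))  # the original markup
```

Each ASCII character ends up with its two bytes swapped, so "<" (0x003C) is 
read as U+3C00 and no tag is ever recognized.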


All browsers seem to do some sort of unspecified magic and figure out that the 
page is in LE. I was wondering if that magic could be described and added to 
the HTML5 spec so that it covers rendering the above page as expected. 
According to the draft spec as it stands, I believe that page should be 
rendered as garbage.
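
I don't know what the browsers actually do, but one plausible heuristic is 
to look at where the zero bytes fall, since markup is mostly ASCII. The 
sketch below is only a guess at that kind of sniffing; the function name 
and the sample size are my own assumptions, not anything from a spec or a 
browser.

```python
def guess_utf16_byte_order(data: bytes, sample_size: int = 1024) -> str:
    """Guess the byte order of UTF-16 data (hypothetical heuristic)."""
    # An explicit BOM always wins.
    if data.startswith(b"\xff\xfe"):
        return "utf-16-le"
    if data.startswith(b"\xfe\xff"):
        return "utf-16-be"

    # For ASCII-heavy text, the high byte of each code unit is 0x00.
    # In LE that zero sits at odd offsets, in BE at even offsets.
    sample = data[:sample_size]
    even_nuls = sample[0::2].count(0)
    odd_nuls = sample[1::2].count(0)

    if odd_nuls > even_nuls:
        return "utf-16-le"
    # Ties and everything else fall back to the RFC 2781 default.
    return "utf-16-be"


# Usage with made-up markup, encoded without a BOM.
print(guess_utf16_byte_order("<html><body>".encode("utf-16-le")))
print(guess_utf16_byte_order("<html><body>".encode("utf-16-be")))
```

This would obviously misfire on text with few ASCII characters, which is 
part of why it would be nice to have the real behaviour written down.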

Cheers,
kats

PS - the page also has a meta tag that says the charset is iso-8859-1. *sigh*
