Re: pre-HTML5 and the BOM

Leif Halvard Silli Tue, 17 Jul 2012 01:36:10 -0700

Philippe Verdy, Tue, 17 Jul 2012 03:40:37 +0200:
> 2012/7/16 Leif Halvard Silli:


HTML5:

> (ASCII is considered now an alias of Windows-1252, also for
> compatibility reasons, even if strict US-ASCII resources could be
> interpreted without changes as UTF-8)

I agree that HTML5 ought to ask UAs to try more aggressively to
detect UTF-8. An argument was also put forward on the WHATWG mailing list
earlier this year (or at the end of the previous year) that a page with
only strict ASCII characters inside could still contain character
entities/references for characters outside ASCII. For instance, early on
in 'the Web', some appeared to think that all non-ASCII characters had to
be represented as entities.
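
To make that point concrete, here is a minimal Python sketch. The sample markup and the helper name expand_references are made up for illustration; html.unescape is from the standard library. It shows that a byte stream containing only ASCII decodes identically as UTF-8 and as Windows-1252, yet still yields non-ASCII characters once the references are expanded.

```python
import html

# Hypothetical sample page: every byte is plain ASCII, but the
# character references stand for non-ASCII letters.
PAGE = b"<p>Bl&aring;b&aelig;rsyltet&#248;y</p>"


def expand_references(page: bytes) -> str:
    """Decode an ASCII-only page and expand its character references."""
    # For pure ASCII bytes the choice of decoder makes no difference.
    as_utf8 = page.decode("utf-8")
    as_cp1252 = page.decode("windows-1252")
    assert as_utf8 == as_cp1252
    return html.unescape(as_utf8)


text = expand_references(PAGE)
print(text)            # the parsed text now contains non-ASCII letters
print(text.isascii())  # False
```

So an ASCII-only byte stream says nothing about whether the document, after parsing, stays inside ASCII.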

> and require explicit encoding
> (sniffing no longer works for something else as UTF-8 for its leading
> BOM interpreted as a data signature and not as a character)

If that were true, then Firefox's optional (for most locales) character
encoding detector would not be compatible with HTML5, and Chrome would
also violate HTML5. I do not think that HTML5 rules out detection of
encodings that HTML5 permits or requires UAs to support. However, the
encoding sniffing algorithm specifies at which stage of the sniffing
process the 'try-to-guess' step should happen.
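
As a rough illustration of what "stages" means here, the following Python sketch runs a BOM check first, then an explicitly declared encoding, and only then a guess. It is a simplification under my own assumptions, not the algorithm as HTML5 words it: the function name sniff_encoding, its parameters, the exact ordering of the stages, and the Windows-1252 fallback are all hypothetical, and the "guess" is nothing more than a strict UTF-8 validity check.

```python
import codecs
from typing import Optional, Tuple


def sniff_encoding(
    data: bytes,
    declared: Optional[str] = None,
    locale_default: str = "windows-1252",
) -> Tuple[str, str]:
    """Return (encoding, stage) for a byte stream. Hypothetical sketch."""
    # Stage 1: a leading BOM is treated as a signature, not as a character.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8", "bom"
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le", "bom"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be", "bom"

    # Stage 2: an explicit declaration (HTTP header or meta element),
    # assumed here to have been extracted by the caller.
    if declared:
        return declared.lower(), "declared"

    # Stage 3: the 'try-to-guess' step. Non-ASCII bytes that form valid
    # UTF-8 are taken as UTF-8; anything else falls through.
    if any(byte >= 0x80 for byte in data):
        try:
            data.decode("utf-8")
            return "utf-8", "guess"
        except UnicodeDecodeError:
            pass

    # Stage 4: locale-dependent default (an assumption of this sketch).
    return locale_default, "default"


print(sniff_encoding(codecs.BOM_UTF8 + b"hello"))
print(sniff_encoding("bl\u00e5b\u00e6r".encode("utf-8")))
print(sniff_encoding("bl\u00e5b\u00e6r".encode("windows-1252")))
```

The point is only that guessing can coexist with the BOM and with explicit declarations, as long as it runs at a defined stage.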
-- 
Leif Halvard Silli
