[whatwg] Content type sniffing

Boris Zbarsky Sun, 11 Jan 2009 18:42:11 -0800

I just noticed that section 2.7.1 of HTML5 says:

  Extensions must not be used for determining resource types
  for resources fetched over HTTP.

While I understand the reasons for this, there are certainly cases where this will break sites (basically those using HTTP 0.9, or later HTTP versions but not sending a content-type). In particular, the HTML sniffing in the algorithm is very limited and wouldn't sniff this document:


  <body>Some text</body>

as HTML.
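
A minimal sketch of the concern, in Python. The prefix list and the name `sniff_is_html` are assumptions made for illustration and are not the exact patterns from the spec; the point is only that a check limited to a few leading tags will miss a document that starts with <body>.

```python
# Hypothetical prefix list; the spec's actual patterns may differ.
HTML_PREFIXES = (b"<!doctype html", b"<html", b"<head", b"<script")

def sniff_is_html(data: bytes) -> bool:
    """Return True if the leading bytes match one of the assumed HTML prefixes."""
    # Skip leading whitespace, then compare case-insensitively.
    head = data.lstrip(b" \t\r\n\x0c")[:32].lower()
    return head.startswith(HTML_PREFIXES)
```

Under these assumptions, `sniff_is_html(b"<body>Some text</body>")` is false, so the document above would not be treated as HTML.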

Now this use case (no content-type at all) was pretty common when the unknown type sniffer in Gecko was written, but that was years ago. Do we have any data on how common it is now?


-Boris

P.S. Of course at the moment the sniffer in Gecko is used for more than just HTTP, and it looks like we'll need separate modes for things like HTTP and things like file://. I can live with that, though. For the file:// case detection of HTML in documents with no doctype/<html>/<head> is a must.
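
One way the two modes could look, sketched in Python. This is not Gecko's implementation: the function name `looks_like_html`, the `scheme` parameter, the 512-byte window, and both tag lists are hypothetical and chosen only to show a strict check for HTTP next to a more liberal one for file://.

```python
import re

# Hypothetical tag list for the liberal (file://) mode.
_LIBERAL_TAG = re.compile(
    rb"<(?:!doctype\s+html|html|head|body|title|p|div|a|table|script)\b",
    re.IGNORECASE,
)

def looks_like_html(data: bytes, scheme: str) -> bool:
    """Strict prefix check for HTTP; liberal tag search for file://."""
    head = data[:512]  # assumed sniffing window
    if scheme == "file":
        # Liberal mode: any common tag in the window counts as HTML.
        return _LIBERAL_TAG.search(head) is not None
    # Strict mode: the document must start with one of a few prefixes.
    stripped = head.lstrip().lower()
    return stripped.startswith((b"<!doctype html", b"<html", b"<head"))
```

With these assumptions, `<body>Some text</body>` is detected as HTML when `scheme` is "file" but not when it is "http".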
