Re: [whatwg] Video with MIME type application/octet-stream

Boris Zbarsky Tue, 07 Sep 2010 06:28:00 -0700

On 9/7/10 9:16 AM, Philip Jägenstedt wrote:

UTF-8, Big5 and GBK are all (as far as I know) ASCII supersets. Do
real-world text documents include \0 bytes?


Yes.  Real-world text documents include all sorts of gunk.  Just rarely.

As long as "indicates an encoding" doesn't include UTF-8 or ISO-8859-1
(thanks, Apache!), that should be reasonable, I think.


Are you saying that Apache has, at various times, set the default
character encoding to UTF-8 or ISO-8859-1?

Yes, precisely.  Though the UTF-8 stuff was Linux distros, I think, not
Apache itself (in that Apache just sent the thing passed to
AddDefaultCharset and they changed the value of that from ISO-8859-1 to
UTF-8 in their distro packages).  Here's the relevant comment from the
Gecko source where we do our text-or-binary sniffing for toplevel
contexts:


 Make sure to do a case-sensitive exact match comparison here.  Apache
 1.x just sends text/plain for "unknown", while Apache 2.x sends
 text/plain with a ISO-8859-1 charset.  Debian's Apache version, just to
 be different, sends text/plain with iso-8859-1 charset.  For extra fun,
 FC7, RHEL4, and Ubuntu Feisty send charset=UTF-8.  Don't do general
 case-insensitive comparison, since we really want to apply this crap as
 rarely as we can.
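
To make the "exact match" point concrete, here is a rough sketch of that
kind of check.  It is not the Gecko code: the function name and the
set of header strings are assumptions inferred from the comment above,
and the real header values may differ in spacing or other details.

```python
# Hypothetical list of raw Content-Type header values that look like a
# server default rather than something the author chose. The strings are
# inferred from the comment above, not copied from Gecko.
SNIFFABLE_CONTENT_TYPE_HEADERS = frozenset({
    "text/plain",                      # Apache 1.x
    "text/plain; charset=ISO-8859-1",  # Apache 2.x
    "text/plain; charset=iso-8859-1",  # Debian's Apache
    "text/plain; charset=UTF-8",       # FC7, RHEL4, Ubuntu Feisty
})


def should_sniff_text_or_binary(content_type_header: str) -> bool:
    """Return True only when the raw header is byte-for-byte one of the
    known defaults. No lowercasing, trimming, or parameter parsing is
    done on purpose, so the sniffing applies as rarely as possible."""
    return content_type_header in SNIFFABLE_CONTENT_TYPE_HEADERS
```

With this, "text/plain; charset=UTF-8" would be sniffed but
"text/plain; charset=utf-8" or "Text/Plain" would not, on the theory
that anything deviating from the known defaults was set deliberately.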

I was hoping that no encoding parameter at all would be sent :/

Heh.  I've long since given up all hope of reason on this stuff; I just
try to keep it as sane and predictable and simple as possible. :(


-Boris
