I mean, that's how the code works, so it must be possible. :)

Adam
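To illustrate the point under debate: some content can be classified by comparing raw bytes against known signatures, without first choosing a charset. What follows is a minimal sketch under stated assumptions, not the algorithm from the draft or from any browser. The function name `sniff_bytes` and the small signature table are hypothetical and exist only for this example.

```python
from typing import Optional

# Illustrative signature table (an assumption for this sketch, not the draft's table).
# Each entry: (byte prefix, guessed media type, skip leading whitespace first?)
SIGNATURES = [
    (b"%PDF-", "application/pdf", False),
    (b"\x89PNG\r\n\x1a\n", "image/png", False),
    (b"GIF89a", "image/gif", False),
    (b"<?xml", "text/xml", True),
]

ASCII_WHITESPACE = b"\t\n\x0c\r "


def sniff_bytes(data: bytes) -> Optional[str]:
    """Guess a media type by comparing raw bytes; no charset is ever chosen."""
    # Strip leading ASCII whitespace bytes for text-like signatures only.
    stripped = data.lstrip(ASCII_WHITESPACE)
    for prefix, media_type, skip_ws in SIGNATURES:
        candidate = stripped if skip_ws else data
        if candidate.startswith(prefix):
            return media_type
    # No signature matched; the caller must fall back to something else.
    return None
```

In this sketch an ASCII-compatible XML document is recognized from its first bytes, while the same document encoded as UTF-16 matches nothing and returns `None`. Inputs of that second kind are where the question of determining the charset before scanning text comes up.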
On Sun, Oct 23, 2011 at 8:32 PM, Larry Masinter <[email protected]> wrote:
> I know it's complicated, but scanning text is necessarily part of determining
> which application/something+xml you have. I think (but should really check
> before saying this) that XML media type registrations describe what the
> DOCTYPE or XML namespace or root element are, and that, to properly "sniff"
> them, you'd have to scan text. But before you scan text, you have to
> determine charset.
>
> So if we're going to support sniffing of media types in general, I don't see
> how we can do that without also specifying charset determination.
>
> Larry
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of
> Adam Barth
> Sent: Sunday, October 23, 2011 8:28 PM
> To: Tobias Gondrom
> Cc: [email protected]
> Subject: Re: [websec] #22: content-type sniffing should include charset
> sniffing
>
> The charset sniffing is also complicated by the fact that sometimes user
> agents need to parse some of the HTML to find a <meta> element.
> In some situations, user agents need to restart the parsing algorithm, which
> is quite delicate and better to describe in the same document as HTML parsing
> (at least for use by HTML processing engines).
>
> Adam
>
>
> On Sun, Oct 23, 2011 at 8:24 PM, Tobias Gondrom <[email protected]>
> wrote:
>> <hat="individual">
>> I tend not to agree with that.
>>
>> The fact that charset sniffing might happen at the same time as
>> mime-sniffing does not seem like a strong argument to include this in
>> the draft.
>>
>> Furthermore I would rather have these issues separate:
>> First you determine the content-type and then after that you may want
>> to determine the charset used within that content-type (if you really
>> have to sniff the charset). I can also imagine that charset sniffing
>> algorithm might be depending on the application identified by the
>> sniffed mime-type, which again would speak against throwing it in together
>> with mime-sniffing....
>>
>> Kind regards, Tobias
>>
>>
>>
>> On 24/10/11 00:55, websec issue tracker wrote:
>>>
>>> #22: content-type sniffing should include charset sniffing
>>>
>>> the HTML5 spec contains some algorithms for sniffing charset,
>>> overriding labeled charset, etc.
>>>
>>> MIME parameters like charset are as much a part of the content-type
>>> as the base internet media type, and any sniffing of parameters and
>>> other metadata (overriding content-type or guessing where it is not
>>> supplied or wrong) should be included in this document, since the
>>> sniffing will happen at the same time.
>>>
>>
_______________________________________________
websec mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/websec
