I was talking about the necessary dependency of the specifications -- that you 
couldn't specify media type sniffing completely without making at least a 
normative reference to charset sniffing. 

The fact that the code works that way is evidence, of course, but we're not 
talking about the possibility of implementation (where a single implementation 
is evidence) but rather the orthogonality of interfaces (where the question is 
whether ALL implementations must follow this pattern).
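
To illustrate the dependency, here is a minimal sketch in Python. It assumes a 
toy sniffer that identifies an XML document by its root element name; the 
helper names (sniff_charset, sniff_xml_root) are hypothetical, and the charset 
step is a simplification (BOM, then XML declaration, then a UTF-8 default), 
not the algorithm from HTML5 or from any draft.

```python
import codecs
import re

# Simplified BOM table (assumption: UTF-32 and other encodings are ignored here).
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

# Matches an ASCII-compatible XML declaration that names an encoding.
XML_DECL = re.compile(
    rb'<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def sniff_charset(data: bytes) -> str:
    """Guess the charset: BOM first, then the XML declaration, else UTF-8."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    match = XML_DECL.match(data)
    if match:
        label = match.group(1).decode("ascii")
        try:
            return codecs.lookup(label).name
        except LookupError:
            pass  # unknown label: fall through to the default
    return "utf-8"


def sniff_xml_root(data: bytes):
    """Return the first element name, which requires decoding the bytes first."""
    charset = sniff_charset(data)
    for bom, _ in BOMS:
        if data.startswith(bom):
            data = data[len(bom):]
            break
    text = data.decode(charset, errors="replace")
    # "<?xml" and "<!DOCTYPE" do not match, so the first hit is an element
    # start tag (comments containing tags are not handled in this sketch).
    match = re.search(r"<([A-Za-z_][\w.\-]*(?::[\w.\-]+)?)", text)
    return match.group(1) if match else None


utf8_doc = b'<?xml version="1.0"?><svg xmlns="http://www.w3.org/2000/svg"/>'
utf16_doc = codecs.BOM_UTF16_LE + '<?xml version="1.0"?><svg/>'.encode("utf-16-le")

print(sniff_xml_root(utf8_doc))   # root element of the UTF-8 document
print(sniff_xml_root(utf16_doc))  # same root, found only after charset sniffing
print(b"<svg" in utf16_doc)       # a byte-level search misses it in UTF-16
```

The UTF-16 case is the point: a byte pattern such as "<svg" never appears in 
that document, so the media type step cannot be written down without saying 
how the charset was determined first.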

Larry




-----Original Message-----
From: Adam Barth [mailto:[email protected]] 
Sent: Sunday, October 23, 2011 8:37 PM
To: Larry Masinter
Cc: Tobias Gondrom; [email protected]
Subject: Re: [websec] #22: content-type sniffing should include charset sniffing

I mean, that's how the code works, so it must be possible.  :)

Adam


On Sun, Oct 23, 2011 at 8:32 PM, Larry Masinter <[email protected]> wrote:
> I know it's complicated, but scanning text is necessarily part of determining 
> which application/something+xml  you have.  I think (but should really check 
> before saying this) that XML media type registrations describe what the 
> DOCTYPE or XML namespace or root element are, and that, to properly "sniff" 
> them, you'd have to scan text. But before you scan text, you have to 
> determine charset.
>
> So if we're going to support sniffing of media types in general, I don't see 
> how we can do that without also specifying charset determination.
>
>
>
> Larry
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On 
> Behalf Of Adam Barth
> Sent: Sunday, October 23, 2011 8:28 PM
> To: Tobias Gondrom
> Cc: [email protected]
> Subject: Re: [websec] #22: content-type sniffing should include 
> charset sniffing
>
> The charset sniffing is also complicated by the fact that sometimes user 
> agents need to parse some of the HTML to find a <meta> element.
> In some situations, user agents need to restart the parsing algorithm, which 
> is quite delicate and better to describe in the same document as HTML parsing 
> (at least for use by HTML processing engines).
>
> Adam
>
>
> On Sun, Oct 23, 2011 at 8:24 PM, Tobias Gondrom <[email protected]> 
> wrote:
>> <hat="individual">
>> I tend not to agree with that.
>>
>> The fact that charset sniffing might happen at the same time as 
>> mime-sniffing does not seem like a strong argument to include this in 
>> the draft.
>>
>> Furthermore I would rather have these issues separate:
>> First you determine the content-type and then after that you may want 
>> to determine the charset used within that content-type (if you really 
>> have to sniff the charset). I can also imagine that the charset sniffing 
>> algorithm might depend on the application identified by the 
>> sniffed mime-type, which again would speak against throwing it in together 
>> with mime-sniffing.
>>
>> Kind regards, Tobias
>>
>>
>>
>> On 24/10/11 00:55, websec issue tracker wrote:
>>>
>>> #22: content-type sniffing should include charset sniffing
>>>
>>>  the HTML5 spec contains some algorithms for sniffing charset, overriding
>>>  labeled charset, etc.
>>>
>>>  MIME parameters like charset are as much a part of the content-type as the
>>>  base internet media type, and any sniffing of parameters and other
>>>  metadata (overriding content-type or guessing where it is not supplied or
>>>  wrong) should be included in this document, since the sniffing will happen
>>>  at the same time.
>>>
>>
>> _______________________________________________
>> websec mailing list
>> [email protected]
>> https://www.ietf.org/mailman/listinfo/websec
>>
>
