On 24/10/11 04:21, Larry Masinter wrote:
- in which way is it more certain that there is no mislabeled PDF than a 
mislabeled jpg or mislabeled rtf?
I don't think this is relevant. There are likely mislabeled PDFs. But I had 
specific feedback from implementors of PDF readers that sniffing PDF from other 
content types resulted in a worse situation than not sniffing. I don't have any 
information on jpg or rtf.

Sniffing should only be done when it is justified by an improved user 
experience over not sniffing.
<hat="individual">
Fine by me. Browsers and operating systems started sniffing for exactly that reason in the first place: to improve the user experience.

The reason I am asking so specifically about the reasons for not sniffing PDF is the following. In general, I can imagine a number of scenarios where sniffing is disadvantageous (i.e. leads to security risks) for certain file types. The main threat with sniffing is that it leads to false positives being handed to the application. Yet it seems the browser vendors sniff anyway, which is what led us to write this draft in the first place.

If we exclude one specific file type from sniffing, there are two interesting points:
1. We should have a compelling explanation of why browsers and operating systems should not sniff it, so that they will follow the RFC.
2. These reasons may well hold for other file types too. Looking at them, we might deduce that they apply to other content types as well, which would again be very useful information.


I think the burden of evidence is "opt in": we should only sniff content 
when there is evidence of mislabeled content for which sniffing actually improves 
something, and the improvement outweighs other considerations.

- what about scenarios in which there is no content-type (e.g. ftp, 
filesystem), should in this case sniffing not be done?
I didn't get any feedback on that. I don't know any workflows where valid PDF 
doesn't carry a file type label somehow (if only the file extension .pdf), so 
maybe sniffing based on file content itself doesn't matter.

(Maybe this is another issue? I just wonder whether the algorithm for "no content-type" is 
the same, or needs to be the same, as the algorithm for "content-type via HTTP".)

I can imagine that the cases "no content-type given" and "wrong content-type given" could be treated differently, but I am not sure about it.
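
To make the distinction between the two cases concrete, here is a minimal Python sketch of how they could be handled by separate policies. Everything in it is an assumption for illustration and not taken from the draft: the names choose_type and NEVER_OVERRIDE_TO, the fallback type, and the tiny signature table (which only uses the well-known "%PDF-" and JPEG magic bytes). Whether the "wrong content-type" branch should override a declared label at all is exactly the open question.

```python
from typing import Optional

# Illustrative signature table (assumption): two well-known magic numbers only.
SIGNATURES = [
    (b"%PDF-", "application/pdf"),
    (b"\xff\xd8\xff", "image/jpeg"),
]

# Hypothetical opt-out list: types we never sniff *into* when a label exists.
NEVER_OVERRIDE_TO = {"application/pdf"}

# Assumed fallback when nothing is declared and nothing matches.
FALLBACK = "application/octet-stream"


def sniff(data: bytes) -> Optional[str]:
    """Return the type suggested by the leading bytes, or None if unknown."""
    for magic, mime in SIGNATURES:
        if data.startswith(magic):
            return mime
    return None


def choose_type(declared: Optional[str], data: bytes) -> str:
    """Pick a type, treating 'no label' and 'possibly wrong label' separately."""
    sniffed = sniff(data)
    if declared is None:
        # Case 1: no content-type given (e.g. ftp, filesystem).
        return sniffed or FALLBACK
    # Case 2: a content-type was given but may be wrong.
    if sniffed is None or sniffed == declared or sniffed in NEVER_OVERRIDE_TO:
        return declared
    return sniffed
```

With this sketch, choose_type(None, b"%PDF-1.4") yields application/pdf, while choose_type("text/plain", b"%PDF-1.4") keeps text/plain because PDF is on the opt-out list.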





Larry


_______________________________________________
websec mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/websec
