Re: [whatwg] [mimesniff] Sniffing archives
(It seems I somehow managed to not send this to the list the first time around. Addendum included.)

On Tue, Dec 4, 2012 at 2:40 AM, Adam Barth w...@adambarth.com wrote:
> On Mon, Dec 3, 2012 at 12:39 PM, Julian Reschke julian.resc...@gmx.de wrote:
>> On 2012-11-29 20:25, Adam Barth wrote:
>>> These are supported in Chrome. That's what causes the download. From
>>
>> Can you elaborate about what you mean by supported? Chrome sniffs for the type, and then offers to download as a result of that sniffing? How is that different from not sniffing in the first place?
>
> They might otherwise be treated as a type that can be displayed (rather than downloaded).

But isn't the whole point of the spec to eliminate such accidental sniffing? Anything not explicitly sniffed based on the first bytes of the file will be assumed to be either 'application/octet-stream' or 'text/plain', depending on whether there are binary bytes present.

The old IE behavior that you were investigating in your 2009 paper, where you sniff beyond the first few bytes to find embedded HTML, is eliminated with this sniffing algorithm. There is no case where you would accidentally sniff something as scriptable, if you were following the algorithm correctly.

Or am I missing something?

P.S. Note also that I have previously defined what it means to be supported by the user agent:

A valid media type is supported by the user agent if the user agent has the capability to interpret a resource of that media type and present it to the user.

http://mimesniff.spec.whatwg.org/#supported-by-the-user-agent

--
Gordon P. Hemsley
m...@gphemsley.org
http://gphemsley.org/ • http://gphemsley.org/blog/
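The fallback rule described in this message can be sketched roughly as follows. This is only an illustration under stated assumptions: the set of "binary" byte values is my recollection of the mimesniff draft (0x00-0x08, 0x0B, 0x0E-0x1A, 0x1C-0x1F), `fallback_type` is a hypothetical helper name, and the real algorithm has further steps that are omitted here.

```python
# Assumption: byte values the mimesniff draft treats as "binary data bytes".
BINARY_BYTES = frozenset(
    list(range(0x00, 0x09)) + [0x0B] + list(range(0x0E, 0x1B)) + list(range(0x1C, 0x20))
)


def fallback_type(header: bytes) -> str:
    """Hypothetical helper: pick the default type for data that matched no signature."""
    if any(byte in BINARY_BYTES for byte in header):
        return "application/octet-stream"
    return "text/plain"


# Unlabeled markup is not promoted to a scriptable type by this fallback.
print(fallback_type(b"<script>alert(1)</script>"))
print(fallback_type(b"\x00\x01\x02\x03"))
```

Under this sketch the fallback never yields a scriptable type, which is the property the message relies on.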
Re: [whatwg] [mimesniff] Sniffing archives
On Tue, Dec 4, 2012 at 9:40 AM, Adam Barth w...@adambarth.com wrote:
> Also, some user agents treat downloads of ZIP archives differently than other sorts of download (e.g., they might offer to unzip them).

Which user agents?

For this use case, merely sniffing for the zip magic number is inadequate, because you really don’t want to offer to unzip EPUB, ODF, OOXML, XPS, InDesign, etc. files.

--
Henri Sivonen
hsivo...@iki.fi
http://hsivonen.iki.fi/
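To illustrate the point about ZIP-based formats, here is a small sketch that builds two archives in memory and shows that both begin with the same four bytes. It assumes the EPUB convention that an uncompressed `mimetype` file is the first entry in the container; ODF packages conventionally do the same, but other ZIP-based formats such as OOXML do not, so this probe does not generalize. The helper names `make_zip` and `declared_container_type` are hypothetical.

```python
import io
import zipfile

ZIP_MAGIC = b"PK\x03\x04"  # local file header signature at the start of a ZIP


def make_zip(entries):
    """Build an in-memory ZIP from (name, data) pairs, stored uncompressed."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
        for name, data in entries:
            zf.writestr(name, data)
    return buf.getvalue()


def declared_container_type(data):
    """Return the type declared by a leading stored 'mimetype' entry, else None."""
    if not data.startswith(ZIP_MAGIC):
        return None
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        infos = zf.infolist()
        if (infos and infos[0].filename == "mimetype"
                and infos[0].compress_type == zipfile.ZIP_STORED):
            return zf.read(infos[0]).decode("ascii", "replace").strip()
    return None


plain = make_zip([("readme.txt", b"hello")])
epub_like = make_zip([("mimetype", b"application/epub+zip"),
                      ("content.opf", b"<package/>")])

# Both share the ZIP magic number, so a prefix match cannot tell them apart.
print(plain[:4] == epub_like[:4])
print(declared_container_type(plain), declared_container_type(epub_like))
```

A signature match on the first four bytes alone would label both files as ZIP, which is the misleading result described above.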
Re: [whatwg] [mimesniff] Sniffing archives
On Mon, Dec 3, 2012 at 11:59 PM, Julian Reschke julian.resc...@gmx.de wrote:
> On 2012-12-04 08:40, Adam Barth wrote:
>> On Mon, Dec 3, 2012 at 12:39 PM, Julian Reschke julian.resc...@gmx.de wrote:
>>> On 2012-11-29 20:25, Adam Barth wrote:
>>>> These are supported in Chrome. That's what causes the download. From
>>>
>>> Can you elaborate about what you mean by supported? Chrome sniffs for the type, and then offers to download as a result of that sniffing? How is that different from not sniffing in the first place?
>>
>> They might otherwise be treated as a type that can be displayed (rather than downloaded). Also, some user agents treat downloads of
>
> Do you have an example for that case?
>
>> ZIP archives differently than other sorts of download (e.g., they might offer to unzip them).
>
> Out of curiosity: which?

Safari.

Adam
Re: [whatwg] [mimesniff] Sniffing archives
On Tue, Dec 4, 2012 at 11:07 AM, Adam Barth w...@adambarth.com wrote:
> On Mon, Dec 3, 2012 at 11:59 PM, Julian Reschke julian.resc...@gmx.de wrote:
>> On 2012-12-04 08:40, Adam Barth wrote:
>>> They might otherwise be treated as a type that can be displayed (rather than downloaded). Also, some user agents treat downloads of
>>
>> Do you have an example for that case?
>>
>>> ZIP archives differently than other sorts of download (e.g., they might offer to unzip them).
>>
>> Out of curiosity: which?
>
> Safari.
>
> Adam

To be more specific:

(1) Safari doesn't appear to prompt the user for any downloads. It just automatically downloads any file it can't handle.

(2) If you allow Safari to open safe files that it downloads, ZIP appears to be one of them. Gzip and RAR, however, do not.

So this isn't the most convincing argument.

--
Gordon P. Hemsley
m...@gphemsley.org
http://gphemsley.org/ • http://gphemsley.org/blog/
Re: [whatwg] [mimesniff] Sniffing archives
On Tue, 4 Dec 2012, Gordon P. Hemsley wrote:
> To be more specific:
>
> (1) Safari doesn't appear to prompt the user for any downloads. It just automatically downloads any file it can't handle.
>
> (2) If you allow Safari to open safe files that it downloads, ZIP appears to be one of them. Gzip and RAR, however, do not.
>
> So this isn't the most convincing argument.

In particular, it doesn't seem like this needs to be defined in the MIME sniff spec. There's no harm in the browser sniffing more non-scripted types than the spec says, if it's just for labeling or handling at the OS level. It's when one browser handles something as completely safe and another handles something as live, or when a browser displays a file differently than another browser, that there's a problem, really.

--
Ian Hickson
http://ln.hixie.ch/
Things that are impossible just take longer.
Re: [whatwg] [mimesniff] Sniffing archives
On 2012-11-29 20:25, Adam Barth wrote:
> These are supported in Chrome. That's what causes the download. From

Can you elaborate about what you mean by supported? Chrome sniffs for the type, and then offers to download as a result of that sniffing? How is that different from not sniffing in the first place?

> ...your comment, it's not clear to me if you are correctly reverse engineering existing user agents. The techniques we used to create this list originally are quite sophisticated and involved a massive amount of data [1]. It would be a shame if you destroyed that work because you didn't understand it.
>
> Adam
>
> [1] http://www.adambarth.com/papers/2009/barth-caballero-song.pdf
> ...

Understood; but on the other hand if there's a chance to simplify things then it makes sense to discuss this, even if that would involve changing some of the implementations.

Best regards, Julian
Re: [whatwg] [mimesniff] Sniffing archives
On Mon, Dec 3, 2012 at 12:39 PM, Julian Reschke julian.resc...@gmx.de wrote:
> On 2012-11-29 20:25, Adam Barth wrote:
>> These are supported in Chrome. That's what causes the download. From
>
> Can you elaborate about what you mean by supported? Chrome sniffs for the type, and then offers to download as a result of that sniffing? How is that different from not sniffing in the first place?

They might otherwise be treated as a type that can be displayed (rather than downloaded). Also, some user agents treat downloads of ZIP archives differently than other sorts of download (e.g., they might offer to unzip them).

Adam
Re: [whatwg] [mimesniff] Sniffing archives
On 2012-12-04 08:40, Adam Barth wrote:
> On Mon, Dec 3, 2012 at 12:39 PM, Julian Reschke julian.resc...@gmx.de wrote:
>> On 2012-11-29 20:25, Adam Barth wrote:
>>> These are supported in Chrome. That's what causes the download. From
>>
>> Can you elaborate about what you mean by supported? Chrome sniffs for the type, and then offers to download as a result of that sniffing? How is that different from not sniffing in the first place?
>
> They might otherwise be treated as a type that can be displayed (rather than downloaded). Also, some user agents treat downloads of

Do you have an example for that case?

> ZIP archives differently than other sorts of download (e.g., they might offer to unzip them).

Out of curiosity: which?

Best regards, Julian
Re: [whatwg] [mimesniff] Sniffing archives
To be clear, I'm asking this because I would like to remove the sniffing of archive types from the mimesniff spec if there aren't any valid use cases.

On Wed, Nov 28, 2012 at 12:18 PM, Gordon P. Hemsley gphems...@gmail.com wrote:
> The mimesniff spec currently includes signatures for ZIP, gzip, and RAR archive formats. However, no major browser seems to support them natively (they all prompt for download), and it's not clear whether the type detection is a product of the browser code or the OS, or whether it is used beyond choosing an appropriate file extension for the download.
>
> Are there any valid use cases for explicitly sniffing archive formats instead of letting them default to application/octet-stream like other binary files would?
>
> Note that Henri Sivonen has previously raised the issue that ZIP-based formats (like office suite documents), for example, would be misleadingly sniffed as ZIP files, and there is no easy way around that.
>
> --
> Gordon P. Hemsley
> m...@gphemsley.org
> http://gphemsley.org/ • http://gphemsley.org/blog/

--
Gordon P. Hemsley
m...@gphemsley.org
http://gphemsley.org/ • http://gphemsley.org/blog/
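For readers unfamiliar with the signatures in question, the check amounts to a prefix match on the first few bytes of the resource. The sketch below is illustrative only: the byte patterns and media types are the ZIP, gzip, and RAR entries as I understand them from the mimesniff draft, and `sniff_archive` is a hypothetical helper name, not something defined by the spec.

```python
from typing import Optional

# Assumption: archive signatures as (leading bytes, media type) pairs.
ARCHIVE_SIGNATURES = [
    (b"\x1f\x8b\x08", "application/x-gzip"),
    (b"PK\x03\x04", "application/zip"),
    (b"Rar \x1a\x07\x00", "application/x-rar-compressed"),
]


def sniff_archive(header: bytes) -> Optional[str]:
    """Hypothetical helper: return an archive media type if the header matches."""
    for pattern, media_type in ARCHIVE_SIGNATURES:
        if header.startswith(pattern):
            return media_type
    return None  # the caller would fall back to application/octet-stream


print(sniff_archive(b"PK\x03\x04\x14\x00"))
print(sniff_archive(b"\x89PNG\r\n\x1a\n"))
```

Removing these entries, as proposed above, would make all three formats fall through to the generic binary default.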
Re: [whatwg] [mimesniff] Sniffing archives
These are supported in Chrome. That's what causes the download. From your comment, it's not clear to me if you are correctly reverse engineering existing user agents. The techniques we used to create this list originally are quite sophisticated and involved a massive amount of data [1]. It would be a shame if you destroyed that work because you didn't understand it.

Adam

[1] http://www.adambarth.com/papers/2009/barth-caballero-song.pdf

On Thu, Nov 29, 2012 at 10:42 AM, Gordon P. Hemsley gphems...@gmail.com wrote:
> To be clear, I'm asking this because I would like to remove the sniffing of archive types from the mimesniff spec if there aren't any valid use cases.
>
> On Wed, Nov 28, 2012 at 12:18 PM, Gordon P. Hemsley gphems...@gmail.com wrote:
>> The mimesniff spec currently includes signatures for ZIP, gzip, and RAR archive formats. However, no major browser seems to support them natively (they all prompt for download), and it's not clear whether the type detection is a product of the browser code or the OS, or whether it is used beyond choosing an appropriate file extension for the download.
>>
>> Are there any valid use cases for explicitly sniffing archive formats instead of letting them default to application/octet-stream like other binary files would?
>>
>> Note that Henri Sivonen has previously raised the issue that ZIP-based formats (like office suite documents), for example, would be misleadingly sniffed as ZIP files, and there is no easy way around that.
>>
>> --
>> Gordon P. Hemsley
>> m...@gphemsley.org
>> http://gphemsley.org/ • http://gphemsley.org/blog/
>
> --
> Gordon P. Hemsley
> m...@gphemsley.org
> http://gphemsley.org/ • http://gphemsley.org/blog/