Re: [whatwg] [mimesniff] Sniffing archives

2012-12-05 Thread Gordon P. Hemsley
(It seems I somehow managed to not send this to the list the first
time around. Addendum included.)

On Tue, Dec 4, 2012 at 2:40 AM, Adam Barth w...@adambarth.com wrote:
 On Mon, Dec 3, 2012 at 12:39 PM, Julian Reschke julian.resc...@gmx.de wrote:
 On 2012-11-29 20:25, Adam Barth wrote:
 These are supported in Chrome.  That's what causes the download.  From

 Can you elaborate about what you mean by supported? Chrome sniffs for the
 type, and then offers to download as a result of that sniffing? How is that
 different from not sniffing in the first place?

 They might otherwise be treated as a type that can be displayed
 (rather than downloaded).

But isn't the whole point of the spec to eliminate such accidental
sniffing? Anything not explicitly sniffed based on the first bytes of
the file will be assumed to be either 'application/octet-stream' or
'text/plain', depending on whether there are binary bytes present.
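
A minimal sketch of that fallback, for illustration only. The function
name is hypothetical, and the set of "binary data bytes" is written
from my reading of the spec (0x00-0x08, 0x0B, 0x0E-0x1A, 0x1C-0x1F),
so treat it as an assumption rather than normative text:

```python
# Assumed set of "binary data bytes": 0x00-0x08, 0x0B, 0x0E-0x1A, 0x1C-0x1F.
# Tab, LF, FF, CR and ESC are deliberately not in the set.
BINARY_BYTES = frozenset(
    list(range(0x00, 0x09))
    + [0x0B]
    + list(range(0x0E, 0x1B))
    + list(range(0x1C, 0x20))
)


def fallback_type(header: bytes) -> str:
    """Hypothetical helper: pick the default type when no signature matched."""
    if any(byte in BINARY_BYTES for byte in header):
        return "application/octet-stream"
    return "text/plain"


print(fallback_type(b"just some text\r\n"))
print(fallback_type(b"\x00\x01\x02 not text"))
```

Neither result is a scriptable type, which is the point being made
above.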

The old IE behavior that you were investigating in your 2009 paper,
where you sniff beyond the first few bytes to find embedded HTML, is
eliminated with this sniffing algorithm. There is no case where you
would accidentally sniff something as scriptable, if you were
following the algorithm correctly.

Or am I missing something?

P.S.

Note also that I have previously defined what it means to be
supported by the user agent:

A valid media type is supported by the user agent if the user agent
has the capability to interpret a resource of that media type and
present it to the user.

http://mimesniff.spec.whatwg.org/#supported-by-the-user-agent

-- 
Gordon P. Hemsley
m...@gphemsley.org
http://gphemsley.org/ • http://gphemsley.org/blog/


Re: [whatwg] [mimesniff] Sniffing archives

2012-12-04 Thread Henri Sivonen
On Tue, Dec 4, 2012 at 9:40 AM, Adam Barth w...@adambarth.com wrote:
 Also, some user agents treat downloads of
 ZIP archives differently than other sorts of download (e.g., they
 might offer to unzip them).

Which user agents? For this use case, merely sniffing for the zip
magic number is inadequate, because you really don’t want to offer to
unzip EPUB, ODF, OOXML, XPS, InDesign, etc. files.
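
To illustrate the problem, here is a small sketch (helper names are
hypothetical) that builds a plain ZIP and an EPUB-style container in
memory and applies a magic-number-only check to both. It assumes the
usual "PK\x03\x04" local file header signature and the EPUB convention
of a leading uncompressed "mimetype" entry:

```python
import io
import zipfile

# Local file header signature that opens a non-empty ZIP archive.
ZIP_MAGIC = b"PK\x03\x04"


def make_archive(name: str, data: str) -> bytes:
    """Build a one-entry ZIP in memory (stored, not compressed)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(name, data)
    return buf.getvalue()


def looks_like_zip(header: bytes) -> bool:
    """Magic-number-only check, the kind of test discussed above."""
    return header.startswith(ZIP_MAGIC)


plain_zip = make_archive("readme.txt", "hello")
# EPUB containers begin with a "mimetype" entry naming the real format.
epub_like = make_archive("mimetype", "application/epub+zip")

print(looks_like_zip(plain_zip), looks_like_zip(epub_like))
```

Both checks come back true, so the magic number alone cannot tell a
user agent whether offering to unzip is appropriate.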

-- 
Henri Sivonen
hsivo...@iki.fi
http://hsivonen.iki.fi/


Re: [whatwg] [mimesniff] Sniffing archives

2012-12-04 Thread Adam Barth
On Mon, Dec 3, 2012 at 11:59 PM, Julian Reschke julian.resc...@gmx.de wrote:
 On 2012-12-04 08:40, Adam Barth wrote:
 On Mon, Dec 3, 2012 at 12:39 PM, Julian Reschke julian.resc...@gmx.de
 wrote:
 On 2012-11-29 20:25, Adam Barth wrote:
 These are supported in Chrome.  That's what causes the download.  From

 Can you elaborate about what you mean by supported? Chrome sniffs for
 the
 type, and then offers to download as a result of that sniffing? How is
 that
 different from not sniffing in the first place?

 They might otherwise be treated as a type that can be displayed
 (rather than downloaded).  Also, some user agents treat downloads of

 Do you have an example for that case?

 ZIP archives differently than other sorts of download (e.g., they
 might offer to unzip them).

 Out of curiosity: which?

Safari.

Adam


Re: [whatwg] [mimesniff] Sniffing archives

2012-12-04 Thread Gordon P. Hemsley
On Tue, Dec 4, 2012 at 11:07 AM, Adam Barth w...@adambarth.com wrote:
 On Mon, Dec 3, 2012 at 11:59 PM, Julian Reschke julian.resc...@gmx.de wrote:
 On 2012-12-04 08:40, Adam Barth wrote:
 They might otherwise be treated as a type that can be displayed
 (rather than downloaded).  Also, some user agents treat downloads of

 Do you have an example for that case?

 ZIP archives differently than other sorts of download (e.g., they
 might offer to unzip them).

 Out of curiosity: which?

 Safari.

 Adam

To be more specific:

(1) Safari doesn't appear to prompt the user for any downloads. It
just automatically downloads any file it can't handle.
(2) If you allow Safari to open safe files that it downloads, ZIP
appears to be one of them. Gzip and RAR, however, are not.

So this isn't the most convincing argument.

-- 
Gordon P. Hemsley
m...@gphemsley.org
http://gphemsley.org/ • http://gphemsley.org/blog/


Re: [whatwg] [mimesniff] Sniffing archives

2012-12-04 Thread Ian Hickson
On Tue, 4 Dec 2012, Gordon P. Hemsley wrote:

 To be more specific:
 
 (1) Safari doesn't appear to prompt the user for any downloads. It
 just automatically downloads any file it can't handle.
 (2) If you allow Safari to open safe files that it downloads, ZIP
 appears to be one of them. Gzip and RAR, however, are not.
 
 So this isn't the most convincing argument.

In particular, it doesn't seem like this needs to be defined in the MIME 
sniff spec. There's no harm in the browser sniffing more non-scripted 
types than the spec says, if it's just for labeling or handling at the OS 
level. It's when one browser handles something as completely safe and 
another handles something as live, or when a browser displays a file 
differently than another browser, that there's a problem, really.

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] [mimesniff] Sniffing archives

2012-12-03 Thread Julian Reschke

On 2012-11-29 20:25, Adam Barth wrote:

These are supported in Chrome.  That's what causes the download.  From


Can you elaborate about what you mean by supported? Chrome sniffs for 
the type, and then offers to download as a result of that sniffing? How 
is that different from not sniffing in the first place?



...your comment, it's not clear to me if you are correctly reverse
engineering existing user agents.  The techniques we used to create
this list originally are quite sophisticated and involved a massive
amount of data [1].  It would be a shame if you destroyed that work
because you didn't understand it.

Adam

[1] http://www.adambarth.com/papers/2009/barth-caballero-song.pdf
...


Understood; but on the other hand, if there's a chance to simplify things 
then it makes sense to discuss this, even if that would involve changing 
some of the implementations.


Best regards, Julian


Re: [whatwg] [mimesniff] Sniffing archives

2012-12-03 Thread Adam Barth
On Mon, Dec 3, 2012 at 12:39 PM, Julian Reschke julian.resc...@gmx.de wrote:
 On 2012-11-29 20:25, Adam Barth wrote:
 These are supported in Chrome.  That's what causes the download.  From

 Can you elaborate about what you mean by supported? Chrome sniffs for the
 type, and then offers to download as a result of that sniffing? How is that
 different from not sniffing in the first place?

They might otherwise be treated as a type that can be displayed
(rather than downloaded).  Also, some user agents treat downloads of
ZIP archives differently than other sorts of download (e.g., they
might offer to unzip them).

Adam


Re: [whatwg] [mimesniff] Sniffing archives

2012-12-03 Thread Julian Reschke

On 2012-12-04 08:40, Adam Barth wrote:

On Mon, Dec 3, 2012 at 12:39 PM, Julian Reschke julian.resc...@gmx.de wrote:

On 2012-11-29 20:25, Adam Barth wrote:

These are supported in Chrome.  That's what causes the download.  From


Can you elaborate about what you mean by supported? Chrome sniffs for the
type, and then offers to download as a result of that sniffing? How is that
different from not sniffing in the first place?


They might otherwise be treated as a type that can be displayed
(rather than downloaded).  Also, some user agents treat downloads of


Do you have an example for that case?


ZIP archives differently than other sorts of download (e.g., they
might offer to unzip them).


Out of curiosity: which?

Best regards, Julian



Re: [whatwg] [mimesniff] Sniffing archives

2012-11-29 Thread Gordon P. Hemsley
To be clear, I'm asking this because I would like to remove the
sniffing of archive types from the mimesniff spec if there aren't any
valid use cases.

On Wed, Nov 28, 2012 at 12:18 PM, Gordon P. Hemsley gphems...@gmail.com wrote:
 The mimesniff spec currently includes signatures for ZIP, gzip, and
 RAR archive formats. However, no major browser seems to support them
 natively (they all prompt for download), and it's not clear whether
 the type detection is a product of the browser code or the OS, or
 whether it is used beyond choosing an appropriate file extension for
 the download.

 Are there any valid use cases for explicitly sniffing archive formats
 instead of letting them default to application/octet-stream like other
 binary files would? Note that Henri Sivonen has previously raised the
 issue that ZIP-based formats (like office suite documents), for
 example, would be misleadingly sniffed as ZIP files, and there is no
 easy way around that.
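
For reference, the archive sniffing in question amounts to a prefix
match along the following lines. This is a sketch only: the function
name is hypothetical, and the byte patterns and media types are given
as I understand the current spec text, so check them against the spec
before relying on them:

```python
from typing import Optional

# (byte pattern, media type) pairs; assumed to mirror the spec's archive table.
ARCHIVE_SIGNATURES = [
    (b"\x1f\x8b\x08", "application/x-gzip"),
    (b"PK\x03\x04", "application/zip"),
    (b"Rar \x1a\x07\x00", "application/x-rar-compressed"),
]


def sniff_archive(header: bytes) -> Optional[str]:
    """Return the archive media type whose signature opens the resource, if any."""
    for pattern, media_type in ARCHIVE_SIGNATURES:
        if header.startswith(pattern):
            return media_type
    return None


print(sniff_archive(b"PK\x03\x04rest-of-file"))
print(sniff_archive(b"plain bytes"))
```

Removing the table would leave such resources to the generic binary
fallback, i.e. application/octet-stream.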

 --
 Gordon P. Hemsley
 m...@gphemsley.org
 http://gphemsley.org/ • http://gphemsley.org/blog/



-- 
Gordon P. Hemsley
m...@gphemsley.org
http://gphemsley.org/ • http://gphemsley.org/blog/


Re: [whatwg] [mimesniff] Sniffing archives

2012-11-29 Thread Adam Barth
These are supported in Chrome.  That's what causes the download.  From
your comment, it's not clear to me if you are correctly reverse
engineering existing user agents.  The techniques we used to create
this list originally are quite sophisticated and involved a massive
amount of data [1].  It would be a shame if you destroyed that work
because you didn't understand it.

Adam

[1] http://www.adambarth.com/papers/2009/barth-caballero-song.pdf


On Thu, Nov 29, 2012 at 10:42 AM, Gordon P. Hemsley gphems...@gmail.com wrote:
 To be clear, I'm asking this because I would like to remove the
 sniffing of archive types from the mimesniff spec if there aren't any
 valid use cases.

 On Wed, Nov 28, 2012 at 12:18 PM, Gordon P. Hemsley gphems...@gmail.com 
 wrote:
 The mimesniff spec currently includes signatures for ZIP, gzip, and
 RAR archive formats. However, no major browser seems to support them
 natively (they all prompt for download), and it's not clear whether
 the type detection is a product of the browser code or the OS, or
 whether it is used beyond choosing an appropriate file extension for
 the download.

 Are there any valid use cases for explicitly sniffing archive formats
 instead of letting them default to application/octet-stream like other
 binary files would? Note that Henri Sivonen has previously raised the
 issue that ZIP-based formats (like office suite documents), for
 example, would be misleadingly sniffed as ZIP files, and there is no
 easy way around that.

 --
 Gordon P. Hemsley
 m...@gphemsley.org
 http://gphemsley.org/ • http://gphemsley.org/blog/



 --
 Gordon P. Hemsley
 m...@gphemsley.org
 http://gphemsley.org/ • http://gphemsley.org/blog/