Re: [whatwg] Priority between a download and content-disposition

2013-05-08 Thread Gordon P. Hemsley
On Tue, May 7, 2013 at 10:18 PM, Boris Zbarsky bzbar...@mit.edu wrote:
 On 5/7/13 5:54 PM, Gordon P. Hemsley wrote:

 A @download attribute with a value would override both factors, like so:
 (1) Download it.
 (2) A.txt

 Why?

 You say this as if it were obvious, but it's not obvious to me at all...
 What's the reasoning that makes this the desirable behavior?

It's not clear to me which of the two factors you take issue with.

Here's what the spec says:

The download attribute, if present, indicates that the author intends
the hyperlink to be used for downloading a resource. The attribute may
have a value; the value, if any, specifies the default file name that
the author recommends for use in labeling the resource in a local file
system.

I interpret that first sentence to mean that the file should be
downloaded (disposition type = attachment) rather than displayed
(disposition type = inline). The second sentence very clearly suggests
that A.txt would be the filename presented to the user by default in
the save dialog.

 I don't see what the security concerns might be: There is no
 difference here than what is already available

 There is if you allow cross-origin @download.

 There is if you allow untrusted markup on your server and don't sanitize
 away @download (should it be sanitized away?  Unclear).

I'm still not seeing what the problem is. All this does is make the
browser treat the link as if the user followed it and then went
File > Save Page As.

What are the security concerns, cross-origin or otherwise?

 AFAICT, there are no content
 sniffing or cross-domain issues at play.

 But there are; see above.

Well, what I should have said is, there is no content sniffing beyond
what is already done for regular page saves. (The UI can show the MIME
type or format of the file in the download box, as it would for any
file it doesn't handle natively.)

 results when saving a file; they don't do any file extension vs. file
 format checking.

 Uh... that depends on exactly how you save and your OS.  Browsers commonly
 do file extension vs MIME type checking on Windows.  Behavior on other OSes
 varies, and varies across browsers.

 -Boris

Ah, I admit, I'm a bit biased towards Mac in that regard. It's been a
while since I used Windows. But I'd be surprised to find out that the
browser (Firefox, in the case I have in mind) changes the extension in
the suggested filename (e.g. example.php for an HTML file) on
Windows but not on Mac, and I would argue that that perhaps should not
be the case.

--
Gordon P. Hemsley
m...@gphemsley.org
http://gphemsley.org/ • http://gphemsley.org/blog/


Re: [whatwg] Priority between a download and content-disposition

2013-05-08 Thread Boris Zbarsky

On 5/8/13 6:53 AM, Gordon P. Hemsley wrote:

It's not clear to me which of the two factors you take issue with.


The question of which filename takes priority.


I interpret that first sentence to mean that the file should be
downloaded (disposition type = attachment) rather than displayed


Yes.


The second sentence very clearly suggests
that A.txt would be the filename presented to the user by default in
the save dialog.


No, it suggests that A.txt is what the page author recommends.

If, at the same time, B.txt is what the server author recommends, what 
should happen?



There is if you allow cross-origin @download.

There is if you allow untrusted markup on your server and don't sanitize
away @download (should it be sanitized away?  Unclear).


I'm still not seeing what the problem is. All this does is make the
browser treat the link as if the user followed it and then went
File > Save Page As.


No, because in that case the browser will definitely use the 
Content-Disposition filename, not the one from @download.



What are the security concerns, cross-origin or otherwise?


One concern is being able to do this:

  <a download="known-location.pdf"
     href="http://some-bank/statement.pdf">

cross-site and combining it with something that lets you read 
known-location.pdf (e.g. a file://-specific privacy hole that only 
applies to some filenames, or an input type=file that the user has 
already filled in).


Another concern is if you upload a file to an image-sharing site, but it 
happens to be a Windows executable.  Then you link to it with:


  <a download="something.exe" href="http://image-sharing-site/whatever">

and wait for the user to download and double-click.  This relies on the 
user thinking the file came from image-sharing-site so must be an image. 
 UAs may do mitigations here by changing the suggested filename, of course.


Generally, allowing this sort of thing opens up several new phishing and
social engineering attack vectors, and it's not clear that we want that.



Well, what I should have said is, there is no content sniffing beyond
what is already done for regular page saves. (The UI can show the MIME
type or format of the file in the download box, as it would for any
file it doesn't handle natively.)


It can, and users routinely ignore that.


Ah, I admit, I'm a bit biased towards Mac in that regard. It's been a
while since I used Windows. But I'd be surprised to find out that the
browser (Firefox, in the case I have in mind) changes the extension in
the suggested filename (e.g. example.php for an HTML file) on
Windows but not on Mac


It sure used to in some cases, partially in concert with the Windows 
filepicker.  See the (scant) documentation for lpstrDefExt at
http://msdn.microsoft.com/en-us/library/windows/desktop/ms646839%28v=vs.85%29.aspx 
and I suggest actually doing some experimentation across the different 
save variants (save image, save link as, save page as, click on 
something with content-disposition:attachment) on several OSes to see the
behavior.  There is certainly a good bit of code in the various 
file-saving codepaths in Firefox that attempts to ensure extensions 
match MIME types, to forbid saving things with certain extensions, etc.


Also note that Chrome will change extensions on at least @download 
filenames to match the MIME type; I haven't experimented in detail with 
its behavior for other cases.  And I haven't experimented much with 
other browsers in this area, though I expect all have some interesting 
behavior.


-Boris


Re: [whatwg] Priority between a download and content-disposition

2013-05-08 Thread Gordon P. Hemsley
On Wed, May 8, 2013 at 9:43 AM, Boris Zbarsky bzbar...@mit.edu wrote:
 On 5/8/13 6:53 AM, Gordon P. Hemsley wrote:

 It's not clear to me which of the two factors you take issue with.


 The question of which filename takes priority.


 The second sentence very clearly suggests
 that A.txt would be the filename presented to the user by default in
 the save dialog.


 No, it suggests that A.txt is what the page author recommends.

 If, at the same time, B.txt is what the server author recommends, what
 should happen?

I still think @download takes priority.

The Content-Disposition header says, "Never mind what filename the URL
shows; this is really file B.txt."

The @download attribute says, "Never mind what filename this link would
normally be; let's just consider it A.txt."
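
A minimal sketch of that precedence, in Python for brevity. The function
name, its parameters, and the "download" fallback are hypothetical and come
from no spec or browser; it only illustrates the ordering argued for here
(@download value, then Content-Disposition, then the URL).

```python
import posixpath
from typing import Optional
from urllib.parse import unquote, urlsplit


def suggested_filename(url: str,
                       content_disposition_filename: Optional[str] = None,
                       download_attr: Optional[str] = None) -> str:
    """Pick a default filename: @download > Content-Disposition > URL."""
    # A non-empty @download value overrides everything else.
    if download_attr:
        return download_attr
    # Otherwise the server's Content-Disposition filename overrides the URL.
    if content_disposition_filename:
        return content_disposition_filename
    # Fall back to the last path segment of the URL.
    name = posixpath.basename(unquote(urlsplit(url).path))
    # "download" is an arbitrary placeholder for URLs without a usable name.
    return name or "download"
```

With a link to http://example.com/files/C.txt, a header naming B.txt, and
download="A.txt", this returns A.txt; drop the attribute and it returns
B.txt; drop the header too and it returns C.txt.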

 There is if you allow cross-origin @download.

 There is if you allow untrusted markup on your server and don't sanitize
 away @download (should it be sanitized away?  Unclear).


 I'm still not seeing what the problem is. All this does is make the
 browser treat the link as if the user followed it and then went
 File > Save Page As.


 No, because in that case the browser will definitely use the
 Content-Disposition filename, not the one from @download.

OK, technically, the way I phrased it, yes. But what I meant was that
it rolls a bunch of steps into one, telling the browser that the link
should be downloaded and named per suggestion.

 What are the security concerns, cross-origin or otherwise?


 One concern is being able to do this:

   <a download="known-location.pdf"
      href="http://some-bank/statement.pdf">

 cross-site and combining it with something that lets you read
 known-location.pdf (e.g. a file://-specific privacy hole that only applies
 to some filenames, or an input type=file that the user has already filled
 in).

That seems like quite a sophisticated attack that relies on a lot of
things falling into place all at once. I'm not sure that should block
the use of the attribute in and of itself.

 Another concern is if you upload a file to an image-sharing site, but it
 happens to be a Windows executable.  Then you link to it with:

   <a download="something.exe" href="http://image-sharing-site/whatever">

 and wait for the user to download and double-click.  This relies on the user
 thinking the file came from image-sharing-site so must be an image.  UAs may
 do mitigations here by changing the suggested filename, of course.

Then I think it is the responsibility of the UA to sniff the file and
protect the user from such attempts to mislead.

At the very least, the download UI could specify the actual type of
the file that is being downloaded. (More on how to protect users who
don't read that below.)

 Generally, allowing this sort of thing opens up several new phishing and
 social engineering attack vectors, and it's not clear that we want that.

There is a price to freedom, as they say. We shouldn't let a few
rotten apples spoil the whole bunch.

 Well, what I should have said is, there is no content sniffing beyond
 what is already done for regular page saves. (The UI can show the MIME
 type or format of the file in the download box, as it would for any
 file it doesn't handle natively.)


 It can, and users routinely ignore that.


 Ah, I admit, I'm a bit biased towards Mac in that regard. It's been a
 while since I used Windows. But I'd be surprised to find out that the
 browser (Firefox, in the case I have in mind) changes the extension in
 the suggested filename (e.g. example.php for an HTML file) on
 Windows but not on Mac


 It sure used to in some cases, partially in concert with the Windows
 filepicker.  See the (scant) documentation for lpstrDefExt at
 http://msdn.microsoft.com/en-us/library/windows/desktop/ms646839%28v=vs.85%29.aspx
 and I suggest actually doing some experimentation across the different save
 variants (save image, save link as, save page as, click on something with
 content-disposition:attachment) on several OSes to see the behavior.  There
 is certainly a good bit of code in the various file-saving codepaths in
 Firefox that attempts to ensure extensions match MIME types, to forbid
 saving things with certain extensions, etc.

 Also note that Chrome will change extensions on at least @download filenames
 to match the MIME type; I haven't experimented in detail with its behavior
 for other cases.  And I haven't experimented much with other browsers in
 this area, though I expect all have some interesting behavior.

 -Boris

I'm not sure I have the resources to do extensive real-world testing
of this (and that documentation suggests it has been superseded in
more modern OSes), but I don't think it would be unreasonable for the
UA to override or augment the filename suggested by the @download
attribute if it determines that it would not be in the best interest
of the user to use the suggested filename unchanged. Note that the
spec also says: There are no restrictions on allowed values, but
authors are 

Re: [whatwg] Priority between a download and content-disposition

2013-05-08 Thread Boris Zbarsky

On 5/8/13 10:45 AM, Gordon P. Hemsley wrote:

I still think @download takes priority.

The Content-Disposition header says, "Never mind what filename the URL
shows; this is really file B.txt."

The @download attribute says, "Never mind what filename this link would
normally be; let's just consider it A.txt."


OK, that's at least a reasonable argument for the behavior.  ;)


OK, technically, the way I phrased it, yes. But what I meant was that
it rolls a bunch of steps into one, telling the browser that the link
should be downloaded and named per suggestion.


Yes, but the key is _who_ is making the suggestion and why.


That seems like quite a sophisticated attack that relies on a lot of
things falling into place all at once.


Uh... yes.  Like most browser exploits.


Then I think it is the responsibility of the UA to sniff the file and
protect the user from such attempts to mislead.


This is not trivial, since sniffing can easily fail on files that are 
both HTML and png or both HTML and exe at the same time.  There's a good 
bit of research on things like this.
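
A toy illustration of the problem, in Python. The byte string below is not
a real polyglot file and both checks are deliberately naive assumptions;
the only real-world fact used is the 8-byte PNG signature. A sniffer that
looks at leading bytes says "PNG", while a scan of the body still finds
markup.

```python
# The standard 8-byte PNG file signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def looks_like_png(data: bytes) -> bool:
    """Naive signature check: only the leading bytes are inspected."""
    return data.startswith(PNG_SIGNATURE)


def contains_html_marker(data: bytes) -> bool:
    """Naive scan for markup anywhere in the payload."""
    lowered = data.lower()
    return b"<html" in lowered or b"<script" in lowered


# Hypothetical payload: a PNG signature followed by padding and markup.
polyglot = PNG_SIGNATURE + b"\x00" * 16 + b"<html><script>alert(1)</script></html>"

print(looks_like_png(polyglot), contains_html_marker(polyglot))
```

Both checks return True for the same bytes, so neither answer alone says
which extension is safe to suggest.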



There is a price to freedom, as they say. We shouldn't let a few
rotten apples spoil the whole bunch.


If it's going to open users to exploits, we do it all the time.


I'm not sure I have the resources to do extensive real-world testing
of this (and that documentation suggests it has been superseded in
more modern OSes), but I don't think it would be unreasonable for the
UA to override or augment the filename suggested by the @download
attribute if it determines that it would not be in the best interest
of the user to use the suggested filename unchanged.


Phrased that way, using the Content-Disposition filename is a perfectly
valid "override if not in the best interest of the user" behavior, fwiw.


-Boris



Re: [whatwg] Priority between a download and content-disposition

2013-05-08 Thread Gordon P. Hemsley
On Wed, May 8, 2013 at 12:01 PM, Boris Zbarsky bzbar...@mit.edu wrote:
 On 5/8/13 10:45 AM, Gordon P. Hemsley wrote:

 I still think @download takes priority.

 The Content-Disposition header says, "Never mind what filename the URL
 shows; this is really file B.txt."

 The @download attribute says, "Never mind what filename this link would
 normally be; let's just consider it A.txt."


 OK, that's at least a reasonable argument for the behavior.  ;)


 That seems like quite a sophisticated attack that relies on a lot of
 things falling into place all at once.


 Uh... yes.  Like most browser exploits.

Perhaps. But maybe I'm not clear on what exactly the alternate
proposal is. Are you suggesting not supporting the @download
attribute? Or just ignoring it when Content-Disposition specifies a
filename? (I would suggest that neither is the appropriate response.)

 Then I think it is the responsibility of the UA to sniff the file and
 protect the user from such attempts to mislead.


 This is not trivial, since sniffing can easily fail on files that are both
 HTML and png or both HTML and exe at the same time.  There's a good bit of
 research on things like this.

Yes, and that research has already gone into creating the mimesniff
standard, has it not? I'm suggesting use the existing algorithm(s) in
an additional arena, not creating a new, separate algorithm.

If a file from an image sharing site is served as (or determined to
be, via the sniffing algorithms) image/png, for example, then the UA
should suggest a filename with a .png extension, ignoring any
suggestion by the author for a .exe extension. (Whether you want to
change it to A.png or A.exe.png is debatable, I suppose.)
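
As a rough sketch of that suggestion, in Python: the type-to-extension
table and the function name are hypothetical, and the `replace` flag only
exists to show both of the debatable outcomes (A.png versus A.exe.png).

```python
import posixpath

# Tiny illustrative table; a real UA would consult a much larger registry.
EXTENSION_FOR_TYPE = {
    "image/png": ".png",
    "image/jpeg": ".jpg",
    "text/html": ".html",
    "application/pdf": ".pdf",
}


def adjust_extension(suggested: str, mime_type: str, replace: bool = True) -> str:
    """Make the suggested filename's extension agree with the MIME type."""
    expected = EXTENSION_FOR_TYPE.get(mime_type)
    if expected is None:
        # Unknown type: this sketch leaves the author's suggestion alone.
        return suggested
    stem, ext = posixpath.splitext(suggested)
    if ext.lower() == expected:
        return suggested
    # Either swap the extension (A.png) or append the right one (A.exe.png).
    return stem + expected if replace else suggested + expected
```

For a resource served as image/png with download="A.exe", the default gives
A.png and `replace=False` gives A.exe.png.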

 I'm not sure I have the resources to do extensive real-world testing
 of this (and that documentation suggests it has been superseded in
 more modern OSes), but I don't think it would be unreasonable for the
 UA to override or augment the filename suggested by the @download
 attribute if it determines that it would not be in the best interest
 of the user to use the suggested filename unchanged.


 Phrased that way, using the Content-Disposition filename is a perfectly
 valid "override if not in the best interest of the user" behavior, fwiw.

 -Boris


True. But doesn't that imply a rejection of my aforementioned
reasonable argument?

--
Gordon P. Hemsley
m...@gphemsley.org
http://gphemsley.org/ • http://gphemsley.org/blog/


Re: [whatwg] Priority between a download and content-disposition

2013-05-08 Thread Boris Zbarsky

On 5/8/13 12:15 PM, Gordon P. Hemsley wrote:

Perhaps. But maybe I'm not clear on what exactly the alternate
proposal is. Are you suggesting not supporting the @download
attribute? Or just ignoring it when Content-Disposition specifies a
filename? (I would suggest that neither is the appropriate response.)


What Gecko implements right now is:

1)  @download is ignored for non-same-origin links.
2)  If Content-Disposition specifies a filename, that filename is used
no matter what @download says.
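
Those two rules can be sketched roughly as follows, in Python. This is an
illustration of the rules as stated here, not Gecko's code: the function
names are made up, only http(s) URLs are handled (not data: or blob:), and
only the filename choice is modeled.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> Tuple[str, Optional[str], Optional[int]]:
    """Return (scheme, host, port), filling in default http(s) ports."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def filename_per_stated_rules(page_url: str,
                              link_url: str,
                              download_attr: Optional[str],
                              content_disposition_filename: Optional[str]) -> Optional[str]:
    # Rule 1: @download is ignored for non-same-origin links.
    if origin(page_url) != origin(link_url):
        download_attr = None
    # Rule 2: a Content-Disposition filename wins regardless of @download.
    if content_disposition_filename:
        return content_disposition_filename
    # May be None, meaning the caller falls back to its usual default name.
    return download_attr
```

So a same-origin link with download="A.txt" and a header naming B.txt
yields B.txt, and a cross-origin link with no header yields no override at
all.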


This is not trivial, since sniffing can easily fail on files that are both
HTML and png or both HTML and exe at the same time.  There's a good bit of
research on things like this.


Yes, and that research has already gone into creating the mimesniff
standard, has it not? I'm suggesting use the existing algorithm(s) in
an additional arena, not creating a new, separate algorithm.


The mimesniff standard doesn't try to sniff for types UAs don't render 
natively, which is what would be needed here.



True. But doesn't that imply a rejection of my aforementioned
reasonable argument?


Yes, it does.  "Reasonable" means it's reasonable, not that it overrides
all other considerations...


-Boris



Re: [whatwg] Priority between a download and content-disposition

2013-05-08 Thread Gordon P. Hemsley
On Wed, May 8, 2013 at 12:21 PM, Boris Zbarsky bzbar...@mit.edu wrote:
 On 5/8/13 12:15 PM, Gordon P. Hemsley wrote:

 Perhaps. But maybe I'm not clear on what exactly the alternate
 proposal is. Are you suggesting not supporting the @download
 attribute? Or just ignoring it when Content-Disposition specifies a
 filename? (I would suggest that neither is the appropriate response.)


 What Gecko implements right now is:

 1)  @download is ignored for non-same-origin links.
 2)  If Content-Disposition specifies a filename, that filename is used
 no matter what @download says.

I understand now the motivation for this, but I would think that it
would remove a lot of the usefulness of the @download attribute: If
you have the same origin, you probably already have access to (a) name
the file appropriately in the first place, or (b) set the
Content-Disposition header to send the appropriate filename. No?

 This is not trivial, since sniffing can easily fail on files that are
 both HTML and png or both HTML and exe at the same time.  There's a
 good bit of research on things like this.


 Yes, and that research has already gone into creating the mimesniff
 standard, has it not? I'm suggesting use the existing algorithm(s) in
 an additional arena, not creating a new, separate algorithm.


 The mimesniff standard doesn't try to sniff for types UAs don't render
 natively, which is what would be needed here.

I'm not so sure about that, but I'll leave it to someone else to
argue. (If you determine a file to be a PNG, then you suggest a .png
extension, regardless of whether there might be an embedded
executable; if you don't support the file format, then how do you know
that it isn't supposed to be an executable in the first place? —and
what is it doing on the Web?)

--
Gordon P. Hemsley
m...@gphemsley.org
http://gphemsley.org/ • http://gphemsley.org/blog/


Re: [whatwg] Priority between a download and content-disposition

2013-05-08 Thread Boris Zbarsky

On 5/8/13 12:37 PM, Gordon P. Hemsley wrote:

I understand now the motivation for this, but I would think that it
would remove a lot of the usefulness of the @download attribute


You're right, but we haven't found another mitigation for our security 
concerns.



If you have the same origin, you probably already have access to (a) name
the file appropriately in the first place, or (b) set the
Content-Disposition header to send the appropriate filename. No?


For files, not for things like data: and blob:, which were the primary 
motivation for @download.


That said, there are lots of cases in which someone can upload files but 
not pick the filename on the server or control the headers...



I'm not so sure about that, but I'll leave it to someone else to
argue. (If you determine a file to be a PNG, then you suggest a .png
extension, regardless of whether there might be an embedded
executable; if you don't support the file format, then how do you know
that it isn't supposed to be an executable in the first place? —and
what is it doing on the Web?)


I assume that last question is a joke, yes?  ;)

-Boris


Re: [whatwg] Notifications: reviving Notification objects

2013-05-08 Thread Anne van Kesteren
On Sun, Mar 31, 2013 at 7:33 AM, Anne van Kesteren ann...@annevk.nl wrote:
 2) Define a method on Navigator, getNotifications(), that returns a
 Future which is resolved with an array of Notification objects. Once
 the Future is resolved, a task is queued to fire a click event on the
 appropriate Notification object in case of B) and C).

I guess if we're getting system messages (as suggested on
public-script-coord), that could also be used here and less likely to
create timing issues.


 There are some further gotchas here. One is how Notifications objects
 should be scoped (and thus what exactly getNotifications()'s Future is
 resolved with). Origin-scoped would be extremely convenient, but it
 seems we still have efforts that support further scoping for
 applications based on the URL path.

I think we should go with origin-scoped until documents get some way
to associate themselves with an app concept. Maybe the manifest idea
that is floating around?


 The other is how much we need to expose on the Notification object
 itself so sites can identify it after reviving it. Currently almost
 nothing is exposed, but maybe we should simply expose
 dir/lang/title/body/tag/icon on it.

Opinions?


--
http://annevankesteren.nl/


Re: [whatwg] API to delay the document load event (continued)

2013-05-08 Thread James Burke
On Mon, May 6, 2013 at 2:17 PM, David Bruant bruan...@gmail.com wrote:
 Le 06/05/2013 21:35, James Burke a écrit :
 Just going on my experience (admittedly a limited data set): anything
 that actually binds to document load really wants to know when all
 resources loaded (images/iframes) and page is considered complete,
 which fits with the motivations of this new capability.

 An app could be considered complete before the UA load event (hidden iframe
 hasn't finished loading, below-the-fold images haven't fully loaded yet,
 etc.)
 Delaying the load event doesn't take that into account.

That is fine. If they wanted to signal complete for those purposes,
they could delay that other work until the load event fires.

 If the concern is about an async script erroring out between the
 paired calls/addition and removal of an attribute, then perhaps any
 uncaught error ends up triggering the same behavior that occurs now
 when there is an error during onload determination.

 In case a component fails to notice it's ready, having the app readiness
 event separated from the UA load event would allow outsiders to use the UA
 load as fallback (which is the current best approximation).

load could fail with a long requested image too. I don't see this
as a strong argument for a separate event.

I'm not opposed to a different event, but I also do not feel like
there have been strong cases that point to it being a separate event
either.

By using load, I expect most code that is outside the app would not
need to change, as the observers of this state are likely already
using 'load', and it fits in with the definition of 'load'. It is just
that the platform cannot expect to detect the state on its own now
given async startup approaches and JS-generated HTML.

If there is really a case that falls down by delaying the load event
it would be good to know. That would point strongly to using a
different event.

James


Re: [whatwg] API to delay the document load event (continued)

2013-05-08 Thread James Burke
On Tue, May 7, 2013 at 3:12 PM, Bjoern Hoehrmann derhoe...@gmx.net wrote:
 * James Burke wrote:
I just joined the mailing list, so I apologize for not continuing the
existing thread started here:

http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2013-April/039422.html

Disclaimer: I submitted the Mozilla Bugzilla ticket for some kind of
capability in this area.

Summarizing previous discussion points:

 I think it would be helpful if you could phrase these in terms of what
 various implementations should do. For instance, Google shows
 screenshots in search results. How should their "take a snapshot" bot
 work? I maintain IECapt and CutyCapt; how should they be changed to
 support the feature being proposed here? Same question for the Firefox
 and Firefox OS features that motivate creating a new feature here.

If this feature uses the 'load' event, as long as those tools listen
to 'load', that is enough: when they receive that event, the page
should be rendered in the state the web site developer wanted.

If a use case is identified that makes it difficult to use the
existing load event, then the tools would need to listen to a new
event. They would also need to inspect the document state (maybe by
checking for a loading attribute on the documentElement?) to know if
this new event is in play.

 Similarly, it would be helpful to approach the problem from the
 perspective of content creators. Let's say you have a website, and any
 new visitor gets to see an overlay that encourages them to sign up with
 Acme. When would this site signal that it is ready for the purposes
 I've mentioned? And would all of the implementations cited above wait
 for this signal?

If the load event approach was used, the web site would:

* call document.delayLoadEvent() during JS execution (needs to happen
before the browser would normally trigger the load event).
* Once the Acme overlay DOM element was inserted, call
document.stopDelayingLoadEvent()
* The platform waits for any images/resources for the current DOM to
finish loading. It then fires the normal document load event.

For tools that are listening:

* Just listen for the load event for the document.
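
To make the ordering concrete, here is a small Python model of those steps.
It is only a sketch: no browser implements delayLoadEvent() or
stopDelayingLoadEvent(), and the class, the delay counter, and the
resource bookkeeping are assumptions about how paired calls might behave.

```python
class LoadGate:
    """Toy model of a document whose load event can be delayed."""

    def __init__(self):
        self.delay_count = 0        # outstanding delayLoadEvent() calls
        self.pending_resources = 0  # images/iframes still loading
        self.parsed = False         # the point where load would normally fire
        self.load_fired = False
        self.listeners = []

    def delay_load_event(self):
        if self.load_fired:
            raise RuntimeError("too late: load has already fired")
        self.delay_count += 1

    def stop_delaying_load_event(self):
        if self.delay_count == 0:
            raise RuntimeError("unpaired stopDelayingLoadEvent()")
        self.delay_count -= 1
        self._maybe_fire()

    def resource_started(self):
        self.pending_resources += 1

    def resource_finished(self):
        self.pending_resources -= 1
        self._maybe_fire()

    def finish_parsing(self):
        self.parsed = True
        self._maybe_fire()

    def _maybe_fire(self):
        # Fire once, and only when nothing is holding the event back.
        if self.load_fired:
            return
        if self.parsed and self.delay_count == 0 and self.pending_resources == 0:
            self.load_fired = True
            for callback in self.listeners:
                callback()
```

A tool would just append a callback to `listeners`; the site calls
`delay_load_event()` early, inserts its overlay (starting any image loads),
then calls `stop_delaying_load_event()`, and the callback runs only after
the overlay's resources finish.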

---

If a separate event/trigger instead of load is used, then I am not
exactly sure what is needed. One guess:

For the web site:

* stamp the DOM in some way to indicate this new event would be fired,
perhaps add the loading attribute to the documentElement.
* insert the Acme overlay DOM element, then remove the attribute.
* The platform waits for any images/resources for the current DOM to
finish loading (?) then fire appload event?

For tools that are listening:

* Inspect the state of the page, looking to see if there is a
loading attribute in play. Listen for both load and appload, and
if the loading attribute was detected, wait for the appload event.

In fact, the tool needs to discount load if the page detects
appload is in play as it may fire. (So, in response to one of
David's comments, trying to fallback to load would not be
feasible/detectable).
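
The tool-side decision in this guess could look roughly like this in
Python. The "loading" attribute and the "appload" event are the
hypothetical names used above, and the function names are invented for the
sketch.

```python
from typing import Iterable


def event_to_wait_for(has_loading_attribute: bool) -> str:
    """If the page opted in via the attribute, 'load' must be discounted."""
    return "appload" if has_loading_attribute else "load"


def is_ready(has_loading_attribute: bool, events_seen: Iterable[str]) -> bool:
    """True once the event the tool should trust has been observed."""
    return event_to_wait_for(has_loading_attribute) in set(events_seen)
```

Note that `is_ready(True, ["load"])` is False: once the attribute is seen,
an early 'load' tells the tool nothing.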

James


[whatwg] HTML5 is broken: menuitem causes infinite loop

2013-05-08 Thread Michael Day
Hilarious spec bug of the week: HTML5 requires implementations to loop
indefinitely if they see a <menuitem> start tag.


12.2.5.4.7 "in body" insertion mode
 => see a <menuitem> start tag, process using rules for "in head"

12.2.5.4.4 "in head" insertion mode
 => see <menuitem>, act as if </head> and reprocess

12.2.5.4.6 "after head" insertion mode
 => see <menuitem>, act as if <body> and reprocess

...and we're back at "in body" insertion mode, and will continue to
bounce around with the <menuitem> start tag token making absolutely no
progress whatsoever.
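
The bounce can be traced with a few lines of Python. This is a toy model of
only the three rules quoted above, not a tree builder; the table and the
function name are made up for the illustration, and the step cap exists
solely so the trace terminates.

```python
# Where the <menuitem> start tag is sent from each insertion mode,
# per the three rules quoted above.
NEXT_MODE = {
    "in body": "in head",     # process using the rules for "in head"
    "in head": "after head",  # act as if </head>, then reprocess
    "after head": "in body",  # act as if <body>, then reprocess
}


def trace_menuitem(start_mode: str = "in body", max_steps: int = 6) -> list:
    """Record which mode sees the token at each step; it is never consumed."""
    visited = []
    mode = start_mode
    for _ in range(max_steps):
        visited.append(mode)
        mode = NEXT_MODE[mode]
    return visited


print(trace_menuitem())
```

Every third step lands back on "in body" with the same token still
unprocessed.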


What is the <menuitem> tag supposed to be, anyway? A test to ensure that
implementers are awake, like the </sarcasm> close tag?


Cheers,

Michael