Re: [whatwg] Priority between a download and content-disposition
On Tue, May 7, 2013 at 10:18 PM, Boris Zbarsky bzbar...@mit.edu wrote:
> On 5/7/13 5:54 PM, Gordon P. Hemsley wrote:
>> A @download attribute with a value would override both factors, like so:
>> (1) Download it.
>> (2) A.txt
>
> Why? You say this as if it were obvious, but it's not obvious to me at all... What's the reasoning that makes this the desirable behavior?

It's not clear to me which of the two factors you take issue with. Here's what the spec says:

  The download attribute, if present, indicates that the author intends the hyperlink to be used for downloading a resource. The attribute may have a value; the value, if any, specifies the default file name that the author recommends for use in labeling the resource in a local file system.

I interpret that first sentence to mean that the file should be downloaded (disposition type = attachment) rather than displayed (disposition type = inline). The second sentence very clearly suggests that A.txt would be the filename presented to the user by default in the save dialog.

>> I don't see what the security concerns might be: There is no difference here than what is already available
>
> There is if you allow cross-origin @download. There is if you allow untrusted markup on your server and don't sanitize away @download (should it be sanitized away? Unclear).

I'm still not seeing what the problem is. All this does is make the browser treat the link as if the user followed it and then went File > Save Page As. What are the security concerns, cross-origin or otherwise?

>> AFAICT, there are no content sniffing or cross-domain issues at play.
>
> But there are; see above.

Well, what I should have said is, there is no content sniffing beyond what is already done for regular page saves. (The UI can show the MIME type or format of the file in the download box, as it would for any file it doesn't handle natively.)

>> results when saving a file; they don't do any file extension vs. file format checking.
>
> Uh... that depends on exactly how you save and your OS. Browsers commonly do file extension vs MIME type checking on Windows. Behavior on other OSes varies, and varies across browsers.
>
> -Boris

Ah, I admit, I'm a bit biased towards Mac in that regard. It's been a while since I used Windows. But I'd be surprised to find out that the browser (Firefox, in the case I have in mind) changes the extension in the suggested filename (e.g. example.php for an HTML file) on Windows but not on Mac, and I would argue that that perhaps should not be the case.

--
Gordon P. Hemsley
m...@gphemsley.org
http://gphemsley.org/ • http://gphemsley.org/blog/
Re: [whatwg] Priority between a download and content-disposition
On 5/8/13 6:53 AM, Gordon P. Hemsley wrote:
> It's not clear to me which of the two factors you take issue with.

The question of which filename takes priority.

> I interpret that first sentence to mean that the file should be downloaded (disposition type = attachment) rather than displayed

Yes.

> The second sentence very clearly suggests that A.txt would be the filename presented to the user by default in the save dialog.

No, it suggests that A.txt is what the page author recommends. If, at the same time, B.txt is what the server author recommends, what should happen?

>> There is if you allow cross-origin @download. There is if you allow untrusted markup on your server and don't sanitize away @download (should it be sanitized away? Unclear).
>
> I'm still not seeing what the problem is. All this does is make the browser treat the link as if the user followed it and then went File > Save Page As

No, because in that case the browser will definitely use the Content-Disposition filename, not the one from @download.

> What are the security concerns, cross-origin or otherwise?

One concern is being able to do this:

  <a download="known-location.pdf" href="http://some-bank/statement.pdf">

cross-site and combining it with something that lets you read known-location.pdf (e.g. a file://-specific privacy hole that only applies to some filenames, or an <input type=file> that the user has already filled in).

Another concern is if you upload a file to an image-sharing site, but it happens to be a Windows executable. Then you link to it with:

  <a download="something.exe" href="http://image-sharing-site/whatever">

and wait for the user to download and double-click. This relies on the user thinking the file came from image-sharing-site so must be an image. UAs may do mitigations here by changing the suggested filename, of course.

Generally, allowing this sort of thing opens up several new phishing and social engineering attack vectors, and it's not clear that we want that.

> Well, what I should have said is, there is no content sniffing beyond what is already done for regular page saves. (The UI can show the MIME type or format of the file in the download box, as it would for any file it doesn't handle natively.)

It can, and users routinely ignore that.

> Ah, I admit, I'm a bit biased towards Mac in that regard. It's been a while since I used Windows. But I'd be surprised to find out that the browser (Firefox, in the case I have in mind) changes the extension in the suggested filename (e.g. example.php for an HTML file) on Windows but not on Mac

It sure used to in some cases, partially in concert with the Windows filepicker. See the (scant) documentation for lpstrDefExt at http://msdn.microsoft.com/en-us/library/windows/desktop/ms646839%28v=vs.85%29.aspx and I suggest actually doing some experimentation across the different save variants (save image, save link as, save page as, click on something with Content-Disposition: attachment) on several OSes to see the behavior.

There is certainly a good bit of code in the various file-saving codepaths in Firefox that attempts to ensure extensions match MIME types, to forbid saving things with certain extensions, etc.

Also note that Chrome will change extensions on at least @download filenames to match the MIME type; I haven't experimented in detail with its behavior for other cases. And I haven't experimented much with other browsers in this area, though I expect all have some interesting behavior.

-Boris
Re: [whatwg] Priority between a download and content-disposition
On Wed, May 8, 2013 at 9:43 AM, Boris Zbarsky bzbar...@mit.edu wrote:
> On 5/8/13 6:53 AM, Gordon P. Hemsley wrote:
>> It's not clear to me which of the two factors you take issue with.
>
> The question of which filename takes priority.
>
>> The second sentence very clearly suggests that A.txt would be the filename presented to the user by default in the save dialog.
>
> No, it suggests that A.txt is what the page author recommends. If, at the same time, B.txt is what the server author recommends, what should happen?

I still think @download takes priority. The Content-Disposition header says, "Nevermind what filename the URL shows; this is really file B.txt." The @download attribute says, "Nevermind what filename this link would normally be; let's just consider it A.txt."

>>> There is if you allow cross-origin @download. There is if you allow untrusted markup on your server and don't sanitize away @download (should it be sanitized away? Unclear).
>>
>> I'm still not seeing what the problem is. All this does is make the browser treat the link as if the user followed it and then went File > Save Page As
>
> No, because in that case the browser will definitely use the Content-Disposition filename, not the one from @download.

OK, technically, the way I phrased it, yes. But what I meant was that it rolls a bunch of steps into one, telling the browser that the link should be downloaded and named per suggestion.

>> What are the security concerns, cross-origin or otherwise?
>
> One concern is being able to do this:
>
>   <a download="known-location.pdf" href="http://some-bank/statement.pdf">
>
> cross-site and combining it with something that lets you read known-location.pdf (e.g. a file://-specific privacy hole that only applies to some filenames, or an <input type=file> that the user has already filled in).

That seems like quite a sophisticated attack that relies on a lot of things falling into place all at once. I'm not sure that should block the use of the attribute in and of itself.

> Another concern is if you upload a file to an image-sharing site, but it happens to be a Windows executable. Then you link to it with:
>
>   <a download="something.exe" href="http://image-sharing-site/whatever">
>
> and wait for the user to download and double-click. This relies on the user thinking the file came from image-sharing-site so must be an image. UAs may do mitigations here by changing the suggested filename, of course.

Then I think it is the responsibility of the UA to sniff the file and protect the user from such attempts to mislead. At the very least, the download UI could specify the actual type of the file that is being downloaded. (More on how to protect users who don't read that below.)

> Generally, allowing this sort of thing opens up several new phishing and social engineering attack vectors, and it's not clear that we want that.

There is a price to freedom, as they say. We shouldn't let a few rotten apples spoil the whole bunch.

>> Well, what I should have said is, there is no content sniffing beyond what is already done for regular page saves. (The UI can show the MIME type or format of the file in the download box, as it would for any file it doesn't handle natively.)
>
> It can, and users routinely ignore that.
>
>> Ah, I admit, I'm a bit biased towards Mac in that regard. It's been a while since I used Windows. But I'd be surprised to find out that the browser (Firefox, in the case I have in mind) changes the extension in the suggested filename (e.g. example.php for an HTML file) on Windows but not on Mac
>
> It sure used to in some cases, partially in concert with the Windows filepicker. See the (scant) documentation for lpstrDefExt at http://msdn.microsoft.com/en-us/library/windows/desktop/ms646839%28v=vs.85%29.aspx and I suggest actually doing some experimentation across the different save variants (save image, save link as, save page as, click on something with Content-Disposition: attachment) on several OSes to see the behavior.
>
> There is certainly a good bit of code in the various file-saving codepaths in Firefox that attempts to ensure extensions match MIME types, to forbid saving things with certain extensions, etc.
>
> Also note that Chrome will change extensions on at least @download filenames to match the MIME type; I haven't experimented in detail with its behavior for other cases. And I haven't experimented much with other browsers in this area, though I expect all have some interesting behavior.
>
> -Boris

I'm not sure I have the resources to do extensive real-world testing of this (and that documentation suggests it has been superseded in more modern OSes), but I don't think it would be unreasonable for the UA to override or augment the filename suggested by the @download attribute if it determines that it would not be in the best interest of the user to use the suggested filename unchanged.

Note that the spec also says: There are no restrictions on allowed values, but authors are
Re: [whatwg] Priority between a download and content-disposition
On 5/8/13 10:45 AM, Gordon P. Hemsley wrote:
> I still think @download takes priority. The Content-Disposition header says, "Nevermind what filename the URL shows; this is really file B.txt." The @download attribute says, "Nevermind what filename this link would normally be; let's just consider it A.txt."

OK, that's at least a reasonable argument for the behavior. ;)

> OK, technically, the way I phrased it, yes. But what I meant was that it rolls a bunch of steps into one, telling the browser that the link should be downloaded and named per suggestion.

Yes, but the key is _who_ is making the suggestion and why.

> That seems like quite a sophisticated attack that relies on a lot of things falling into place all at once.

Uh... yes. Like most browser exploits.

> Then I think it is the responsibility of the UA to sniff the file and protect the user from such attempts to mislead.

This is not trivial, since sniffing can easily fail on files that are both HTML and png or both HTML and exe at the same time. There's a good bit of research on things like this.

> There is a price to freedom, as they say. We shouldn't let a few rotten apples spoil the whole bunch.

If it's going to open users to exploits we do it all the time.

> I'm not sure I have the resources to do extensive real-world testing of this (and that documentation suggests it has been superseded in more modern OSes), but I don't think it would be unreasonable for the UA to override or augment the filename suggested by the @download attribute if it determines that it would not be in the best interest of the user to use the suggested filename unchanged.

Phrased that way, using the Content-Disposition filename is a perfectly valid "override if not in the best interest of the user" behavior, fwiw.

-Boris
Re: [whatwg] Priority between a download and content-disposition
On Wed, May 8, 2013 at 12:01 PM, Boris Zbarsky bzbar...@mit.edu wrote:
> On 5/8/13 10:45 AM, Gordon P. Hemsley wrote:
>> I still think @download takes priority. The Content-Disposition header says, "Nevermind what filename the URL shows; this is really file B.txt." The @download attribute says, "Nevermind what filename this link would normally be; let's just consider it A.txt."
>
> OK, that's at least a reasonable argument for the behavior. ;)
>
>> That seems like quite a sophisticated attack that relies on a lot of things falling into place all at once.
>
> Uh... yes. Like most browser exploits.

Perhaps. But maybe I'm not clear on what exactly the alternate proposal is. Are you suggesting not supporting the @download attribute? Or just ignoring it when Content-Disposition specifies a filename? (I would suggest that neither is the appropriate response.)

>> Then I think it is the responsibility of the UA to sniff the file and protect the user from such attempts to mislead.
>
> This is not trivial, since sniffing can easily fail on files that are both HTML and png or both HTML and exe at the same time. There's a good bit of research on things like this.

Yes, and that research has already gone into creating the mimesniff standard, has it not? I'm suggesting using the existing algorithm(s) in an additional arena, not creating a new, separate algorithm.

If a file from an image sharing site is served as (or determined to be, via the sniffing algorithms) image/png, for example, then the UA should suggest a filename with a .png extension, ignoring any suggestion by the author for a .exe extension. (Whether you want to change it to A.png or A.exe.png is debatable, I suppose.)

>> I'm not sure I have the resources to do extensive real-world testing of this (and that documentation suggests it has been superseded in more modern OSes), but I don't think it would be unreasonable for the UA to override or augment the filename suggested by the @download attribute if it determines that it would not be in the best interest of the user to use the suggested filename unchanged.
>
> Phrased that way, using the Content-Disposition filename is a perfectly valid "override if not in the best interest of the user" behavior, fwiw.
>
> -Boris

True. But doesn't that imply a rejection of my aforementioned "reasonable argument"?

--
Gordon P. Hemsley
m...@gphemsley.org
http://gphemsley.org/ • http://gphemsley.org/blog/
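
[Editorial sketch, not part of the original message.] The extension-matching idea in the message above (an author-suggested A.exe for a resource determined to be image/png becoming A.png or A.exe.png) could look roughly like the following. The function name, the `append` flag, and the tiny MIME table are all hypothetical; no browser is claimed to work exactly this way.

```python
import os

# Hypothetical, deliberately tiny MIME-to-extension table. A real UA would
# rely on a much larger, platform-specific mapping.
MIME_EXTENSIONS = {
    "image/png": ".png",
    "image/jpeg": ".jpg",
    "text/html": ".html",
    "application/pdf": ".pdf",
}


def adjust_suggested_name(suggested: str, mime_type: str, append: bool = False) -> str:
    """Return a filename whose extension agrees with mime_type.

    append=False replaces the author's extension (A.exe -> A.png).
    append=True keeps it and adds the matching one (A.exe -> A.exe.png).
    Unknown MIME types leave the suggestion untouched.
    """
    wanted = MIME_EXTENSIONS.get(mime_type.lower())
    if wanted is None:
        return suggested
    stem, ext = os.path.splitext(suggested)
    if ext.lower() == wanted:
        return suggested  # extension already agrees with the type
    if append:
        return suggested + wanted
    return stem + wanted
```

Both variants from the message are reachable: `adjust_suggested_name("A.exe", "image/png")` takes the replace path and `adjust_suggested_name("A.exe", "image/png", append=True)` takes the append path. Which of the two is preferable is exactly the point the message leaves open.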
Re: [whatwg] Priority between a download and content-disposition
On 5/8/13 12:15 PM, Gordon P. Hemsley wrote:
> Perhaps. But maybe I'm not clear on what exactly the alternate proposal is. Are you suggesting not supporting the @download attribute? Or just ignoring it when Content-Disposition specifies a filename? (I would suggest that neither is the appropriate response.)

What Gecko implements right now is:

1) @download is ignored for non-same-origin links.
2) If Content-Disposition specifies a filename, that filename is used no matter what @download says.

>> This is not trivial, since sniffing can easily fail on files that are both HTML and png or both HTML and exe at the same time. There's a good bit of research on things like this.
>
> Yes, and that research has already gone into creating the mimesniff standard, has it not? I'm suggesting using the existing algorithm(s) in an additional arena, not creating a new, separate algorithm.

The mimesniff standard doesn't try to sniff for types UAs don't render natively, which is what would be needed here.

> True. But doesn't that imply a rejection of my aforementioned "reasonable argument"?

Yes, it does. "Reasonable" means it's reasonable, not that it overrides all other considerations...

-Boris
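
[Editorial sketch, not part of the original message.] The two Gecko rules described above can be written down as a small decision function. This is a simplified model under stated assumptions: `origin` and `pick_filename` are hypothetical names, the origin comparison handles only ordinary http(s) URLs (not data: or blob:), and default ports are not normalized.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit


def origin(url: str) -> Tuple[str, Optional[str], Optional[int]]:
    """Simplified origin: (scheme, host, explicit port). Assumes http(s) URLs."""
    parts = urlsplit(url)
    return (parts.scheme, parts.hostname, parts.port)


def pick_filename(page_url: str, link_url: str,
                  download_attr: Optional[str],
                  content_disposition_filename: Optional[str]) -> Optional[str]:
    """Return the suggested filename, or None to fall back to the URL-derived name."""
    # Rule 2: a Content-Disposition filename wins no matter what @download says.
    if content_disposition_filename:
        return content_disposition_filename
    # Rule 1: @download is only honored for same-origin links.
    if download_attr and origin(page_url) == origin(link_url):
        return download_attr
    return None
```

Under this model, the cross-site `known-location.pdf` example from earlier in the thread yields no author-chosen name at all, which is the mitigation being described.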
Re: [whatwg] Priority between a download and content-disposition
On Wed, May 8, 2013 at 12:21 PM, Boris Zbarsky bzbar...@mit.edu wrote:
> On 5/8/13 12:15 PM, Gordon P. Hemsley wrote:
>> Perhaps. But maybe I'm not clear on what exactly the alternate proposal is. Are you suggesting not supporting the @download attribute? Or just ignoring it when Content-Disposition specifies a filename? (I would suggest that neither is the appropriate response.)
>
> What Gecko implements right now is:
>
> 1) @download is ignored for non-same-origin links.
> 2) If Content-Disposition specifies a filename, that filename is used no matter what @download says.

I understand now the motivation for this, but I would think that it would remove a lot of the usefulness of the @download attribute: If you have the same origin, you probably already have access to (a) name the file appropriately in the first place, or (b) set the Content-Disposition header to send the appropriate filename. No?

>>> This is not trivial, since sniffing can easily fail on files that are both HTML and png or both HTML and exe at the same time. There's a good bit of research on things like this.
>>
>> Yes, and that research has already gone into creating the mimesniff standard, has it not? I'm suggesting using the existing algorithm(s) in an additional arena, not creating a new, separate algorithm.
>
> The mimesniff standard doesn't try to sniff for types UAs don't render natively, which is what would be needed here.

I'm not so sure about that, but I'll leave it to someone else to argue. (If you determine a file to be a PNG, then you suggest a .png extension, regardless of whether there might be an embedded executable; if you don't support the file format, then how do you know that it isn't supposed to be an executable in the first place? And what is it doing on the Web?)

--
Gordon P. Hemsley
m...@gphemsley.org
http://gphemsley.org/ • http://gphemsley.org/blog/
Re: [whatwg] Priority between a download and content-disposition
On 5/8/13 12:37 PM, Gordon P. Hemsley wrote:
> I understand now the motivation for this, but I would think that it would remove a lot of the usefulness of the @download attribute

You're right, but we haven't found another mitigation for our security concerns.

> If you have the same origin, you probably already have access to (a) name the file appropriately in the first place, or (b) set the Content-Disposition header to send the appropriate filename. No?

For files, not for things like data: and blob:, which were the primary motivation for @download. That said, there are lots of cases in which someone can upload files but not pick the filename on the server or control the headers...

> I'm not so sure about that, but I'll leave it to someone else to argue. (If you determine a file to be a PNG, then you suggest a .png extension, regardless of whether there might be an embedded executable; if you don't support the file format, then how do you know that it isn't supposed to be an executable in the first place? And what is it doing on the Web?)

I assume that last question is a joke, yes? ;)

-Boris
Re: [whatwg] Notifications: reviving Notification objects
On Sun, Mar 31, 2013 at 7:33 AM, Anne van Kesteren ann...@annevk.nl wrote:
> 2) Define a method on Navigator, getNotifications(), that returns a Future which is resolved with an array of Notification objects. Once the Future is resolved, a task is queued to fire a click event on the appropriate Notification object in case of B) and C).

I guess if we're getting system messages (as suggested on public-script-coord), that could also be used here and would be less likely to create timing issues.

There are some further gotchas here. One is how Notification objects should be scoped (and thus what exactly getNotifications()'s Future is resolved with). Origin-scoped would be extremely convenient, but it seems we still have efforts that support further scoping for applications based on the URL path. I think we should go with origin-scoped until documents get some way to associate themselves with an "app" concept. Maybe the manifest idea that is floating around?

The other is how much we need to expose on the Notification object itself so sites can identify it after reviving it. Currently almost nothing is exposed, but maybe we should simply expose dir/lang/title/body/tag/icon on it. Opinions?

--
http://annevankesteren.nl/
Re: [whatwg] API to delay the document load event (continued)
On Mon, May 6, 2013 at 2:17 PM, David Bruant bruan...@gmail.com wrote:
> On 06/05/2013 21:35, James Burke wrote:
>> Just going on my experience (admittedly a limited data set): anything that actually binds to document load really wants to know when all resources loaded (images/iframes) and page is considered complete, which fits with the motivations of this new capability.
>
> An app could be considered complete before the UA load event (hidden iframe hasn't finished loading, below-the-fold images haven't fully loaded yet, etc.) Delaying the load event doesn't take that into account.

That is fine. If they wanted to signal complete for those purposes, they could delay that other work until the load event fires.

>> If the concern is about an async script erroring out between the paired calls/addition and removal of an attribute, then perhaps any uncaught error ends up triggering the same behavior that occurs now when there is an error during onload determination.
>
> In case a component fails to notice it's ready, having the app readiness event separated from the UA load event would allow outsiders to use the UA load as fallback (which is the current best approximation).

load could fail with a long requested image too. I don't see this as a strong argument for a separate event.

I'm not opposed to a different event, but I also do not feel like there have been strong cases that point to it being a separate event either.

By using load, I expect most code that is outside the app would not need to change, as the observers of this state are likely already using 'load', and it fits in with the definition of 'load'. It is just that the platform cannot expect to detect the state on its own now, given async startup approaches and JS-generated HTML.

If there is really a case that falls down by delaying the load event it would be good to know. That would point strongly to using a different event.

James
Re: [whatwg] API to delay the document load event (continued)
On Tue, May 7, 2013 at 3:12 PM, Bjoern Hoehrmann derhoe...@gmx.net wrote:
> * James Burke wrote:
>> I just joined the mailing list, so I apologize for not continuing the existing thread started here:
>> http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2013-April/039422.html
>>
>> Disclaimer: I submitted the Mozilla Bugzilla ticket for some kind of capability in this area.
>>
>> Summarizing previous discussion points:
>
> I think it would be helpful if you could phrase these in terms of what various implementations should do. For instance, Google shows screenshots in search results. How should their "take a snapshot" bot work? I maintain IECapt and CutyCapt; how should they be changed to support the feature being proposed here? Same question for the Firefox and Firefox-OS features that motivate creating a new feature here.

If this feature uses the 'load' event, as long as those tools listen to load, that is enough: when they receive that event, the page should be rendered in the state the web site developer wanted.

If a use case is identified that makes it difficult to use the existing load event, then the tools would need to listen to a new event. They would also need to inspect the document state (maybe by checking for a loading attribute on the documentElement?) to know if this new event is in play.

> Similarly, it would be helpful to approach the problem from the perspective of content creators. Let's say you have a website, and any new visitor gets to see an overlay that encourages them to sign up with Acme. When would this site signal that it is ready for the purposes I've mentioned? And would all of the implementations cited above wait for this signal?

If the load event approach was used, the web site would:

* call document.delayLoadEvent() during JS execution (needs to happen before the browser would normally trigger the load event).
* Once the Acme overlay DOM element was inserted, call document.stopDelayingLoadEvent()
* The platform waits for any images/resources for the current DOM to finish loading. It then fires the normal document load event.

For tools that are listening:

* Just listen for the load event for the document.

---

If a separate event/trigger instead of load is used, then I am not exactly sure what is needed. One guess:

For the web site:

* stamp the DOM in some way to indicate this new event would be fired, perhaps add the loading attribute to the documentElement.
* insert the Acme overlay DOM element, then remove the attribute.
* The platform waits for any images/resources for the current DOM to finish loading (?) then fires an appload event?

For tools that are listening:

* Inspect the state of the page, looking to see if there is a loading attribute in play. Listen for both load and appload, and if the loading attribute was detected, wait for the appload event. In fact, the tool needs to discount load if it detects that appload is in play, since load may still fire. (So, in response to one of David's comments, trying to fall back to load would not be feasible/detectable.)

James
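
[Editorial sketch, not part of the original message.] The load-event variant of the proposal can be modeled as a gate that opens only once parsing is done, no resources are pending, and every delay call has been matched. The class below is a toy model written for illustration; `Document`, its method names, and the counting semantics are assumptions, since the thread does not specify whether delayLoadEvent() calls would nest.

```python
from typing import Callable, List


class Document:
    """Toy model of a document whose load event can be delayed by script."""

    def __init__(self) -> None:
        self._delays = 0            # unmatched delay_load_event() calls
        self._pending = 0           # images/resources still loading
        self._parsing_done = False
        self.load_fired = False
        self._listeners: List[Callable[[], None]] = []

    def add_load_listener(self, callback: Callable[[], None]) -> None:
        self._listeners.append(callback)

    def delay_load_event(self) -> None:
        if self.load_fired:
            raise RuntimeError("load has already fired")
        self._delays += 1

    def stop_delaying_load_event(self) -> None:
        if self._delays == 0:
            raise RuntimeError("no matching delay_load_event() call")
        self._delays -= 1
        self._maybe_fire_load()

    def resource_started(self) -> None:
        self._pending += 1

    def resource_finished(self) -> None:
        self._pending -= 1
        self._maybe_fire_load()

    def parsing_finished(self) -> None:
        self._parsing_done = True
        self._maybe_fire_load()

    def _maybe_fire_load(self) -> None:
        # Fire at most once, and only when nothing is holding the event back.
        if self.load_fired or not self._parsing_done or self._delays or self._pending:
            return
        self.load_fired = True
        for callback in self._listeners:
            callback()


def demo() -> List[str]:
    """Walk through the Acme overlay scenario from the message above."""
    doc = Document()
    events: List[str] = []
    doc.add_load_listener(lambda: events.append("load"))
    doc.delay_load_event()          # site asks for a delay during JS execution
    doc.parsing_finished()          # load would normally fire here
    doc.resource_started()          # overlay image inserted by script
    doc.stop_delaying_load_event()  # overlay is in the DOM
    doc.resource_finished()         # overlay image loaded, load fires now
    return events
```

A tool in this model only registers a load listener, which matches the "just listen for the load event" bullet; it needs no knowledge of whether a delay was requested.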
[whatwg] HTML5 is broken: menuitem causes infinite loop
Hilarious spec bug of the week: HTML5 requires implementations to loop indefinitely if they see a menuitem start tag.

12.2.5.4.7 "in body" insertion mode => see a menuitem start tag, process using rules for "in head"
12.2.5.4.4 "in head" insertion mode => see menuitem, act as if </head> and reprocess
12.2.5.4.6 "after head" insertion mode => see menuitem, act as if <body> and reprocess

...and we're back at "in body" insertion mode, and will continue to bounce around with the menuitem start tag token making absolutely no progress whatsoever.

What is the menuitem tag supposed to be, anyway? A test to ensure that implementers are awake, like the </sarcasm> close tag?

Cheers,
Michael
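
[Editorial sketch, not part of the original message.] The cycle described above can be made explicit by following the reprocessing chain until a mode repeats. The transition table below encodes only the three steps quoted in the message, treating "process using the rules for" as a hop to that mode; it is not a model of the real HTML tree builder, and `trace_menuitem` is a hypothetical helper name.

```python
from typing import Dict, List, Tuple

# Where a menuitem start tag token is sent next, per the three spec sections
# quoted in the message. Keys and values are insertion mode names.
NEXT_MODE: Dict[str, str] = {
    "in body": "in head",      # process using the rules for "in head"
    "in head": "after head",   # act as if </head> was seen, reprocess
    "after head": "in body",   # act as if <body> was seen, reprocess
}


def trace_menuitem(start_mode: str = "in body", max_steps: int = 10) -> Tuple[List[str], bool]:
    """Follow the reprocessing chain; return (modes visited, whether it loops)."""
    visited: List[str] = []
    mode = start_mode
    while mode not in visited and len(visited) < max_steps:
        visited.append(mode)
        mode = NEXT_MODE[mode]
    # If we stopped because the next mode was already visited, the token
    # has come back around without ever being consumed.
    return visited, mode in visited
```

Starting from "in body", the trace visits all three modes and then reports a loop, since none of the three steps consumes the token.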