Re: Updates to File API
On 6/23/10 9:50 AM, Jian Li wrote:
> I think encoding the security origin in the URL allows the UAs to do the security origin check in place, without routing through another authority to get the origin information, which might make the check take a long time to finish. If we worry about showing double schemes in the URL, we can transform the origin encoded in the URL by using base64 or another escaping algorithm.

Jian: the current URL scheme (http://dev.w3.org/2006/webapi/FileAPI/#url) allows you to do that, without obliging other UAs to do so. Some UAs may elect to use smart caching to accomplish the same kinds of things, without tagging the URL with origin information. Others may see benefit in origin-tagging. I've reconsidered trying to architect a scheme that allows all use-case scenarios for blob: URIs.

-- A*

> Jian
> On Wed, Jun 23, 2010 at 8:24 AM, David Levin le...@google.com wrote:
>> On Tue, Jun 22, 2010 at 8:56 PM, Adrian Bateman adria...@microsoft.com wrote:
>>> On Tuesday, June 22, 2010 8:40 PM, David Levin wrote:
>>>> I agree with you Adrian that it makes sense to let the user agent figure out the optimal way of implementing origin and other checks. A logical step from that premise is that the choice/format of the namespace specific string should be left up to the UA, as embedding information in there may be the optimal way for some UAs of implementing said checks, and it sounds like other UAs may not want to do that.
>>> Robin outlined why that would be a problem [1]. My original feeling was that this should be left up to UAs, as you say, but I've been convinced that doing so is a race to the most complex URL scheme.
>> Robin discussed something that could possibly happen in http://lists.w3.org/Archives/Public/public-webapps/2009OctDec/0743.html.
>>
>> At the same time, there are implementors who gave specific reasons why encoding certain information (scheme, host, port) in the namespace specific string (NSS) is useful to various UAs. No other information has been requested, so theories about adding more information seem premature.
>>
>> If the format must be specified, it seems reasonable to take both the theoretical and practical issues into account. Encoding the security origin in the NSS isn't complex. If a proposal is needed about how that can be done in a simple way, I'm willing to supply one. Also, UAs that don't care about that information are free to ignore it and don't need to parse it.
>>
>> dave
Re: Updates to File API
On Tue, Jun 22, 2010 at 8:56 PM, Adrian Bateman adria...@microsoft.com wrote:
> On Tuesday, June 22, 2010 8:40 PM, David Levin wrote:
>> I agree with you Adrian that it makes sense to let the user agent figure out the optimal way of implementing origin and other checks. A logical step from that premise is that the choice/format of the namespace specific string should be left up to the UA, as embedding information in there may be the optimal way for some UAs of implementing said checks, and it sounds like other UAs may not want to do that.
> Robin outlined why that would be a problem [1]. My original feeling was that this should be left up to UAs, as you say, but I've been convinced that doing so is a race to the most complex URL scheme.

Robin discussed something that could possibly happen in http://lists.w3.org/Archives/Public/public-webapps/2009OctDec/0743.html.

At the same time, there are implementors who gave specific reasons why encoding certain information (scheme, host, port) in the namespace specific string (NSS) is useful to various UAs. No other information has been requested, so theories about adding more information seem premature.

If the format must be specified, it seems reasonable to take both the theoretical and practical issues into account. Encoding the security origin in the NSS isn't complex. If a proposal is needed about how that can be done in a simple way, I'm willing to supply one. Also, UAs that don't care about that information are free to ignore it and don't need to parse it.

dave
Re: Updates to File API
I think encoding the security origin in the URL allows the UAs to do the security origin check in place, without routing through another authority to get the origin information, which might make the check take a long time to finish. If we worry about showing double schemes in the URL, we can transform the origin encoded in the URL by using base64 or another escaping algorithm.

Jian

On Wed, Jun 23, 2010 at 8:24 AM, David Levin le...@google.com wrote:
> On Tue, Jun 22, 2010 at 8:56 PM, Adrian Bateman adria...@microsoft.com wrote:
>> On Tuesday, June 22, 2010 8:40 PM, David Levin wrote:
>>> I agree with you Adrian that it makes sense to let the user agent figure out the optimal way of implementing origin and other checks. A logical step from that premise is that the choice/format of the namespace specific string should be left up to the UA, as embedding information in there may be the optimal way for some UAs of implementing said checks, and it sounds like other UAs may not want to do that.
>> Robin outlined why that would be a problem [1]. My original feeling was that this should be left up to UAs, as you say, but I've been convinced that doing so is a race to the most complex URL scheme.
> Robin discussed something that could possibly happen in http://lists.w3.org/Archives/Public/public-webapps/2009OctDec/0743.html.
>
> At the same time, there are implementors who gave specific reasons why encoding certain information (scheme, host, port) in the namespace specific string (NSS) is useful to various UAs. No other information has been requested, so theories about adding more information seem premature.
>
> If the format must be specified, it seems reasonable to take both the theoretical and practical issues into account. Encoding the security origin in the NSS isn't complex. If a proposal is needed about how that can be done in a simple way, I'm willing to supply one. Also, UAs that don't care about that information are free to ignore it and don't need to parse it.
>
> dave
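
A minimal sketch of the "transform the origin with base64" idea, to make the in-place check concrete. The URL layout `blobdata:<base64url(origin)>/<uuid>` and the function names are assumptions made for illustration; nothing in the thread or the spec draft fixes this format.

```python
import base64
import uuid


def make_blob_url(origin: str) -> str:
    """Build a hypothetical origin-tagged URL: blobdata:<base64url(origin)>/<uuid>."""
    # base64url never emits ':' or '/', so no second scheme is visible in the URL.
    tag = base64.urlsafe_b64encode(origin.encode("utf-8")).decode("ascii").rstrip("=")
    return f"blobdata:{tag}/{uuid.uuid4()}"


def origin_of(blob_url: str) -> str:
    """Recover the origin from the URL text alone, with no lookup elsewhere."""
    scheme, _, rest = blob_url.partition(":")
    if scheme != "blobdata":
        raise ValueError("not a blobdata: URL")
    tag, _, _ = rest.partition("/")
    padded = tag + "=" * (-len(tag) % 4)  # restore the padding stripped above
    return base64.urlsafe_b64decode(padded).decode("utf-8")


def same_origin(blob_url: str, requester_origin: str) -> bool:
    """In-place check: compare the embedded origin with the requester's origin."""
    return origin_of(blob_url) == requester_origin
```

Calling `same_origin(make_blob_url("http://example.com"), "http://example.com")` needs no central store, which is the benefit claimed above. Whether that benefit outweighs the parsing cost is the question the rest of the thread debates.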
RE: Updates to File API
On Friday, June 11, 2010 11:18 AM, Jonas Sicking wrote:
> On Fri, Jun 11, 2010 at 11:11 AM, Jonas Sicking jo...@sicking.cc wrote:
>> On Fri, Jun 11, 2010 at 9:09 AM, Adrian Bateman adria...@microsoft.com wrote:
>>> It's not clear to me the benefit of encoding the origin into the URL. Do we expect script to parse out the origin and use it? Even in a multi-process architecture there's presumably some central store of issued URLs which will need to store origin information as well as other things?
>> The one advantage I can see is that putting the scheme into the URL allows the *implementation* to deduce the origin by simply looking at the URL-scheme. This avoids having to do a (potentially cross-process) lookup to get the origin. This could be useful for APIs which have to synchronously determine the origin of a given URL in order to throw an exception on an attempted cross-origin access. For example an XMLHttpRequest Level 1 implementation needs to synchronously determine if it should make a call to .open(...) throw or not based on the origin of the passed in URL.
>>
>> However I'm not sure if this is a problem in practice or not. It's entirely possible that the web platform is littered with situations where you need to do synchronous communication with whichever thread the networking code runs on. Firefox is still in the process of going multi-process, so I'll defer to other browsers with more experience in this area.
> Oh, and I should add that the implementation will of course still have to check once a url is loaded that the origin in the url matches the origin in whatever map is used to map urls to resources. I.e. if the implementation has handed out a url like:
> filedata:sheep.org/3699b4a0-e43e-4cec-b87b-82b6f83dd752
> and script changes that to:
> filedata:wolf.org/3699b4a0-e43e-4cec-b87b-82b6f83dd752
> then attempting to load the latter url should result in a 404 or similar.

Since the origin requires scheme as well as hostname/port it seems like we'll end up with some encoding or parsing complexity by following this approach.

Robin gave good reasons for not allowing user agents to encode data into the URL and I'm not convinced that including origin for this particular case isn't a premature optimisation. At what point will we find other data that's convenient to have encoded in the URL?

I think it makes more sense for the URL to be opaque and let user agents figure out the optimal way of implementing origin and other checks.

Cheers,
Adrian.
Re: Updates to File API
On 6/22/10 8:44 AM, Adrian Bateman wrote:
> On Friday, June 11, 2010 11:18 AM, Jonas Sicking wrote:
>> On Fri, Jun 11, 2010 at 11:11 AM, Jonas Sicking jo...@sicking.cc wrote:
>>> On Fri, Jun 11, 2010 at 9:09 AM, Adrian Bateman adria...@microsoft.com wrote:
>>>> It's not clear to me the benefit of encoding the origin into the URL. Do we expect script to parse out the origin and use it? Even in a multi-process architecture there's presumably some central store of issued URLs which will need to store origin information as well as other things?
>>> The one advantage I can see is that putting the scheme into the URL allows the *implementation* to deduce the origin by simply looking at the URL-scheme. This avoids having to do a (potentially cross-process) lookup to get the origin. This could be useful for APIs which have to synchronously determine the origin of a given URL in order to throw an exception on an attempted cross-origin access. For example an XMLHttpRequest Level 1 implementation needs to synchronously determine if it should make a call to .open(...) throw or not based on the origin of the passed in URL.
>>>
>>> However I'm not sure if this is a problem in practice or not. It's entirely possible that the web platform is littered with situations where you need to do synchronous communication with whichever thread the networking code runs on. Firefox is still in the process of going multi-process, so I'll defer to other browsers with more experience in this area.
>> Oh, and I should add that the implementation will of course still have to check once a url is loaded that the origin in the url matches the origin in whatever map is used to map urls to resources. I.e. if the implementation has handed out a url like:
>> filedata:sheep.org/3699b4a0-e43e-4cec-b87b-82b6f83dd752
>> and script changes that to:
>> filedata:wolf.org/3699b4a0-e43e-4cec-b87b-82b6f83dd752
>> then attempting to load the latter url should result in a 404 or similar.
> Since the origin requires scheme as well as hostname/port it seems like we'll end up with some encoding or parsing complexity by following this approach.

Upon reflection, I agree with Adrian. Origin requires:

1. Scheme
2. Hostname
3. Port
4. Certificates, if any

This creates untenable complexity.

> Robin gave good reasons for not allowing user agents to encode data into the URL and I'm not convinced that including origin for this particular case isn't a premature optimisation. At what point will we find other data that's convenient to have encoded in the URL?

+1.

> I think it makes more sense for the URL to be opaque and let user agents figure out the optimal way of implementing origin and other checks.

I think it may be important to define:

* Format. I agree that this could be something simple, but it should be defined. By opaque, do you mean undefined?
* Behavior with GET. For this, I propose using a subset of HTTP/1.1 responses.

-- A*
RE: Updates to File API
On Tuesday, June 22, 2010 3:37 PM, Arun Ranganathan wrote:
> On 6/22/10 8:44 AM, Adrian Bateman wrote:
>> I think it makes more sense for the URL to be opaque and let user agents figure out the optimal way of implementing origin and other checks.
> I think it may be important to define:
>
> * Format. I agree that this could be something simple, but it should be defined. By opaque, do you mean undefined?
> * Behavior with GET. For this, I propose using a subset of HTTP/1.1 responses.

I think we agree. I actually meant well-defined but opaque to JavaScript consumers. In other words script in a web page can't deduce any meaningful information from the string. If we're aiming for that property then it makes sense that the entire scheme be defined (something like filedata:----000). We can bikeshed the scheme name later but I'd prefer something more generic now url is off Blob.

I agree that there should be HTTP/1.1 response codes for GET.

Cheers,
Adrian.
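
A rough sketch of what "well-defined but opaque" plus "HTTP/1.1 response codes for GET" could look like, assuming a central store keyed by the URL. The `filedata:<uuid>` layout, the function names, and the particular codes (200, 403, 404) are illustrative assumptions; the thread only says that some subset of HTTP/1.1 responses should be used.

```python
import uuid
from typing import Dict, Optional, Tuple

# Central store: opaque URL -> (owning origin, bytes).
# Nothing about the origin can be recovered from the URL string itself.
_store: Dict[str, Tuple[str, bytes]] = {}


def issue_url(origin: str, data: bytes) -> str:
    """Hand out an opaque URL and remember who owns it."""
    url = f"filedata:{uuid.uuid4()}"
    _store[url] = (origin, data)
    return url


def get(url: str, requester_origin: str) -> Tuple[int, Optional[bytes]]:
    """Answer a GET with an HTTP/1.1-style status code (assumed subset)."""
    entry = _store.get(url)
    if entry is None:
        return 404, None  # unknown or already-released URL
    owner, data = entry
    if owner != requester_origin:
        return 403, None  # the origin check happens in the store, not in the URL
    return 200, data
```

Here the origin check costs one lookup in the store, which is exactly the lookup that origin-tagged URLs are meant to avoid in a multi-process UA.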
Re: Updates to File API
On Tue, Jun 22, 2010 at 7:58 PM, Adrian Bateman adria...@microsoft.com wrote:
> On Tuesday, June 22, 2010 3:37 PM, Arun Ranganathan wrote:
>> On 6/22/10 8:44 AM, Adrian Bateman wrote:
>>> I think it makes more sense for the URL to be opaque and let user agents figure out the optimal way of implementing origin and other checks.
>> I think it may be important to define:
>>
>> * Format. I agree that this could be something simple, but it should be defined. By opaque, do you mean undefined?
>> * Behavior with GET. For this, I propose using a subset of HTTP/1.1 responses.
> I think we agree. I actually meant well-defined but opaque to JavaScript consumers. In other words script in a web page can't deduce any meaningful information from the string. If we're aiming for that property then it makes sense that the entire scheme be defined (something like filedata:----000).

I agree with you Adrian that it makes sense to let the user agent figure out the optimal way of implementing origin and other checks. A logical step from that premise is that the choice/format of the namespace specific string should be left up to the UA, as embedding information in there may be the optimal way for some UAs of implementing said checks, and it sounds like other UAs may not want to do that.

dave
RE: Updates to File API
On Tuesday, June 22, 2010 8:40 PM, David Levin wrote:
> I agree with you Adrian that it makes sense to let the user agent figure out the optimal way of implementing origin and other checks. A logical step from that premise is that the choice/format of the namespace specific string should be left up to the UA, as embedding information in there may be the optimal way for some UAs of implementing said checks, and it sounds like other UAs may not want to do that.

Robin outlined why that would be a problem [1]. My original feeling was that this should be left up to UAs, as you say, but I've been convinced that doing so is a race to the most complex URL scheme.

Cheers,
Adrian.

[1] http://lists.w3.org/Archives/Public/public-webapps/2009OctDec/0743.html
Re: Updates to File API
On Sun, Jun 13, 2010 at 10:46 PM, Mark Seaborn mseab...@chromium.org wrote:
> On Wed, Jun 2, 2010 at 5:06 PM, Jian Li jia...@chromium.org wrote:
>> I have one question regarding the scheme for Blob.url. The latest spec says that the proposed URL scheme is filedata:. Mozilla already ships with moz-filedata:. Since the URL is now part of the Blob and it could be used to refer to both file data blob and binary data blob, should we consider making the scheme as blobdata: for better generalization?
>>
>> In addition, we're thinking it will probably be a good practice to encode the security origin in the blob URL scheme, like blobdata:http://example.com/33c6401f-8779-4ea2-9a9b-1b725d6cd50b. This will make doing the security origin check easier when a page tries to access the blob url that is created in another process, under multi-process architecture.
> Why do the filedata: URLs need to apply a same-origin check? It seems like this would unnecessarily reduce composability. In practice, the URLs returned by the File API would be unguessable anyway. Why not use unguessability of these tokens as the security mechanism? So if a web app wants to share the file with other, co-operating entities (e.g. in an iframe or another tab), it can do so by sharing the URL; otherwise, it can withhold the URL.
>
> When would the currently-proposed same-origin checks apply? Would I be right in thinking that they only apply to XMLHttpRequests from Javascript, and don't apply if the URL is linked from an img element?

URLs weren't always designed to be security sensitive. One common way they leak is through the referer (sic) header. So if the File whose .url you were loading was an HTML file, and you loaded it using an iframe, you could very easily leak the URL to untrusted parties.

At the very least the same-origin check applies such that if a cross-origin file URI is used on img, this would be considered a cross-origin load if that img was later pasted into a canvas. Similarly, if a cross-origin video is loaded, I think some events are withheld, though I'm less sure about that.

However, in Firefox we've taken a more strict approach. We disallow cross-origin img loads for filedata URIs. I.e. if site A receives a File with a url, then even if site B manages to get hold of that url, it can't use img to load it.

This definitely needs to be specified in the spec though.

/ Jonas
Re: Updates to File API
On Mon, Jun 14, 2010 at 11:35 AM, Mark Seaborn mseab...@chromium.org wrote:
> On Mon, Jun 14, 2010 at 12:40 AM, Jonas Sicking jo...@sicking.cc wrote:
>> On Sun, Jun 13, 2010 at 10:46 PM, Mark Seaborn mseab...@chromium.org wrote:
>>> Why do the filedata: URLs need to apply a same-origin check? It seems like this would unnecessarily reduce composability. In practice, the URLs returned by the File API would be unguessable anyway. Why not use unguessability of these tokens as the security mechanism? So if a web app wants to share the file with other, co-operating entities (e.g. in an iframe or another tab), it can do so by sharing the URL; otherwise, it can withhold the URL.
>>>
>>> When would the currently-proposed same-origin checks apply? Would I be right in thinking that they only apply to XMLHttpRequests from Javascript, and don't apply if the URL is linked from an img element?
>> URLs weren't always designed to be security sensitive. One common way they leak is through the referer (sic) header. So if the File whose .url you were loading was an HTML file, and you loaded it using an iframe, you could very easily leak the URL to untrusted parties.
> That's true for http: URLs, but AFAIK it's not true for https: URLs. The browser is not supposed to disclose HTTPS URLs via the Referer header, and I know of at least one app (Tahoe-LAFS) that relies on that. Since the File API is creating the new filedata: URL scheme, it can specify that it has the same property.

I'd still be extremely worried that it's much too easy to leak the URLs. Additionally, it's always risky to pass a filedata url to another page, as the lifetime of the filedata url is bound to the document that made the call to File.url. Instead it's better to pass the File object around, through for example postMessage, and let every page that needs it request the url.

>> At the very least the same-origin check applies such that if a cross-origin file uri is used on img, this would be considered a cross-origin load if that img was later pasted into a canvas. Similarly, if a cross-origin video is loaded, I think some events are withheld, though I'm less sure about that.
>>
>> However, in Firefox we've taken a more strict approach. We disallow cross-origin img loads for filedata URIs. I.e. if site A receives a File with a url, then even if site B manages to get hold of that url, it can't use img to load it.
> This is adding a new mechanism, isn't it, since img was previously considered to be normal linking and did not have a same-origin check. What would happen if a page has an a href=filedata:... link?

That's a good question. I think the result of our implementation is that the navigation is prevented, similar to how Firefox prevents navigating to a link like a href=file://some/local/fs/path. I.e. we don't change the DOM; however, if the user clicks the link nothing happens.

/ Jonas
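
To illustrate why handing a filedata url to another page is risky when its lifetime is bound to the creating document, here is a small model. The `UrlRegistry` class, its method names, and the document identifiers are all hypothetical; this is a sketch of the lifetime rule as described above, not of any browser's implementation.

```python
import uuid
from typing import Dict, Optional, Tuple


class UrlRegistry:
    """Toy model: each issued URL lives only as long as its creating document."""

    def __init__(self) -> None:
        self._by_url: Dict[str, Tuple[str, bytes]] = {}  # url -> (document id, data)

    def url_for(self, document_id: str, data: bytes) -> str:
        """Model of a document asking for File.url."""
        url = f"filedata:{uuid.uuid4()}"
        self._by_url[url] = (document_id, data)
        return url

    def unload(self, document_id: str) -> None:
        """Drop every URL that was issued to the document being unloaded."""
        stale = [u for u, (doc, _) in self._by_url.items() if doc == document_id]
        for url in stale:
            del self._by_url[url]

    def load(self, url: str) -> Optional[bytes]:
        entry = self._by_url.get(url)
        return None if entry is None else entry[1]
```

If page A passes its url string to page B and then unloads, B's copy stops resolving. If A instead passes the File object and B requests its own url, B's url survives A's unload, which is the pattern recommended above.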
Re: Updates to File API
On Fri, Jun 11, 2010 at 10:04 PM, Michael Nordman micha...@google.com wrote:
> Another advantage is that...
> blobdata://http_responsible_party.org:80/3699b4a0-e43e-4cec-b87b-82b6f83dd752
> ... makes it clear to the end user who the responsible party is when these urls are visible in the user interface. (location bar, tooltips, etc).

It doesn't; it just means yet another way for scripts to confuse the user. Every time we provide a string whose content is in the control of a domain, the set of evil uses increases as evil groups set up more interesting domains and trick users for another two or three years.

With browsers targeting smaller devices, as well as users who are less familiar with the web, or even experienced users who missed memos about IDN, these improvements just cause more problems.

Tab: I'd like to specifically call you out for your inclusion of http://www.詹姆斯.com/blog/2010/06/html5-atom-gone-wrong, a comparison in a recent email. .COM does not allow IDN and you should not have used that. I know someone was being cute, but that doesn't justify confusing users.

I don't have time to construct a similarly written domain which happens to go to my own spoof, nor am I going to invest the ~9 USD that it would cost to do so, but it is perfectly reasonable for someone else to do so. The time it would take is probably around 10 minutes, including picking a similar character, registering the domain, and posting content. It's true that this spoof would not fool all of the people all of the time, but it would probably fool most of the people most of the time.
Re: Updates to File API
On Wed, Jun 2, 2010 at 5:06 PM, Jian Li jia...@chromium.org wrote:
> I have one question regarding the scheme for Blob.url. The latest spec says that the proposed URL scheme is filedata:. Mozilla already ships with moz-filedata:. Since the URL is now part of the Blob and it could be used to refer to both file data blob and binary data blob, should we consider making the scheme as blobdata: for better generalization?
>
> In addition, we're thinking it will probably be a good practice to encode the security origin in the blob URL scheme, like blobdata:http://example.com/33c6401f-8779-4ea2-9a9b-1b725d6cd50b. This will make doing the security origin check easier when a page tries to access the blob url that is created in another process, under multi-process architecture.

Why do the filedata: URLs need to apply a same-origin check? It seems like this would unnecessarily reduce composability. In practice, the URLs returned by the File API would be unguessable anyway. Why not use unguessability of these tokens as the security mechanism? So if a web app wants to share the file with other, co-operating entities (e.g. in an iframe or another tab), it can do so by sharing the URL; otherwise, it can withhold the URL.

When would the currently-proposed same-origin checks apply? Would I be right in thinking that they only apply to XMLHttpRequests from Javascript, and don't apply if the URL is linked from an img element?

Regards,
Mark
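
The "unguessability as the security mechanism" model can be sketched as a capability scheme: whoever holds the URL may load the data, and no origin is consulted. The `filedata:<token>` layout, the 32-byte token size, and the function names below are assumptions for illustration only.

```python
import secrets
from typing import Dict, Optional

_blobs: Dict[str, bytes] = {}  # url -> data; no origin is recorded at all


def create_url(data: bytes) -> str:
    """Mint a URL whose only protection is that it cannot be guessed."""
    token = secrets.token_urlsafe(32)  # 32 random bytes from the OS CSPRNG
    url = f"filedata:{token}"
    _blobs[url] = data
    return url


def load(url: str) -> Optional[bytes]:
    """Possession of the exact URL is the whole access check."""
    return _blobs.get(url)
```

Sharing is then just handing the string to a co-operating frame or tab. The weakness is that anything that leaks the string also leaks access to the data.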
RE: Updates to File API
On Wednesday, June 02, 2010 5:27 PM, Arun Ranganathan wrote:
> On 6/2/10 5:06 PM, Jian Li wrote:
>> Indeed, the URL scheme seems to be more of an implementation detail. Different browser vendors can choose the appropriate scheme, as Mozilla does with moz-filedata. What do you think?
> Actually, I'm against leaving it totally up to implementations. Sure, the spec. could simply state how the URL behaves without mentioning format much, but we identified in the past [1] that it was wise to specify things reliably, so that developers didn't rely on arbitrary behavior in one implementation and expect something similar in another. It's precisely that genre of underspecified behavior that got us in trouble before ;-)
>
> -- A*
>
> [1] http://lists.w3.org/Archives/Public/public-webapps/2009OctDec/0743.html

Do you think the URL scheme should be specified for each use of Blob or more broadly? For example, Blob is used in the File Reader API but also possibly in the Capture API in a different way. It might be useful to be able to use a different scheme for these different purposes to help the user agent route requests to the appropriate handler.

Adrian.
RE: Updates to File API
On Wednesday, June 02, 2010 5:35 PM, Jonas Sicking wrote:
> On Wed, Jun 2, 2010 at 5:26 PM, Arun Ranganathan a...@mozilla.com wrote:
>> On 6/2/10 5:06 PM, Jian Li wrote:
>>> In addition, we're thinking it will probably be a good practice to encode the security origin in the blob URL scheme, like blobdata:http://example.com/33c6401f-8779-4ea2-9a9b-1b725d6cd50b. This will make doing the security origin check easier when a page tries to access the blob url that is created in another process, under multi-process architecture.
>> This is a good suggestion. I particularly like the idea of encoding the origin as part of the scheme.
> Though we want to avoid introducing the concept of nested schemes to the web. While mozilla already uses nested schemes (jar:http://... and view-source:http://...) I know others, in particular Apple, have expressed a dislike for this in the past. And with good reason, it's not easy to implement and has been a source of numerous security bugs. That said, it's certainly possible.

It's not clear to me the benefit of encoding the origin into the URL. Do we expect script to parse out the origin and use it? Even in a multi-process architecture there's presumably some central store of issued URLs which will need to store origin information as well as other things?

Cheers,
Adrian
Re: Updates to File API
One benefit of using the encoded origin is to do the security origin check in place, instead of resorting to a centralized authority, esp. under multi-process architecture. Consider getting and checking the origin before hitting the cache for the blob.url item.

On Fri, Jun 11, 2010 at 9:09 AM, Adrian Bateman adria...@microsoft.com wrote:
> On Wednesday, June 02, 2010 5:35 PM, Jonas Sicking wrote:
>> On Wed, Jun 2, 2010 at 5:26 PM, Arun Ranganathan a...@mozilla.com wrote:
>>> On 6/2/10 5:06 PM, Jian Li wrote:
>>>> In addition, we're thinking it will probably be a good practice to encode the security origin in the blob URL scheme, like blobdata:http://example.com/33c6401f-8779-4ea2-9a9b-1b725d6cd50b. This will make doing the security origin check easier when a page tries to access the blob url that is created in another process, under multi-process architecture.
>>> This is a good suggestion. I particularly like the idea of encoding the origin as part of the scheme.
>> Though we want to avoid introducing the concept of nested schemes to the web. While mozilla already uses nested schemes (jar:http://... and view-source:http://...) I know others, in particular Apple, have expressed a dislike for this in the past. And with good reason, it's not easy to implement and has been a source of numerous security bugs. That said, it's certainly possible.
> It's not clear to me the benefit of encoding the origin into the URL. Do we expect script to parse out the origin and use it? Even in a multi-process architecture there's presumably some central store of issued URLs which will need to store origin information as well as other things?
>
> Cheers,
> Adrian
Re: Updates to File API
On Fri, Jun 11, 2010 at 9:09 AM, Adrian Bateman adria...@microsoft.com wrote:
> On Wednesday, June 02, 2010 5:35 PM, Jonas Sicking wrote:
>> On Wed, Jun 2, 2010 at 5:26 PM, Arun Ranganathan a...@mozilla.com wrote:
>>> On 6/2/10 5:06 PM, Jian Li wrote:
>>>> In addition, we're thinking it will probably be a good practice to encode the security origin in the blob URL scheme, like blobdata:http://example.com/33c6401f-8779-4ea2-9a9b-1b725d6cd50b. This will make doing the security origin check easier when a page tries to access the blob url that is created in another process, under multi-process architecture.
>>> This is a good suggestion. I particularly like the idea of encoding the origin as part of the scheme.
>> Though we want to avoid introducing the concept of nested schemes to the web. While mozilla already uses nested schemes (jar:http://... and view-source:http://...) I know others, in particular Apple, have expressed a dislike for this in the past. And with good reason, it's not easy to implement and has been a source of numerous security bugs. That said, it's certainly possible.
> It's not clear to me the benefit of encoding the origin into the URL. Do we expect script to parse out the origin and use it? Even in a multi-process architecture there's presumably some central store of issued URLs which will need to store origin information as well as other things?

The one advantage I can see is that putting the scheme into the URL allows the *implementation* to deduce the origin by simply looking at the URL-scheme. This avoids having to do a (potentially cross-process) lookup to get the origin. This could be useful for APIs which have to synchronously determine the origin of a given URL in order to throw an exception on an attempted cross-origin access. For example an XMLHttpRequest Level 1 implementation needs to synchronously determine if it should make a call to .open(...) throw or not based on the origin of the passed in URL.

However I'm not sure if this is a problem in practice or not. It's entirely possible that the web platform is littered with situations where you need to do synchronous communication with whichever thread the networking code runs on. Firefox is still in the process of going multi-process, so I'll defer to other browsers with more experience in this area.

/ Jonas
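
A sketch of the synchronous check described above, assuming the nested form `blobdata:http://example.com/<uuid>` proposed earlier in the thread. `open_request`, `embedded_origin`, and `SecurityError` are hypothetical names standing in for whatever an implementation would do inside `.open(...)`; only `urllib.parse.urlsplit` is a real library call.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit

Origin = Tuple[str, str, Optional[int]]  # (scheme, host, port)
DEFAULT_PORTS = {"http": 80, "https": 443}


class SecurityError(Exception):
    """Raised synchronously on an attempted cross-origin access."""


def embedded_origin(blob_url: str) -> Origin:
    """Read the origin straight out of the URL text, with no lookup elsewhere."""
    prefix = "blobdata:"
    if not blob_url.startswith(prefix):
        raise ValueError("not a blobdata: URL")
    inner = urlsplit(blob_url[len(prefix):])  # parse the nested URL
    if not inner.scheme or not inner.hostname:
        raise ValueError("no origin embedded in URL")
    port = inner.port if inner.port is not None else DEFAULT_PORTS.get(inner.scheme)
    return (inner.scheme, inner.hostname, port)


def open_request(caller_origin: Origin, blob_url: str) -> None:
    """Decide whether to throw before any other process is contacted."""
    if embedded_origin(blob_url) != caller_origin:
        raise SecurityError("cross-origin access to blob URL")
```

Even this small parser has to deal with default ports and nested schemes, which gives a flavour of the "encoding or parsing complexity" concern raised elsewhere in the thread.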
Re: Updates to File API
On Fri, Jun 11, 2010 at 11:11 AM, Jonas Sicking jo...@sicking.cc wrote:
> On Fri, Jun 11, 2010 at 9:09 AM, Adrian Bateman adria...@microsoft.com wrote:
>> On Wednesday, June 02, 2010 5:35 PM, Jonas Sicking wrote:
>>> On Wed, Jun 2, 2010 at 5:26 PM, Arun Ranganathan a...@mozilla.com wrote:
>>>> On 6/2/10 5:06 PM, Jian Li wrote:
>>>>> In addition, we're thinking it will probably be a good practice to encode the security origin in the blob URL scheme, like blobdata:http://example.com/33c6401f-8779-4ea2-9a9b-1b725d6cd50b. This will make doing the security origin check easier when a page tries to access the blob url that is created in another process, under multi-process architecture.
>>>> This is a good suggestion. I particularly like the idea of encoding the origin as part of the scheme.
>>> Though we want to avoid introducing the concept of nested schemes to the web. While mozilla already uses nested schemes (jar:http://... and view-source:http://...) I know others, in particular Apple, have expressed a dislike for this in the past. And with good reason, it's not easy to implement and has been a source of numerous security bugs. That said, it's certainly possible.
>> It's not clear to me the benefit of encoding the origin into the URL. Do we expect script to parse out the origin and use it? Even in a multi-process architecture there's presumably some central store of issued URLs which will need to store origin information as well as other things?
> The one advantage I can see is that putting the scheme into the URL allows the *implementation* to deduce the origin by simply looking at the URL-scheme. This avoids having to do a (potentially cross-process) lookup to get the origin. This could be useful for APIs which have to synchronously determine the origin of a given URL in order to throw an exception on an attempted cross-origin access. For example an XMLHttpRequest Level 1 implementation needs to synchronously determine if it should make a call to .open(...) throw or not based on the origin of the passed in URL.
>
> However I'm not sure if this is a problem in practice or not. It's entirely possible that the web platform is littered with situations where you need to do synchronous communication with whichever thread the networking code runs on. Firefox is still in the process of going multi-process, so I'll defer to other browsers with more experience in this area.

Oh, and I should add that the implementation will of course still have to check once a url is loaded that the origin in the url matches the origin in whatever map is used to map urls to resources. I.e. if the implementation has handed out a url like:

filedata:sheep.org/3699b4a0-e43e-4cec-b87b-82b6f83dd752

and script changes that to:

filedata:wolf.org/3699b4a0-e43e-4cec-b87b-82b6f83dd752

then attempting to load the latter url should result in a 404 or similar.

/ Jonas
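
The load-time check in the sheep.org / wolf.org example can be sketched as follows. The map layout (token to host and data) and the function names are assumptions; the only behaviour taken from the message is that a URL whose embedded origin does not match the map yields a 404.

```python
import uuid
from typing import Dict, Optional, Tuple

_resources: Dict[str, Tuple[str, bytes]] = {}  # token -> (issuing host, data)


def hand_out(host: str, data: bytes) -> str:
    """Issue filedata:<host>/<token> and record which host it was issued to."""
    token = str(uuid.uuid4())
    _resources[token] = (host, data)
    return f"filedata:{host}/{token}"


def load(url: str) -> Tuple[int, Optional[bytes]]:
    """Re-check the origin in the URL against the map on every load."""
    prefix = "filedata:"
    if not url.startswith(prefix):
        return 404, None
    host, _, token = url[len(prefix):].partition("/")
    entry = _resources.get(token)
    if entry is None or entry[0] != host:
        return 404, None  # unknown token, or the origin part was tampered with
    return 200, entry[1]
```

So the origin in the URL can serve as a fast hint for early checks, but the map remains the authority when the resource is actually fetched.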
Re: Updates to File API
Another advantage is that... blobdata:// http_responsible_party.org:80/3699b4a0-e43e-4cec-b87b-82b6f83dd752 ... makes it clear to the end user who the responsible party is when these urls are visible in the user interface. (location bar, tooltips, etc). On Fri, Jun 11, 2010 at 11:11 AM, Jonas Sicking jo...@sicking.cc wrote: On Fri, Jun 11, 2010 at 9:09 AM, Adrian Bateman adria...@microsoft.com wrote: On Wednesday, June 02, 2010 5:35 PM, Jonas Sicking wrote: On Wed, Jun 2, 2010 at 5:26 PM, Arun Ranganathan a...@mozilla.com wrote: On 6/2/10 5:06 PM, Jian Li wrote: In addition, we're thinking it will probably be a good practice to encode the security origin in the blob URL scheme, like blobdata: http://example.com/33c6401f-8779-4ea2-9a9b-1b725d6cd50b. This will make doing the security origin check easier when a page tries to access the blob url that is created in another process, under multi-process architecture. This is a good suggestion. I particularly like the idea of encoding the origin as part of the scheme. Though we want to avoid introducing the concept of nested schemes to the web. While mozilla already uses nested schemes (jar:http://... and view-source:http://...) I know others, in particular Apple, have expressed a dislike for this in the past. And with good reason, it's not easy to implement and has been a source of numerous security bugs. That said, it's certainly possible. It's not clear to me the benefit of encoding the origin into the URL. Do we expect script to parse out the origin and use it? Even in a multi-process architecture there's presumably some central store of issued URLs which will need to store origin information as well as other things? The one advantage I can see is that putting the scheme into the URL allows the *implementation* to deduce the origin by simply looking at the URL-scheme. This avoids having to do a (potentially cross-process) lookup to get the origin. 
This could be useful for APIs which have to synchronously determine the origin of a given URL in order to throw an exception on an attempted cross-origin access. For example an XMLHttpRequest Level 1 implementation needs to synchronously determine if it should make a call to .open(...) throw or not based on the origin of the passed in URL. However I'm not sure if this is a problem in practice or not. It's entirely possible that the web platform is littered with situations where you need to do synchronous communication with whichever thread the networking code runs on. Firefox is still in the process of going multi-process, so I'll defer to other browsers with more experience in this area. / Jonas
Re: Updates to File API
Arun: In the latest version of the spec I see that readAsDataURL, alone among the readAs* methods, still takes a File rather than a Blob. Is that just an oversight, or is that an intentional restriction? Eric On Thu, May 13, 2010 at 5:27 AM, Arun Ranganathan a...@mozilla.com wrote: Greetings WebApps WG, I have updated the editor's draft of the File API to reflect changes that have been in discussion. http://dev.w3.org/2006/webapi/FileAPI Notably: 1. Blobs now allow further binary data operations by exposing an ArrayBuffer property that represents the Blob. ArrayBuffers, and affiliated Typed Array views of data, are specified in a working draft as a part of the WebGL work [1]. This work has been proposed to ECMA's TC-39 WG as well. We intend to implement some of this in the Firefox 4 timeframe, and have reason to believe other browsers will as well. I have thus cited the work as a normative reference [1]. Eventually, we ought to consider further read operations given ArrayBuffers, but for now, I believe exposing Blobs in this way is sufficient. 2. url and type properties have been moved to the underlying Blob interface. Notably, the property is now called 'url' and not 'urn.' Use cases for triggering 'save as' behavior with Content-Disposition have not been addressed[2], although I believe that with FileWriter and BlobBuilder[3] they may be addressed differently. This change reflects lengthy discussion (e.g. start here[4]) 3. The renaming of the property to 'url' also suggests that we should cease to consider an urn:uuid scheme. I solicited implementer feedback about URLs vs. URNs in general. There was a general preference to URLs[5], though this wasn't a strong preference. Moreover, Mozilla's implementation currently uses moz-filedata: . The current draft has an editor's note about the use of HTTP semantics, and origin issues in the context of shared workers.
This is work in progress; I have removed the section specifying urn:uuid and hope to have an update with a section covering the filedata: scheme (with filedata:uuid as a suggestion). I welcome discussion about this. I'll point out that we are coining a new scheme, which we originally sought to avoid :-) 4. I have changed event order; loadend now fires after an error event [6]. -- A* [1] https://cvs.khronos.org/svn/repos/registry/trunk/public/webgl/doc/spec/TypedArray-spec.html [2] http://www.mail-archive.com/public-webapps@w3.org/msg06137.html [3] http://dev.w3.org/2009/dap/file-system/file-writer.html [4] http://lists.w3.org/Archives/Public/public-webapps/2010JanMar/0910.html [5] http://lists.w3.org/Archives/Public/public-webapps/2009OctDec/0462.html [6] http://lists.w3.org/Archives/Public/public-webapps/2010AprJun/0062.html
Re: Updates to File API
On 6/2/10 3:42 PM, Eric Uhrhane wrote: Arun: In the latest version of the spec I see that readAsDataURL, alone among the readAs* methods, still takes a File rather than a Blob. Is that just an oversight, or is that an intentional restriction? That's intentional; readAsDataURL was cited as useful only in the context of File objects. Do you think it makes sense in the context of random Blob objects? Does it make sense on slice calls on a Blob, for example? -- A*
Re: Updates to File API
On Wed, Jun 2, 2010 at 3:44 PM, Arun Ranganathan a...@mozilla.com wrote: On 6/2/10 3:42 PM, Eric Uhrhane wrote: Arun: In the latest version of the spec I see that readAsDataURL, alone among the readAs* methods, still takes a File rather than a Blob. Is that just an oversight, or is that an intentional restriction? That's intentional; readAsDataURL was cited as useful only in the context of File objects. Do you think it makes sense in the context of random Blob objects? Does it make sense on slice calls on a Blob, for example? Sure, why not? Why would this be limited to File objects? A File is supposed to refer to an actual file on the local hard drive. A Blob is a big bunch of data that you might want to do something with. There's nothing special about a File when it comes to what you're doing with the data. Just as we moved File.url up to Blob, I think File.readAsDataURL belongs there too.
Re: Updates to File API
On 6/2/10 3:48 PM, Eric Uhrhane wrote: Sure, why not? Why would this be limited to File objects? A File is supposed to refer to an actual file on the local hard drive. A Blob is a big bunch of data that you might want to do something with. There's nothing special about a File when it comes to what you're doing with the data. Just as we moved File.url up to Blob, I think File.readAsDataURL belongs there too. Fair enough; I'm amenable to moving it. So specifically, you're okay with a DataURL on a Blob? It might not be anything useful; with a File, you at least have the possibility of a whole unsliced image file. Could you give me a use case where this is really useful for Blob objects? Also, above you probably mean specifying that readAsDataURL (a method on FileReader) works on Blob objects, not File.readAsDataURL ;-) -- A*
Re: Updates to File API
On Wed, Jun 2, 2010 at 3:48 PM, Eric Uhrhane er...@google.com wrote: On Wed, Jun 2, 2010 at 3:44 PM, Arun Ranganathan a...@mozilla.com wrote: On 6/2/10 3:42 PM, Eric Uhrhane wrote: Arun: In the latest version of the spec I see that readAsDataURL, alone among the readAs* methods, still takes a File rather than a Blob. Is that just an oversight, or is that an intentional restriction? That's intentional; readAsDataURL was cited as useful only in the context of File objects. Do you think it makes sense in the context of random Blob objects? Does it make sense on slice calls on a Blob, for example? Sure, why not? Why would this be limited to File objects? A File is supposed to refer to an actual file on the local hard drive. A Blob is a big bunch of data that you might want to do something with. There's nothing special about a File when it comes to what you're doing with the data. Just as we moved File.url up to Blob, I think File.readAsDataURL belongs there too. And we move type from File to Blob.
Re: Updates to File API
On Wed, Jun 2, 2010 at 3:57 PM, Arun Ranganathan a...@mozilla.com wrote: On 6/2/10 3:48 PM, Eric Uhrhane wrote: Sure, why not? Why would this be limited to File objects? A File is supposed to refer to an actual file on the local hard drive. A Blob is a big bunch of data that you might want to do something with. There's nothing special about a File when it comes to what you're doing with the data. Just as we moved File.url up to Blob, I think File.readAsDataURL belongs there too. Fair enough; I'm amenable to moving it. So specifically, you're okay with a DataURL on a Blob? It might not be anything useful; with a File, you at least have the possibility of a whole unsliced image file. Could you give me a use case where this is really useful for Blob objects? One that's come up for Blob.url is a packed file of image thumbnails: you can do one big download, then slice and display the pieces. If you're doing any display by data URLs, that would work there too. To be honest, I think a lot of the data URL use cases are better served by Blob.url anyway, so I'm not sure how many will remain once this spec is fully implemented, but can you think of a data URL use case that really depends on the data coming from a File on disk instead of a Blob? Also, above you probably mean specifying that readAsDataURL (a method on FileReader) works on Blob objects, not File.readAsDataURL ;-) Yeah. Brain-o.
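The packed-thumbnails case mentioned above (one big download, then slice and display the pieces) might look roughly like this. It is a sketch under assumptions: the pack layout with an index of byte ranges is invented for illustration, `sliceThumbnails` is a hypothetical helper, and a `Uint8Array` stands in for the Blob so the snippet runs outside a browser. With a real Blob the same ranges would be passed to its slice method.

```javascript
// Sketch only: the index format ({ name, start, end } byte ranges) is an
// assumed pack layout, and a Uint8Array stands in for the downloaded Blob.
function sliceThumbnails(packed, index) {
  const pieces = new Map();
  for (const { name, start, end } of index) {
    // subarray() returns a view over the same bytes; nothing is copied.
    pieces.set(name, packed.subarray(start, end));
  }
  return pieces;
}
```

Each piece could then be displayed either through a URL handed out for it or, as discussed in this thread, by reading it as a data URL.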
Re: Updates to File API
Hi, Arun, I have one question regarding the scheme for Blob.url. The latest spec says that The proposed URL scheme is filedata:. Mozilla already ships with moz-filedata:. Since the URL is now part of the Blob and it could be used to refer to both file data blob and binary data blob, should we consider making the scheme as blobdata: for better generalization? In addition, we're thinking it will probably be a good practice to encode the security origin in the blob URL scheme, like blobdata: http://example.com/33c6401f-8779-4ea2-9a9b-1b725d6cd50b. This will make doing the security origin check easier when a page tries to access the blob url that is created in another process, under multi-process architecture. Indeed, the URL scheme seems to be more sort of implementation details. Different browser vendors can choose the appropriate scheme, like Mozilla ships with moz-filedata. How do you think? Jian On Thu, May 13, 2010 at 5:27 AM, Arun Ranganathan a...@mozilla.com wrote: Greetings WebApps WG, I have updated the editor's draft of the File API to reflect changes that have been in discussion. http://dev.w3.org/2006/webapi/FileAPI Notably: 1. Blobs now allow further binary data operations by exposing an ArrayBuffer property that represents the Blob. ArrayBuffers, and affiliated Typed Array views of data, are specified in a working draft as a part of the WebGL work [1]. This work has been proposed to ECMA's TC-39 WG as well. We intend to implement some of this in the Firefox 4 timeframe, and have reason to believe other browsers will as well. I have thus cited the work as a normative reference [1]. Eventually, we ought to consider further read operations given ArrayBuffers, but for now, I believe exposing Blobs in this way is sufficient. 2. url and type properties have been moved to the underlying Blob interface. Notably, the property is now called 'url' and not 'urn.'
Use cases for triggering 'save as' behavior with Content-Disposition have not been addressed[2], although I believe that with FileWriter and BlobBuilder[3] they may be addressed differently. This change reflects lengthy discussion (e.g. start here[4]) 3. The renaming of the property to 'url' also suggests that we should cease to consider an urn:uuid scheme. I solicited implementer feedback about URLs vs. URNs in general. There was a general preference to URLs[5], though this wasn't a strong preference. Moreover, Mozilla's implementation currently uses moz-filedata: . The current draft has an editor's note about the use of HTTP semantics, and origin issues in the context of shared workers. This is work in progress; I have removed the section specifying urn:uuid and hope to have an update with a section covering the filedata: scheme (with filedata:uuid as a suggestion). I welcome discussion about this. I'll point out that we are coining a new scheme, which we originally sought to avoid :-) 4. I have changed event order; loadend now fires after an error event [6]. -- A* [1] https://cvs.khronos.org/svn/repos/registry/trunk/public/webgl/doc/spec/TypedArray-spec.html [2] http://www.mail-archive.com/public-webapps@w3.org/msg06137.html [3] http://dev.w3.org/2009/dap/file-system/file-writer.html [4] http://lists.w3.org/Archives/Public/public-webapps/2010JanMar/0910.html [5] http://lists.w3.org/Archives/Public/public-webapps/2009OctDec/0462.html [6] http://lists.w3.org/Archives/Public/public-webapps/2010AprJun/0062.html
Re: Updates to File API
On Wed, Jun 2, 2010 at 3:42 PM, Eric Uhrhane er...@google.com wrote: Arun: In the latest version of the spec I see that readAsDataURL, alone among the readAs* methods, still takes a File rather than a Blob. Is that just an oversight, or is that an intentional restriction? Having readAsDataURL take a File made sense when .url and .type lived on File rather than Blob. Now that Blobs have .types I agree that readAsDataURL should be able to read from a Blob. But, as you say, I think .url solves most of the use cases. One usecase I can still think of is a web based HTML editor which allows inserting images. These images could be included inline in the main document using data-urls which allows the document to be saved/sent as a single document, rather than HTML + a pile of images. / Jonas
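For the HTML editor use case above, the point of a data URL is that the image bytes travel inside the document itself. A minimal sketch of building one from raw bytes and a media type follows; `toDataUrl` and `inlineImageTag` are hypothetical helpers, not part of the File API, and in a browser readAsDataURL would produce the string asynchronously instead.

```javascript
// Sketch only: both helpers are hypothetical. btoa() is assumed to be
// available as a global (browsers, and recent Node.js versions).
function toDataUrl(bytes, type) {
  // btoa() expects a "binary string" with one character per byte.
  let binary = "";
  for (const b of bytes) binary += String.fromCharCode(b);
  return `data:${type};base64,${btoa(binary)}`;
}

// Embed the image inline so the saved document is a single file.
function inlineImageTag(bytes, type) {
  return `<img src="${toDataUrl(bytes, type)}">`;
}
```

Because the result is an ordinary string, the editor can serialize the whole document, images included, without keeping the original File or Blob alive.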
Re: Updates to File API
On 6/2/10 5:06 PM, Jian Li wrote: Hi, Arun, I have one question regarding the scheme for Blob.url. The latest spec says that The proposed URL scheme is filedata:. Mozilla already ships with moz-filedata:. Since the URL is now part of the Blob and it could be used to refer to both file data blob and binary data blob, should we consider making the scheme as blobdata: for better generalization? In addition, we're thinking it will probably be a good practice to encode the security origin in the blob URL scheme, like blobdata: http://example.com/33c6401f-8779-4ea2-9a9b-1b725d6cd50b. This will make doing the security origin check easier when a page tries to access the blob url that is created in another process, under multi-process architecture. This is a good suggestion. I particularly like the idea of encoding the origin as part of the scheme. Indeed, the URL scheme seems to be more sort of implementation details. Different browser vendors can choose the appropriate scheme, like Mozilla ships with moz-filedata. How do you think? Actually, I'm against leaving it totally up to implementations. Sure, the spec. could simply state how the URL behaves without mentioning format much, but we identified in the past [1] that it was wise to specify things reliably, so that developers didn't rely on arbitrary behavior in one implementation and expect something similar in another. It's precisely that genre of underspecified behavior that got us in trouble before ;-) -- A* [1] http://lists.w3.org/Archives/Public/public-webapps/2009OctDec/0743.html
Re: Updates to File API
I got what you mean. Thanks for clarifying it. Do you plan to add the origin encoding into the spec? How about using more generic scheme name blobdata:? Jian On Wed, Jun 2, 2010 at 5:26 PM, Arun Ranganathan a...@mozilla.com wrote: On 6/2/10 5:06 PM, Jian Li wrote: Hi, Arun, I have one question regarding the scheme for Blob.url. The latest spec says that The proposed URL scheme is filedata:. Mozilla already ships with moz-filedata:. Since the URL is now part of the Blob and it could be used to refer to both file data blob and binary data blob, should we consider making the scheme as blobdata: for better generalization? In addition, we're thinking it will probably be a good practice to encode the security origin in the blob URL scheme, like blobdata: http://example.com/33c6401f-8779-4ea2-9a9b-1b725d6cd50b. This will make doing the security origin check easier when a page tries to access the blob url that is created in another process, under multi-process architecture. This is a good suggestion. I particularly like the idea of encoding the origin as part of the scheme. Indeed, the URL scheme seems to be more sort of implementation details. Different browser vendors can choose the appropriate scheme, like Mozilla ships with moz-filedata. How do you think? Actually, I'm against leaving it totally up to implementations. Sure, the spec. could simply state how the URL behaves without mentioning format much, but we identified in the past [1] that it was wise to specify things reliably, so that developers didn't rely on arbitrary behavior in one implementation and expect something similar in another. It's precisely that genre of underspecified behavior that got us in trouble before ;-) -- A* [1] http://lists.w3.org/Archives/Public/public-webapps/2009OctDec/0743.html
Re: Updates to File API
On Wed, Jun 2, 2010 at 5:26 PM, Arun Ranganathan a...@mozilla.com wrote: On 6/2/10 5:06 PM, Jian Li wrote: Hi, Arun, I have one question regarding the scheme for Blob.url. The latest spec says that The proposed URL scheme is filedata:. Mozilla already ships with moz-filedata:. Since the URL is now part of the Blob and it could be used to refer to both file data blob and binary data blob, should we consider making the scheme as blobdata: for better generalization? In addition, we're thinking it will probably be a good practice to encode the security origin in the blob URL scheme, like blobdata: http://example.com/33c6401f-8779-4ea2-9a9b-1b725d6cd50b. This will make doing the security origin check easier when a page tries to access the blob url that is created in another process, under multi-process architecture. This is a good suggestion. I particularly like the idea of encoding the origin as part of the scheme. Though we want to avoid introducing the concept of nested schemes to the web. While mozilla already uses nested schemes (jar:http://... and view-source:http://...) I know others, in particular Apple, have expressed a dislike for this in the past. And with good reason, it's not easy to implement and has been a source of numerous security bugs. That said, it's certainly possible. / Jonas
Re: Updates to File API
On May 21, 2010, at 00:41 , Jonas Sicking wrote: On Thu, May 20, 2010 at 2:53 PM, Nathan nat...@webr3.org wrote: If the scope of the identifiers is limited to a single ua, on a single machine, and specific to that single ua (as in I can't expect to request the identifier outside of the ua that provided it on x machine and get the same results) then I (personally) can't see why there's a need for anything more than a simple unique identifier (sha1 or suchlike) Note that the important point of these URNs isn't that they are identifiers, but rather that you can point a iframe.src, or a img.src, or a #myElement { background-url: url(...) } at them. Right, and to further Jonas's explanation, imagine .url (or .id, or whatever) returned a simple identifier, say some opaque hex string of sorts like DEADBEEF. Now you want to get that image file and assign it as the source of an img (which is the whole point): img.src = file.url; If your document is at http://deadbff.org/foo/ you've essentially made your image element link to http://deadbff.org/foo/DEADBEEF. That's not what you wanted. Using a syntax (be it URI scheme or URN) that can naturally disambiguate between relative URI references and these magic references is, alas, needed. -- Robin Berjon - http://berjon.com/
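The relative-reference problem described above can be reproduced with the standard `URL` constructor. Note the assumption: this uses today's URL parser, which postdates the thread, but for these inputs the resolution outcome is the one the message describes. The `filedata:` and `urn:uuid:` strings are taken from examples elsewhere in this thread.

```javascript
// Sketch only: resolves the strings the way an img.src assignment would,
// using the document URL from the message above as the base.
const base = "http://deadbff.org/foo/";

// A bare opaque identifier is treated as a relative reference.
const bare = new URL("DEADBEEF", base).href;

// A string that starts with a scheme is absolute, so the base is ignored.
const schemed = new URL("filedata:DEADBEEF", base).href;
const urn = new URL("urn:uuid:33c6401f-8779-4ea2-9a9b-1b725d6cd50b", base).href;

console.log(bare);    // http://deadbff.org/foo/DEADBEEF
console.log(schemed); // filedata:DEADBEEF
console.log(urn);     // urn:uuid:33c6401f-8779-4ea2-9a9b-1b725d6cd50b
```

Either a URI scheme or a URN namespace disambiguates equally well here; what matters is the leading `scheme:` prefix.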
Re: Updates to File API
Jonas Sicking wrote: On Wed, May 19, 2010 at 1:09 PM, Arun Ranganathan a...@mozilla.com wrote: 3. The renaming of the property to 'url' also suggests that we should cease to consider an urn:uuid scheme. I'm not sure that one follows from the other. The property's called 'url' because that's what will be familiar to authors, but the magic string that goes inside of it could still be a URN. FWIW, I'm a developer and sticking a URN in a .url property really doesn't seem familiar at all - even a '.id' property with an id that was consistently generated would be much better. If the scope of the identifiers is limited to a single ua, on a single machine, and specific to that single ua (as in I can't expect to request the identifier outside of the ua that provided it on x machine and get the same results) then I (personally) can't see why there's a need for anything more than a simple unique identifier (sha1 or suchlike) And if the above is true, then surely this would negate the need for .url, registering a new URI scheme, or URN namespace - and all in save you all from lots of headaches time wasted, close the issue, and save the developer community from years of further confusion (or should i say conflated understanding of what a URL is), and benefit the entire web by saving us from yet another (predominantly unneeded) URN namespace or URL scheme. Best leave this in your capable hands. Nathan
Re: Updates to File API
On Thu, May 20, 2010 at 2:53 PM, Nathan nat...@webr3.org wrote: Jonas Sicking wrote: On Wed, May 19, 2010 at 1:09 PM, Arun Ranganathan a...@mozilla.com wrote: 3. The renaming of the property to 'url' also suggests that we should cease to consider an urn:uuid scheme. I'm not sure that one follows from the other. The property's called 'url' because that's what will be familiar to authors, but the magic string that goes inside of it could still be a URN. FWIW, I'm a developer and sticking a URN in a .url property really doesn't seem familiar at all - even a '.id' property with an id that was consistently generated would be much better. If the scope of the identifiers is limited to a single ua, on a single machine, and specific to that single ua (as in I can't expect to request the identifier outside of the ua that provided it on x machine and get the same results) then I (personally) can't see why there's a need for anything more than a simple unique identifier (sha1 or suchlike) And if the above is true, then surely this would negate the need for .url, registering a new URI scheme, or URN namespace - and all in save you all from lots of headaches time wasted, close the issue, and save the developer community from years of further confusion (or should i say conflated understanding of what a URL is), and benefit the entire web by saving us from yet another (predominantly unneeded) URN namespace or URL scheme. Note that the important point of these URNs isn't that they are identifiers, but rather that you can point a iframe.src, or a img.src, or a #myElement { background-url: url(...) } at them. In all useful use cases brought up so far, the website author will never look at the actual string to see what it contains, but rather just treat it as a url and load data from it. 
The intended use for it is things like: <img id="preview"> <input type="file" onchange="document.getElementById('preview').src = this.files[0].url"> In this context, calling the string an identifier misses the point IMHO. (btw, the above example should work fine in nightly firefox builds) / Jonas
Re: Updates to File API
On Tue, May 18, 2010 at 2:56 PM, Arun Ranganathan a...@mozilla.com wrote: On 5/18/10 2:35 PM, Eric Uhrhane wrote: On Mon, May 17, 2010 at 3:37 PM, Dmitry Titov dim...@chromium.org wrote: I have couple of questions, mostly clarifications I think: 1. FileReader takes Blob but there are multiple hints that the blob should be actually a 'file'. As we see Blob concept grows in popularity with such specs as FileWriter that defines BlobBuilder. Other proposals include Image Resizing that returns a Blob with a compressed image data. Can all types of Blobs be 'read' using FileReader? If not, then it would be logical to only accept File parameter. If any type of Blob can be read (as I think the spirit of the spec assumes) then would it be less confusing to change the name to BlobReader? I'd support that. I think we always want to allow reading of any type of Blob--it's the interchange container--so calling it BlobReader makes sense. Arun, how do you feel about that? The FileReader object accepts File objects for DataURL-reads, and Blob objects for binary string, text, and binary reads. I agree that having a name like FileReader is generally a bit confusing, given that we do allow Blobs to be read, including Blobs which aren't directly coined from files. Blob itself isn't a great name, though it's a stand-in for Binary Large Object. Aside from the slight bikeshed-ish nature of this discussion, there are implementations in circulation that already use the name FileReader (e.g. Firefox 3.6.3). This doesn't mean I'm against changing it, but I do wish the name change suggestion came earlier. Also, I'm keen that the main object name addresses the initial use case -- reading file objects. Perhaps in the future Blobs that are not files will be the norm; maybe then, Blob APIs will evolve, including implementations with ArrayBuffer and potential streaming use cases getting addressed better.
Perhaps it is late to have a name change, and we've added to less-than-adequate naming on the Web (example: XMLHttpRequest). It doesn't seem too late to change the name. FF could support both FileReader and BlobReader. One could just be an alias for the other. It seems like we have situations like this frequently when it comes to new web platform APIs. A name only becomes immutable once there is a lot of content using it since user agents would be compelled to support the existing name for compat with existing content ;-) -Darin Would FileWriter ever be used to write anything other than a File? I think not, so it should probably stay as it is, despite the lack of symmetry. 2. The FileReader.result is a string. Actually, in my next draft, I will have FileReader.result be of type 'any' (WebIDL's 'any') since it could also be an ArrayBuffer (using the readAsBinary method, which will function like the other asynchronous read methods, but read into ArrayBuffers across the ProgressEvent spectrum). -- A*
Re: Updates to File API
On 19/05/10 08:00, Darin Fisher wrote: It doesn't seem too late to change the name. FF could support both FileReader and BlobReader. One could just be an alias for the other. It seems like we have situations like this frequently when it comes to new web platform APIs. A name only becomes immutable once there is a lot of content using it since user agents would be compelled to support the existing name for compat with existing content ;-) I would agree with this; the name in the specification should be whatever makes most sense, not the name that has been used in a draft implementation. I also think it's perfectly reasonable to expect content using the draft API to be updated over a relatively short period of time (meaning I would not expect Firefox to handle both names for more than a year or two).
Re: Updates to File API
Hi Arun, On May 13, 2010, at 14:27 , Arun Ranganathan wrote: I have updated the editor's draft of the File API to reflect changes that have been in discussion. Cool, thanks! ArrayBuffers, and affiliated Typed Array views of data, are specified in a working draft as a part of the WebGL work [1]. This work has been proposed to ECMA's TC-39 WG as well. We intend to implement some of this in the Firefox 4 timeframe, and have reason to believe other browsers will as well. I have thus cited the work as a normative reference [1] The TA draft doesn't include any copyright or licensing information. I take it that the plan is to eventually have it at some stable URL accessible to all and under an RF license? 3. The renaming of the property to 'url' also suggests that we should cease to consider an urn:uuid scheme. I'm not sure that one follows from the other. The property's called 'url' because that's what will be familiar to authors, but the magic string that goes inside of it could still be a URN. I solicited implementer feedback about URLs vs. URNs in general. There was a general preference to URLs[5], though this wasn't a strong preference. Moreover, Mozilla's implementation currently uses moz-filedata: . The current draft has an editor's note about the use of HTTP semantics, and origin issues in the context of shared workers. This is work in progress; I have removed the section specifying urn:uuid and hope to have an update with a section covering the filedata: scheme (with filedata:uuid as a suggestion). I welcome discussion about this. I'll point out that we are coining a new scheme, which we originally sought to avoid :-) I don't really have a strong preference, but I believe that registering a URN namespace (in the case where we would go for urn:file-data: instead of urn:uuid:) is easier than registering a URI scheme. 
Since I have a strong feeling that you'll be the one who'll end up doing that work, you might want to take that into consideration ;-) Implementation-wise I can see how some might have the plumbing in place to dispatch depending on URI schemes but not for URNs. Unless someone has a strong feeling (i.e. not bikeshedding) on this I would suggest closing this issue and leaving it up to the editor. Is using a subset of HTTP response codes acceptable practice, or should we forgo response codes in this specification? That seems to risk getting you close to specifying the behaviour of file: :) The problem of forgoing response codes is that it breaks a number of libraries. For instance (IIRC), the following never calls you back: $.get("file:///foo.html", cb) because jQuery never detects a successful fetch of the file (even though the underlying XHR may have succeeded); the same would apply to filedata:. A subset of HTTP has the downside that it should ideally be consistent. Maybe that can be done with brutality? Reject any method other than GET with a 405, return 400 for any header the author sets that involves conditional or negotiated responses, 404 if the URI doesn't exist, and 200 if it does. Only set the response headers that match information already exposed on the Blob. Editorial note Issue: if it is determined that the type attribute is one of text/html, text/xml, or application/xml then the specification should allow HTML5 [HTML5] parsing (creation of Document) or XML parsing specified in XML specifications. Should there be normative text for this? I'm not sure I follow the intent exactly here, do you mean adding something like readAsDocument()? That sounds nice (and ought to work for +xml types as well) but is it essential enough? -- Robin Berjon - http://berjon.com/
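The "brutality" rules proposed above (405 for anything but GET, 400 for conditional or negotiated requests, 404 for unknown URIs, 200 otherwise) are mechanical enough to sketch. This is an illustration only: `respond`, the request shape, and the `blobs` map are hypothetical, and the exact list of headers treated as "conditional or negotiated" is an assumption, since the message does not enumerate them.

```javascript
// Assumption: which request headers count as "conditional or negotiated".
const REJECTED_HEADERS = new Set([
  "if-match", "if-none-match", "if-modified-since",
  "if-unmodified-since", "if-range",
  "accept", "accept-language", "accept-encoding", "accept-charset",
]);

// Sketch only: `request` is { method, url, headers } and `blobs` is a Map
// from URL string to { type, data }; both shapes are hypothetical.
function respond(request, blobs) {
  if (request.method !== "GET") return { status: 405, headers: {} };

  for (const name of Object.keys(request.headers || {})) {
    if (REJECTED_HEADERS.has(name.toLowerCase())) {
      return { status: 400, headers: {} };
    }
  }

  const blob = blobs.get(request.url);
  if (!blob) return { status: 404, headers: {} };

  // Only expose what the Blob itself already exposes: its type.
  return {
    status: 200,
    headers: { "Content-Type": blob.type },
    body: blob.data,
  };
}
```

Keeping the subset this small is what would make it feasible to implement consistently across user agents, which is the concern raised in the message.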
Re: Updates to File API
Robin, ArrayBuffers, and affiliated Typed Array views of data, are specified in a working draft as a part of the WebGL work [1]. This work has been proposed to ECMA's TC-39 WG as well. We intend to implement some of this in the Firefox 4 timeframe, and have reason to believe other browsers will as well. I have thus cited the work as a normative reference [1] The TA draft doesn't include any copyright or licensing information. I take it that the plan is to eventually have it at some stable URL accessible to all and under an RF license? Yes! Technical details are currently being hashed out on es-disc...@mozilla.org (the general ECMAScript discussion forum). I expect it to have a more formal home, and of course, an RF license. This is something we should fix in the short term as well, and I'll raise this through the WebGL WG. 3. The renaming of the property to 'url' also suggests that we should cease to consider an urn:uuid scheme. I'm not sure that one follows from the other. The property's called 'url' because that's what will be familiar to authors, but the magic string that goes inside of it could still be a URN. I agree that this is probably workable. (And thanks for commenting on this issue :-) ) I don't really have a strong preference, but I believe that registering a URN namespace (in the case where we would go for urn:file-data: instead of urn:uuid:) is easier than registering a URI scheme. Since I have a strong feeling that you'll be the one who'll end up doing that work, you might want to take that into consideration ;-) If we do go with a URN for the .url property, then I'm not sure what benefit is gained from registering a new URN namespace (since we could use urn:uuid:). One advantage of using urn:uuid was that the new technology overhead was low. At the moment, I'm torn on this, but I'll note that implementations are proceeding with what looks like a new scheme (or at least what could be a new URN namespace). 
Again, implementor feedback is welcome, but the point you make below is what I think is true for other implementations (but not necessarily Firefox): Implementation-wise I can see how some might have the plumbing in place to dispatch depending on URI schemes but not for URNs. +1 (again, not true of Firefox, where it doesn't really make a difference). Unless someone has a strong feeling (i.e. not bikeshedding) on this I would suggest closing this issue and leaving it up to the editor. Thanks :) Is using a subset of HTTP response codes acceptable practice, or should we forgo response codes in this specification? That seems to risk getting you close to specifying the behaviour of file: :) The problem of forgoing response codes is that it breaks a number of libraries. For instance (IIRC), the following never calls you back: $.get("file:///foo.html", cb) because jQuery never detects a successful fetch of the file (even though the underlying XHR may have succeeded); the same would apply to filedata:. A subset of HTTP has the downside that it should ideally be consistent. Maybe that can be done with brutality? Reject any method other than GET with a 405, return 400 for any header the author sets that involves conditional or negotiated responses, 404 if the URI doesn't exist, and 200 if it does. Only set the response headers that match information already exposed on the Blob. All good suggestions. I think the subset can be determined by researching what XHR is used for within file:///, which is how I'm currently proceeding. I agree with GET + strict subset of responses. Information set on Blob is likely to only include Content-Type (for now). Editorial note Issue: if it is determined that the type attribute is one of text/html, text/xml, or application/xml then the specification should allow HTML5 [HTML5] parsing (creation of Document) or XML parsing specified in XML specifications. Should there be normative text for this?
I'm not sure I follow the intent exactly here, do you mean adding something like readAsDocument()? That sounds nice (and ought to work for +xml types as well) but is it essential enough? User-agents do type determination on files, and if it is discovered that the file in question (or Blob) is an HTML file or an XML file, we should probably follow those rules. I don't think we need a readAsDocument( ), since readAsText, which gives you a string, might be enough. -- A*
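A minimal sketch of the "GET + strict subset of responses" idea discussed above, assuming a hypothetical in-memory store that maps filedata: URLs to Blob-like objects. The function name, the store, and the exact list of rejected request headers are illustrative assumptions; none of them come from the draft.

```javascript
// Hypothetical sketch; these names are not part of the File API draft.
// Which headers count as "conditional or negotiated" is an assumption here.
const REJECTED_REQUEST_HEADERS = new Set([
  "if-match",
  "if-none-match",
  "if-modified-since",
  "if-unmodified-since",
  "if-range",
  "accept",
  "accept-charset",
  "accept-encoding",
  "accept-language",
]);

// store: Map from URL string to a Blob-like object with a `type` property.
function resolveBlobRequest(store, method, url, authorHeaders = {}) {
  // Anything other than GET is rejected outright.
  if (method.toUpperCase() !== "GET") {
    return { status: 405, headers: {} };
  }
  // Author-set conditional or negotiation headers are rejected.
  for (const name of Object.keys(authorHeaders)) {
    if (REJECTED_REQUEST_HEADERS.has(name.toLowerCase())) {
      return { status: 400, headers: {} };
    }
  }
  const blob = store.get(url);
  if (blob === undefined) {
    return { status: 404, headers: {} };
  }
  // Only expose information already visible on the Blob (its type, for now).
  return { status: 200, headers: { "Content-Type": blob.type }, body: blob };
}
```

The checks run in the order the thread lists them (405, 400, 404, 200), so a library that inspects the status code gets a consistent answer for every request.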
Re: Updates to File API
On Wed, May 19, 2010 at 1:09 PM, Arun Ranganathan a...@mozilla.com wrote: 3. The renaming of the property to 'url' also suggests that we should cease to consider an urn:uuid scheme. I'm not sure that one follows from the other. The property's called 'url' because that's what will be familiar to authors, but the magic string that goes inside of it could still be a URN. I agree that this is probably workable. (And thanks for commenting on this issue :-) ) I agree with Robin. We should definitely not get into defining things with paths and stuff. I don't have a strong opinion about what the scheme should be, but we definitely want it to be some sort of unique identifier plus a prefix. I don't really have a strong preference, but I believe that registering a URN namespace (in the case where we would go for urn:file-data: instead of urn:uuid:) is easier than registering a URI scheme. Since I have a strong feeling that you'll be the one who'll end up doing that work, you might want to take that into consideration ;-) If we do go with a URN for the .url property, then I'm not sure what benefit is gained from registering a new URN namespace (since we could use urn:uuid:). One advantage of using urn:uuid was that the new technology overhead was low. At the moment, I'm torn on this, but I'll note that implementations are proceeding with what looks like a new scheme (or at least what could be a new URN namespace). Again, implementor feedback is welcome, but the point you make below is what I think is true for other implementations (but not necessarily Firefox): For what it's worth, implementing a new scheme would be easier in firefox too. However I don't care strongly as either solution is still implementable. Implementation-wise I can see how some might have the plumbing in place to dispatch depending on URI schemes but not for URNs. +1 (again, not true of Firefox, where it doesn't really make a difference). See above. Unless someone has a strong feeling (i.e. 
not bikeshedding) on this I would suggest closing this issue and leaving it up to the editor. I agree. / Jonas
Re: Updates to File API
On Mon, May 17, 2010 at 3:37 PM, Dmitry Titov dim...@chromium.org wrote: I have a couple of questions, mostly clarifications I think: 1. FileReader takes Blob but there are multiple hints that the blob should be actually a 'file'. As we see, the Blob concept grows in popularity with such specs as FileWriter that defines BlobBuilder. Other proposals include Image Resizing that returns a Blob with compressed image data. Can all types of Blobs be 'read' using FileReader? If not, then it would be logical to only accept a File parameter. If any type of Blob can be read (as I think the spirit of the spec assumes) then would it be less confusing to change the name to BlobReader? I'd support that. I think we always want to allow reading of any type of Blob--it's the interchange container--so calling it BlobReader makes sense. Arun, how do you feel about that? Would FileWriter ever be used to write anything other than a File? I think not, so it should probably stay as it is, despite the lack of symmetry. 2. The FileReader.result is a string. There could be cases where it would be useful to read the data as an ArrayBuffer. For example, if a page tries to crack the JPG file to extract the EXIF metadata. Maybe returning a Blob that can later be asked for an ArrayBuffer would be as good. You're going to give a Blob to a FileReader, and get the same Blob back? Dmitry On Fri, May 14, 2010 at 11:52 AM, Arun Ranganathan a...@mozilla.com wrote: On 5/13/10 9:32 PM, Darin Fisher wrote: Glad to hear that you didn't intend sync access :-) I have thoughts on Blob and how it should behave (and about the inheritance relationship between Blob and File), which is why I left the unfortunate error in the editor's draft for now (commented out and caveated). This is the subject of a separate email thread (but don't worry -- while my thoughts on Blob and ArrayBuffer may be in some flux, sync access to File objects is *always* going to be a no-no, I promise :-) ).
Now aside from the Blob - ArrayBuffer relationship, which I introduced, the rest of the changes are in keeping with threads discussing the File API. Can you define the contentType parameter to slice better? Is that intended to correspond to the value of a HTTP Content-Type response header? For example, can the contentType value include a charset attribute? It might be useful to indicate that a slice of a file should be treated as text/html with a specific encoding. I'm happy to define it better in terms of what it *should* be, but web developers are likely to use it in ways that we can't predict, which is why forcing Content-Types is useful, but weird. What exactly do you mean when you say that a slice of a file should be treated as text/html with a specific encoding? Can you give me a use case that illustrates why this is a good way to define this? I'm also a fan of providing a way to specify optional Content-Disposition parameters in the slice call. So I'm really not a Content-Disposition fan, since all the use cases I've seen so far seem to be to force download behavior (or trigger Download Manager). Is there something I'm missing -- e.g. is there something here that FileWriter or BlobBuilder do *not* address, that putting Content-Disposition on Blob URLs *does* address? Sorry if I'm missing something obvious. -- A*
Re: Updates to File API
On Fri, May 14, 2010 at 11:52 AM, Arun Ranganathan a...@mozilla.com wrote: On 5/13/10 9:32 PM, Darin Fisher wrote: Glad to hear that you didn't intend sync access :-) I have thoughts on Blob and how it should behave (and about the inheritance relationship between Blob and File), which is why I left the unfortunate error in the editor's draft for now (commented out and caveated). This is the subject of a separate email thread (but don't worry -- while my thoughts on Blob and ArrayBuffer may be in some flux, sync access to File objects is *always* going to be a no-no, I promise :-) ). Now aside from the Blob - ArrayBuffer relationship, which I introduced, the rest of the changes are in keeping with threads discussing the File API. Can you define the contentType parameter to slice better? Is that intended to correspond to the value of a HTTP Content-Type response header? For example, can the contentType value include a charset attribute? It might be useful to indicate that a slice of a file should be treated as text/html with a specific encoding. I'm happy to define it better in terms of what it *should* be, but web developers are likely to use it in ways that we can't predict, which is why forcing Content-Types is useful, but weird. What exactly do you mean when you say that a slice of a file should be treated as text/html with a specific encoding? Can you give me a use case that illustrates why this is a good way to define this? I can't speak for Darin, but I'd think the same reasoning that applies whenever a server adds those headers via HTTP should apply whenever a client-side app wants to add them to a Blob.url. I'm also a fan of providing a way to specify optional Content-Disposition parameters in the slice call. So I'm really not a Content-Disposition fan, since all the use cases I've seen so far seem to be to force download behavior (or trigger Download Manager). Is there something I'm missing -- e.g.
is there something here that FileWriter or BlobBuilder do *not* address, that putting Content-Disposition on Blob URLs *does* address? Sorry if I'm missing something obvious. It is indeed generally intended to trigger Download Manager. If you take a look at my use case at [1], the idea is to give web developers a facility that's just like the one they're already using, so that anything they do with URLs for files online they can also do with URLs for Blobs offline/client-side. The FileWriter spec's a bit up in the air over the same issue; I haven't yet specced a good way for FileWriter to solve this problem, so it's hard to say it's going to handle it better. Eric [1] http://lists.w3.org/Archives/Public/public-webapps/2010AprJun/0412.html
Re: Updates to File API
On 5/18/10 2:35 PM, Eric Uhrhane wrote: On Mon, May 17, 2010 at 3:37 PM, Dmitry Titov dim...@chromium.org wrote: I have a couple of questions, mostly clarifications I think: 1. FileReader takes Blob but there are multiple hints that the blob should be actually a 'file'. As we see, the Blob concept grows in popularity with such specs as FileWriter that defines BlobBuilder. Other proposals include Image Resizing that returns a Blob with compressed image data. Can all types of Blobs be 'read' using FileReader? If not, then it would be logical to only accept a File parameter. If any type of Blob can be read (as I think the spirit of the spec assumes) then would it be less confusing to change the name to BlobReader? I'd support that. I think we always want to allow reading of any type of Blob--it's the interchange container--so calling it BlobReader makes sense. Arun, how do you feel about that? The FileReader object accepts File objects for DataURL-reads, and Blob objects for binary string, text, and binary reads. I agree that having a name like FileReader is generally a bit confusing, given that we do allow Blobs to be read, including Blobs which aren't directly coined from files. Blob itself isn't a great name, though it's a stand-in for Binary Large Object. Aside from the slight bikeshed-ish nature of this discussion, there are implementations in circulation that already use the name FileReader (e.g. Firefox 3.6.3). This doesn't mean I'm against changing it, but I do wish the name change suggestion had come earlier. Also, I'm keen that the main object name addresses the initial use case -- reading file objects. Perhaps in the future Blobs that are not files will be the norm; maybe then, Blob APIs will evolve, including implementations with ArrayBuffer and potential streaming use cases getting addressed better. Perhaps it is late to have a name change, and we've added to less-than-adequate naming on the Web (example: XMLHttpRequest).
Would FileWriter ever be used to write anything other than a File? I think not, so it should probably stay as it is, despite the lack of symmetry. 2. The FileReader.result is a string. Actually, in my next draft, I will have FileReader.result be of type 'any' (WebIDL's 'any') since it could also be an ArrayBuffer (using the readAsBinary method, which will function like the other asynchronous read methods, but read into ArrayBuffers across the ProgressEvent spectrum). -- A*
Re: Updates to File API
On Tue, May 18, 2010 at 2:56 PM, Arun Ranganathan a...@mozilla.com wrote: On 5/18/10 2:35 PM, Eric Uhrhane wrote: On Mon, May 17, 2010 at 3:37 PM, Dmitry Titov dim...@chromium.org wrote: I have a couple of questions, mostly clarifications I think: 1. FileReader takes Blob but there are multiple hints that the blob should be actually a 'file'. As we see, the Blob concept grows in popularity with such specs as FileWriter that defines BlobBuilder. Other proposals include Image Resizing that returns a Blob with compressed image data. Can all types of Blobs be 'read' using FileReader? If not, then it would be logical to only accept a File parameter. If any type of Blob can be read (as I think the spirit of the spec assumes) then would it be less confusing to change the name to BlobReader? I'd support that. I think we always want to allow reading of any type of Blob--it's the interchange container--so calling it BlobReader makes sense. Arun, how do you feel about that? The FileReader object accepts File objects for DataURL-reads, and Blob objects for binary string, text, and binary reads. I agree that having a name like FileReader is generally a bit confusing, given that we do allow Blobs to be read, including Blobs which aren't directly coined from files. Blob itself isn't a great name, though it's a stand-in for Binary Large Object. Aside from the slight bikeshed-ish nature of this discussion, there are implementations in circulation that already use the name FileReader (e.g. Firefox 3.6.3). This doesn't mean I'm against changing it, but I do wish the name change suggestion had come earlier. Also, I'm keen that the main object name addresses the initial use case -- reading file objects. Perhaps in the future Blobs that are not files will be the norm; maybe then, Blob APIs will evolve, including implementations with ArrayBuffer and potential streaming use cases getting addressed better.
Perhaps it is late to have a name change, and we've added to less-than-adequate naming on the Web (example: XMLHttpRequest). Ok, I can see how it can be late if FF already shipped it... Perhaps the spec could at least avoid using 'fileBlob' as names of arguments, since the naming currently may be interpreted as if only file-backed blobs are welcome :-) Would FileWriter ever be used to write anything other than a File? I think not, so it should probably stay as it is, despite the lack of symmetry. 2. The FileReader.result is a string. Actually, in my next draft, I will have FileReader.result be of type 'any' (WebIDL's 'any') since it could also be an ArrayBuffer (using the readAsBinary method, which will function like the other asynchronous read methods, but read into ArrayBuffers across the ProgressEvent spectrum). Getting an ArrayBuffer on each ProgressEvent could be a cool idea indeed. I guess when we have ArrayBuffers we'll be able to use them in BlobBuilder as well. -- A*
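To illustrate the EXIF use case raised in this thread, here is a rough sketch of what a page might do once a read hands it an ArrayBuffer. It uses the standard DataView type; the function name findExifSegment is hypothetical, and the parsing is simplified (it only walks length-prefixed segments that appear before the start-of-scan marker).

```javascript
// Hypothetical helper: locate the Exif payload inside JPEG bytes.
// Returns { offset, length } of the APP1 payload, or null if none is found.
function findExifSegment(buffer) {
  const view = new DataView(buffer);
  // A JPEG stream starts with the SOI marker 0xFFD8.
  if (view.byteLength < 4 || view.getUint16(0) !== 0xffd8) {
    return null;
  }
  let offset = 2;
  while (offset + 4 <= view.byteLength) {
    const marker = view.getUint16(offset);
    // Stop on a malformed marker or when image data (SOS) begins.
    if ((marker & 0xff00) !== 0xff00 || marker === 0xffda) {
      return null;
    }
    // Big-endian segment length; it counts the two length bytes themselves.
    const length = view.getUint16(offset + 2);
    if (length < 2) {
      return null;
    }
    const isApp1 = marker === 0xffe1;
    const hasRoomForTag = length >= 8 && offset + 10 <= view.byteLength;
    if (
      isApp1 &&
      hasRoomForTag &&
      view.getUint32(offset + 4) === 0x45786966 && // ASCII "Exif"
      view.getUint16(offset + 8) === 0 // two NUL bytes after the tag
    ) {
      return { offset: offset + 4, length: length - 2 };
    }
    offset += 2 + length;
  }
  return null;
}
```

With string-only results the same scan would need charCodeAt calls on a binary string, which is part of why an ArrayBuffer result is attractive for this kind of work.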
Re: Updates to File API
On 5/13/10 9:32 PM, Darin Fisher wrote: Glad to hear that you didn't intend sync access :-) I have thoughts on Blob and how it should behave (and about the inheritance relationship between Blob and File), which is why I left the unfortunate error in the editor's draft for now (commented out and caveated). This is the subject of a separate email thread (but don't worry -- while my thoughts on Blob and ArrayBuffer may be in some flux, sync access to File objects is *always* going to be a no-no, I promise :-) ). Now aside from the Blob - ArrayBuffer relationship, which I introduced, the rest of the changes are in keeping with threads discussing the File API. Can you define the contentType parameter to slice better? Is that intended to correspond to the value of a HTTP Content-Type response header? For example, can the contentType value include a charset attribute? It might be useful to indicate that a slice of a file should be treated as text/html with a specific encoding. I'm happy to define it better in terms of what it *should* be, but web developers are likely to use it in ways that we can't predict, which is why forcing Content-Types is useful, but weird. What exactly do you mean when you say that a slice of a file should be treated as text/html with a specific encoding? Can you give me a use case that illustrates why this is a good way to define this? I'm also a fan of providing a way to specify optional Content-Disposition parameters in the slice call. So I'm really not a Content-Disposition fan, since all the use cases I've seen so far seem to be to force download behavior (or trigger Download Manager). Is there something I'm missing -- e.g. is there something here that FileWriter or BlobBuilder do *not* address, that putting Content-Disposition on Blob URLs *does* address? Sorry if I'm missing something obvious. -- A*
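As a rough illustration of the question about a contentType value carrying a charset, the sketch below splits a Content-Type-style string into a media type and an optional charset. The function name is hypothetical, and the parsing is deliberately naive: it splits on every ';', so quoted values containing semicolons are not handled. It is not what the draft specifies for slice.

```javascript
// Hypothetical helper: split "type/subtype; charset=..." into its parts.
function parseContentType(value) {
  const [essence, ...params] = value.split(";").map((part) => part.trim());
  const result = { type: essence.toLowerCase(), charset: null };
  for (const param of params) {
    const eq = param.indexOf("=");
    if (eq === -1) {
      continue; // ignore parameters without a value
    }
    const name = param.slice(0, eq).trim().toLowerCase();
    let paramValue = param.slice(eq + 1).trim();
    // Strip one pair of surrounding double quotes, if present.
    if (paramValue.length >= 2 && paramValue.startsWith('"') && paramValue.endsWith('"')) {
      paramValue = paramValue.slice(1, -1);
    }
    if (name === "charset") {
      result.charset = paramValue.toLowerCase();
    }
  }
  return result;
}
```

If slice accepted such a value, a consumer of Blob.url could decode the bytes with the stated encoding, much as it would for an HTTP response carrying the same header.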
Updates to File API
Greetings WebApps WG,

I have updated the editor's draft of the File API to reflect changes that have been in discussion.

http://dev.w3.org/2006/webapi/FileAPI

Notably:

1. Blobs now allow further binary data operations by exposing an ArrayBuffer property that represents the Blob. ArrayBuffers, and affiliated Typed Array views of data, are specified in a working draft as a part of the WebGL work [1]. This work has been proposed to ECMA's TC-39 WG as well. We intend to implement some of this in the Firefox 4 timeframe, and have reason to believe other browsers will as well. I have thus cited the work as a normative reference [1]. Eventually, we ought to consider further read operations given ArrayBuffers, but for now, I believe exposing Blobs in this way is sufficient.

2. url and type properties have been moved to the underlying Blob interface. Notably, the property is now called 'url' and not 'urn.' Use cases for triggering 'save as' behavior with Content-Disposition have not been addressed[2], although I believe that with FileWriter and BlobBuilder[3] they may be addressed differently. This change reflects lengthy discussion (e.g. start here[4]).

3. The renaming of the property to 'url' also suggests that we should cease to consider an urn:uuid scheme. I solicited implementer feedback about URLs vs. URNs in general. There was a general preference for URLs[5], though this wasn't a strong preference. Moreover, Mozilla's implementation currently uses moz-filedata: . The current draft has an editor's note about the use of HTTP semantics, and origin issues in the context of shared workers. This is work in progress; I have removed the section specifying urn:uuid and hope to have an update with a section covering the filedata: scheme (with filedata:uuid as a suggestion). I welcome discussion about this. I'll point out that we are coining a new scheme, which we originally sought to avoid :-)

4. I have changed event order; loadend now fires after an error event [6].
-- A*

[1] https://cvs.khronos.org/svn/repos/registry/trunk/public/webgl/doc/spec/TypedArray-spec.html
[2] http://www.mail-archive.com/public-webapps@w3.org/msg06137.html
[3] http://dev.w3.org/2009/dap/file-system/file-writer.html
[4] http://lists.w3.org/Archives/Public/public-webapps/2010JanMar/0910.html
[5] http://lists.w3.org/Archives/Public/public-webapps/2009OctDec/0462.html
[6] http://lists.w3.org/Archives/Public/public-webapps/2010AprJun/0062.html
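One way to picture the "filedata: plus unique identifier" suggestion in point 3 is a small registry that mints a URL for each Blob and resolves it later. This is only a sketch under stated assumptions: createBlobUrlRegistry, register, and lookup are hypothetical names, and the identifier is a UUID-shaped string built from Math.random, which is not cryptographically strong and is used here only to keep the example self-contained.

```javascript
// Hypothetical registry mapping "filedata:<id>" strings to Blob-like objects.
function createBlobUrlRegistry(prefix = "filedata:") {
  const entries = new Map();

  // UUID-shaped identifier from Math.random; illustrative only.
  function randomId() {
    return "xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx".replace(/[xy]/g, (c) => {
      const r = Math.floor(Math.random() * 16);
      const v = c === "x" ? r : (r & 0x3) | 0x8;
      return v.toString(16);
    });
  }

  return {
    // Mint a new URL for the blob and remember the association.
    register(blob) {
      const url = prefix + randomId();
      entries.set(url, blob);
      return url;
    },
    // Resolve a URL; undefined means "no such resource".
    lookup(url) {
      return entries.get(url);
    },
  };
}
```

Because the identifier carries no path or origin information, any origin or lifetime checks would have to live in the registry itself, which is one of the trade-offs discussed elsewhere in this thread.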
Re: Updates to File API
On 5/13/10 7:37 AM, David Levin wrote: On Thu, May 13, 2010 at 5:27 AM, Arun Ranganathan a...@mozilla.com wrote: Greetings WebApps WG, I have updated the editor's draft of the File API to reflect changes that have been in discussion. http://dev.w3.org/2006/webapi/FileAPI Notably: 1. Blobs now allow further binary data operations by exposing an ArrayBuffer property that represents the Blob. Does this imply *sync* access to the blob data? new DataArray(blob.blobBuffer).getInt8(0); Sync access to a Blob shouldn't be allowed; this is a *big* oversight on my part, and I think how the property is exposed should be considered more carefully. Also, does it imply the ability to modify the blob contents? (If so, what does this mean when there is a file backing it?) new DataArray(blob.blobBuffer).setInt8(0, 0); This is part of the same oversight (evident in the editor's draft). I think this aspect of things should be left to BlobBuilder or FileWriter. -- A* Thanks, dave
Re: Updates to File API
On 5/13/10 7:37 AM, David Levin wrote: On Thu, May 13, 2010 at 5:27 AM, Arun Ranganathan a...@mozilla.com wrote: Greetings WebApps WG, I have updated the editor's draft of the File API to reflect changes that have been in discussion. http://dev.w3.org/2006/webapi/FileAPI Notably: 1. Blobs now allow further binary data operations by exposing an ArrayBuffer property that represents the Blob. Does this imply *sync* access to the blob data? new DataArray(blob.blobBuffer).getInt8(0); A more sensible way is an additional asynchronous read method on FileReader, which is what I should have done in the first place. Here, partial data is going to be an interesting question. While partial strings make sense (for readAsBinaryString and readAsText), partial ArrayBuffers get us into a different area altogether. Any thoughts on partial reads here? For now, I've caveated my (pretty major) mistake with an editor's note. I'll update later today with a better way to expose this, but I'm thinking something like readAsArrayBuffer on FileReader (with an open question on partial reads). Also, does it imply the ability to modify the blob contents? (If so, what does this mean when there is a file backing it?) new DataArray(blob.blobBuffer).setInt8(0, 0); I'll let Eric speak to what BlobBuilder might want to do, but I'll strongly disallow it in my draft :) -- A* Thanks, dave
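The two concerns above (no synchronous access, no writes reaching the backing file) can be pictured with a small stand-in. This is a sketch, not the FileReader API: readAsArrayBufferSketch is a hypothetical function, a Uint8Array plays the role of the backing store, and setTimeout stands in for real asynchronous I/O.

```javascript
// Hypothetical stand-in for an asynchronous read into an ArrayBuffer.
// backingBytes: Uint8Array playing the role of the file-backed data.
// onload: callback that receives an ArrayBuffer once the "read" completes.
function readAsArrayBufferSketch(backingBytes, onload) {
  setTimeout(() => {
    // ArrayBuffer.prototype.slice copies the bytes, so writes made by the
    // caller land in the copy and never reach the backing store.
    const copy = backingBytes.buffer.slice(
      backingBytes.byteOffset,
      backingBytes.byteOffset + backingBytes.byteLength
    );
    onload(copy);
  }, 0);
}

// Usage: the callback runs later, and setInt8 only touches the copy.
const exampleBytes = new Uint8Array([7, 8, 9]);
readAsArrayBufferSketch(exampleBytes, (buffer) => {
  const view = new DataView(buffer);
  view.setInt8(0, 0); // exampleBytes[0] is still 7 afterwards
});
```

Handing out a copy is only one possible answer to the modification question; the open question about partial reads is not addressed by this sketch.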
Re: Updates to File API
On 13 May 2010, at 13:27, Arun Ranganathan wrote: Greetings WebApps WG, I have updated the editor's draft of the File API to reflect changes that have been in discussion. http://dev.w3.org/2006/webapi/FileAPI Notably: 1. Blobs now allow further binary data operations by exposing an ArrayBuffer property that represents the Blob. ArrayBuffers, and affiliated Typed Array views of data, are specified in a working draft as a part of the WebGL work [1]. This work has been proposed to ECMA's TC-39 WG as well. We intend to implement some of this in the Firefox 4 timeframe, and have reason to believe other browsers will as well. I have thus cited the work as a normative reference [1]. Eventually, we ought to consider further read operations given ArrayBuffers, but for now, I believe exposing Blobs in this way is sufficient. Why remove the 'type' attribute from the File? Specifically, is there a real issue with duplicating the information in both the File and the Blob? Two main concerns: Without 'type' in the File attribute, you have to read the file to understand what's in it. This means that if you want to, for example, produce a confirmation dialogue for the user before reading a file, it's very limited in how much information it can show (I also think it would be a good idea to have 'size' as an attribute on the File, for related reasons). At the moment, if a directory is dragged and dropped into Firefox, the only way to spot this appears to be the 'type' attribute (which is empty). Unless I'm missing something, as written this would appear to mean the JS has to try reading a directory, get a Blob back (which Firefox does do, at least) and then it can find out it didn't actually read a file. Looking at synchronized file reading... would it perhaps make more sense to have readAsBinaryString(), readAsText() and readAsDataURL() as methods on the File, rather than a specific separate interface (FileReaderSync)?
Re: Updates to File API
On Thu, May 13, 2010 at 1:50 PM, J Ross Nicoll j...@jrn.me.uk wrote: On 13 May 2010, at 13:27, Arun Ranganathan wrote: Greetings WebApps WG, I have updated the editor's draft of the File API to reflect changes that have been in discussion. http://dev.w3.org/2006/webapi/FileAPI Notably: 1. Blobs now allow further binary data operations by exposing an ArrayBuffer property that represents the Blob. ArrayBuffers, and affiliated Typed Array views of data, are specified in a working draft as a part of the WebGL work [1]. This work has been proposed to ECMA's TC-39 WG as well. We intend to implement some of this in the Firefox 4 timeframe, and have reason to believe other browsers will as well. I have thus cited the work as a normative reference [1]. Eventually, we ought to consider further read operations given ArrayBuffers, but for now, I believe exposing Blobs in this way is sufficient. Why remove the 'type' attribute from the File? Specifically, is there a real issue with duplicating the information in both the File and the Blob? Two main concerns: File inherits Blob, so everything that is available on Blob is available on File. This is similar to how HTMLElement inherits Element. getAttribute is available on HTMLElement, despite being defined on Element. / Jonas
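The inheritance point can be sketched with two stand-in classes. SketchBlob, SketchFile, and describeForConfirmation are hypothetical names used only for illustration; they are not the interfaces from the draft, but they show how attributes defined on the parent remain available on the child without reading any file contents.

```javascript
// Hypothetical stand-ins for the Blob and File interfaces.
class SketchBlob {
  constructor(bytes, type = "") {
    this.bytes = bytes;
    this.type = type; // defined once, on the parent
  }
  get size() {
    return this.bytes.length; // also defined on the parent
  }
}

class SketchFile extends SketchBlob {
  constructor(bytes, name, type = "") {
    super(bytes, type);
    this.name = name; // the File-specific addition
  }
}

// Builds confirmation text from metadata only; no contents are read.
function describeForConfirmation(file) {
  return `${file.name} (${file.type || "unknown type"}, ${file.size} bytes)`;
}
```

A confirmation dialogue of the kind asked about earlier in the thread could therefore use type and size directly from the file object, even though both are declared on the parent.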
Re: Updates to File API
On 5/13/10 1:50 PM, J Ross Nicoll wrote: On 13 May 2010, at 13:27, Arun Ranganathan wrote: Greetings WebApps WG, I have updated the editor's draft of the File API to reflect changes that have been in discussion. http://dev.w3.org/2006/webapi/FileAPI Notably: 1. Blobs now allow further binary data operations by exposing an ArrayBuffer property that represents the Blob. ArrayBuffers, and affiliated Typed Array views of data, are specified in a working draft as a part of the WebGL work [1]. This work has been proposed to ECMA's TC-39 WG as well. We intend to implement some of this in the Firefox 4 timeframe, and have reason to believe other browsers will as well. I have thus cited the work as a normative reference [1]. Eventually, we ought to consider further read operations given ArrayBuffers, but for now, I believe exposing Blobs in this way is sufficient. Why remove the 'type' attribute from the File? Specifically, is there a real issue with duplicating the information in both the File and the Blob? Two main concerns: Without 'type' in the File attribute, you have to read the file to understand what's in it. This means that if you want to, for example, produce a confirmation dialogue for the user before reading a file, it's very limited in how much information it can show (I also think it would be a good idea to have 'size' as an attribute on the File, for related reasons). At the moment, if a directory is dragged and dropped into Firefox, the only way to spot this appears to be the 'type' attribute (which is empty). Unless I'm missing something, as written this would appear to mean the JS has to try reading a directory, get a Blob back (which Firefox does do, at least) and then it can find out it didn't actually read a file. Currently, File inherits from Blob. Looking at synchronized file reading...
would it perhaps make more sense to have readAsBinaryString(), readAsText() and readAsDataURL() as methods on the File, rather than a specific separate interface (FileReaderSync)? FileReader is for asynchronous reads on the main thread. FileReaderSync is for synchronous reads on worker threads. We want to: 1. Decouple Files from the objects that read from them and 2. Disallow any synchronous File I/O on the main thread. -- A*
Re: Updates to File API
Glad to hear that you didn't intend sync access :-) Can you define the contentType parameter to slice better? Is that intended to correspond to the value of a HTTP Content-Type response header? For example, can the contentType value include a charset attribute? It might be useful to indicate that a slice of a file should be treated as text/html with a specific encoding. I'm also a fan of providing a way to specify optional Content-Disposition parameters in the slice call. It seems to me that Content-Disposition, like Content-Type, impacts the way that Blob.url might be interpreted. It is useful to enable Blob.url to be able to replicate what you can do with http:// URLs. I think this would make it easier for apps to use http:// URLs while online and Blob.url while offline without changing the rest of their code. I'm specifically thinking of use cases like the download links for attachments in webmail apps. Regards, -Darin On Thu, May 13, 2010 at 8:21 AM, Arun Ranganathan a...@mozilla.com wrote: On 5/13/10 7:37 AM, David Levin wrote: On Thu, May 13, 2010 at 5:27 AM, Arun Ranganathan a...@mozilla.com wrote: Greetings WebApps WG, I have updated the editor's draft of the File API to reflect changes that have been in discussion. http://dev.w3.org/2006/webapi/FileAPI Notably: 1. Blobs now allow further binary data operations by exposing an ArrayBuffer property that represents the Blob. Does this imply *sync* access to the blob data? new DataArray(blob.blobBuffer).getInt8(0); A more sensible way is an additional asynchronous read method on FileReader, which is what I should have done in the first place. Here, partial data is going to be an interesting question. While partial strings make sense (for readAsBinaryString and readAsText), partial ArrayBuffers get us into a different area altogether. Any thoughts on partial reads here? For now, I've caveated my (pretty major) mistake with an editor's note.
I'll update later today with a better way to expose this, but I'm thinking something like readAsArrayBuffer on FileReader (with an open question on partial reads). Also, does it imply the ability to modify the blob contents? (If so, what does this mean when there is a file backing it?) new DataArray(blob.blobBuffer).setInt8(0, 0); I'll let Eric speak to what BlobBuilder might want to do, but I'll strongly disallow it in my draft :) -- A* Thanks, dave