Re: Updates to File API

2010-06-28 Thread Arun Ranganathan


On 6/23/10 9:50 AM, Jian Li wrote:
I think encoding the security origin in the URL allows the UAs to do
the security origin check in place, without routing through another
authority to get the origin information, which might cause the check
to take a long time to finish.


If we worry about showing the double schemes in the URL, we can
transform the origin encoded in the URL by using base64 or another
escaping algorithm.


Jian: the current URL scheme: http://dev.w3.org/2006/webapi/FileAPI/#url 
allows you to do that, without obliging other UAs to do that.  Some UAs 
may elect to use smart caching to accomplish the same kinds of things, 
without tagging the URL with origin information.  Others may see benefit 
in origin-tagging.


I've reconsidered trying to architect a scheme that allows all use-case 
scenarios for blob: URIs.


-- A*


Jian


On Wed, Jun 23, 2010 at 8:24 AM, David Levin le...@google.com wrote:


On Tue, Jun 22, 2010 at 8:56 PM, Adrian Bateman adria...@microsoft.com wrote:

On Tuesday, June 22, 2010 8:40 PM, David Levin wrote:
 I agree with you Adrian that it makes sense to let the user agent figure
 out the optimal way of implementing origin and other checks.

 A logical step from that premise is that the choice/format of the
 namespace specific string should be left up to the UA as embedding
 information in there may be the optimal way for some UA's of implementing
 said checks, and it sounds like other UAs may not want to do that.

Robin outlined why that would be a problem [1]. My original
feeling was that this should be left up to UAs, as you say,
but I've been convinced that doing so is a race to the most
complex URL scheme.


Robin discussed something that could possibly happen in
http://lists.w3.org/Archives/Public/public-webapps/2009OctDec/0743.html. At
the same time, there are implementors who gave specific reasons
why encoding certain information (scheme, host, port) in
the namespace specific string (NSS) is useful to various UAs. No
other information has been requested, so theories about adding more
information seem premature.

If the format must be specified, it seems reasonable to take both
the theoretical and practical issues into account.

Encoding the security origin in the NSS isn't complex. If a
proposal is needed about how that can be done in a simple way, I'm
willing to supply one. Also, UAs that don't care about that
information are free to ignore it and don't need to parse it.

dave

Re: Updates to File API

2010-06-23 Thread David Levin

On Tue, Jun 22, 2010 at 8:56 PM, Adrian Bateman adria...@microsoft.com wrote:

 On Tuesday, June 22, 2010 8:40 PM, David Levin wrote:
  I agree with you Adrian that it makes sense to let the user agent figure
  out the optimal way of implementing origin and other checks.
 
  A logical step from that premise is that the choice/format of the
  namespace specific string should be left up to the UA as embedding
  information in there may be the optimal way for some UA's of implementing
  said checks, and it sounds like other UAs may not want to do that.

 Robin outlined why that would be a problem [1]. My original feeling was
 that this should be left up to UAs, as you say, but I've been convinced that
 doing so is a race to the most complex URL scheme.




Robin discussed something that could possibly happen in
http://lists.w3.org/Archives/Public/public-webapps/2009OctDec/0743.html. At
the same time, there are implementors who gave specific reasons why encoding
certain information (scheme, host, port) in the namespace specific string
(NSS) is useful to various UAs. No other information has been requested, so
theories about adding more information seem premature.


If the format must be specified, it seems reasonable to take both the
theoretical and practical issues into account.


Encoding the security origin in the NSS isn't complex. If a proposal is
needed about how that can be done in a simple way, I'm willing to supply
one. Also, UAs that don't care about that information are free to ignore it
and don't need to parse it.


dave

Re: Updates to File API

2010-06-23 Thread Jian Li

I think encoding the security origin in the URL allows the UAs to do the
security origin check in place, without routing through another authority to
get the origin information, which might cause the check to take a long time
to finish.

If we worry about showing the double schemes in the URL, we can transform
the origin encoded in the URL by using base64 or another escaping algorithm.
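
The base64 idea above could look roughly like the following. This is only a sketch under stated assumptions: the `blobdata` scheme name comes from the thread, but the `token/uuid` layout and the helper names `make_blob_url` and `origin_of` are hypothetical, not anything the spec defines.

```python
import base64
import uuid

SCHEME = "blobdata"  # scheme name proposed in the thread; the layout below is assumed


def make_blob_url(origin: str) -> str:
    """Build a URL whose scheme-specific part is <base64url(origin)>/<uuid>."""
    token = base64.urlsafe_b64encode(origin.encode("utf-8")).decode("ascii")
    # Drop '=' padding so the token contains only URL-safe characters.
    return f"{SCHEME}:{token.rstrip('=')}/{uuid.uuid4()}"


def origin_of(blob_url: str) -> str:
    """Recover the origin from a URL produced by make_blob_url."""
    scheme, _, rest = blob_url.partition(":")
    if scheme != SCHEME:
        raise ValueError("not a blobdata URL")
    token, _, _ = rest.partition("/")
    padded = token + "=" * (-len(token) % 4)  # restore the stripped padding
    return base64.urlsafe_b64decode(padded).decode("utf-8")
```

Because the URL-safe base64 alphabet contains neither ":" nor "/", the resulting URL shows only one scheme, while the UA can still decode the origin locally.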

Jian


On Wed, Jun 23, 2010 at 8:24 AM, David Levin le...@google.com wrote:

 On Tue, Jun 22, 2010 at 8:56 PM, Adrian Bateman adria...@microsoft.com wrote:

 On Tuesday, June 22, 2010 8:40 PM, David Levin wrote:
  I agree with you Adrian that it makes sense to let the user agent figure
  out the optimal way of implementing origin and other checks.
 
  A logical step from that premise is that the choice/format of the
  namespace specific string should be left up to the UA as embedding
  information in there may be the optimal way for some UA's of implementing
  said checks, and it sounds like other UAs may not want to do that.

 Robin outlined why that would be a problem [1]. My original feeling was
 that this should be left up to UAs, as you say, but I've been convinced that
 doing so is a race to the most complex URL scheme.




 Robin discussed something that could possibly happen in
 http://lists.w3.org/Archives/Public/public-webapps/2009OctDec/0743.html. At
 the same time, there are implementors who gave specific reasons why encoding
 certain information (scheme, host, port) in the namespace specific string
 (NSS) is useful to various UAs. No other information has been requested, so
 theories about adding more information seem premature.


 If the format must be specified, it seems reasonable to take both the
 theoretical and practical issues into account.


 Encoding the security origin in the NSS isn't complex. If a proposal
 is needed about how that can be done in a simple way, I'm willing to supply
 one. Also, UAs that don't care about that information are free to ignore it
 and don't need to parse it.


 dave

RE: Updates to File API

2010-06-22 Thread Adrian Bateman

On Friday, June 11, 2010 11:18 AM, Jonas Sicking wrote:
 On Fri, Jun 11, 2010 at 11:11 AM, Jonas Sicking jo...@sicking.cc wrote:
  On Fri, Jun 11, 2010 at 9:09 AM, Adrian Bateman adria...@microsoft.com wrote:
  It's not clear to me the benefit of encoding the origin into the URL. Do
  we expect script to parse out the origin and use it? Even in a multi-process
  architecture there's presumably some central store of issued URLs which will
  need to store origin information as well as other things?
 
  The one advantage I can see is that putting the scheme into the URL
  allows the *implementation* to deduce the origin by simply looking at
  the URL-scheme. This avoids having to do a (potentially cross-process)
  lookup to get the origin.
 
  This could be useful for APIs which have to synchronously determine
  the origin of a given URL in order to throw an exception on an
  attempted cross-origin access. For example an XMLHttpRequest Level 1
  implementation needs to synchronously determine if it should make a
  call to .open(...) throw or not based on the origin of the passed in
  URL.
 
  However I'm not sure if this is a problem in practice or not. It's
  entirely possible that the web platform is littered with situations
  where you need to do synchronous communication with whichever thread
  the networking code runs on.
 
  Firefox is still in the process of going multi-process, so I'll defer
  to other browsers with more experience in this area.
 
 Oh, and I should add that the implementation will of course still have
 to check once a url is loaded that the origin in the url matches the
 origin in whatever map is used to map urls to resources. I.e. if the
 implementation has handed out a url like:
 
 filedata:sheep.org/3699b4a0-e43e-4cec-b87b-82b6f83dd752
 
 and script changes that to:
 
 filedata:wolf.org/3699b4a0-e43e-4cec-b87b-82b6f83dd752
 
 then attempting to load the latter url should result in a 404 or similar.

Since the origin requires scheme as well as hostname/port it seems like we'll
end up with some encoding or parsing complexity by following this approach.
Robin gave good reasons for not allowing user agents to encode data into the
URL and I'm not convinced that including origin for this particular case isn't
a premature optimisation. At what point will we find other data that's
convenient to have encoded in the URL?

I think it makes more sense for the URL to be opaque and let user agents figure
out the optimal way of implementing origin and other checks.

Cheers,

Adrian.

Re: Updates to File API

2010-06-22 Thread Arun Ranganathan


On 6/22/10 8:44 AM, Adrian Bateman wrote:

On Friday, June 11, 2010 11:18 AM, Jonas Sicking wrote:
   

On Fri, Jun 11, 2010 at 11:11 AM, Jonas Sicking jo...@sicking.cc wrote:
 

On Fri, Jun 11, 2010 at 9:09 AM, Adrian Bateman adria...@microsoft.com wrote:
   

It's not clear to me the benefit of encoding the origin into the URL. Do
we expect script to parse out the origin and use it? Even in a multi-process
architecture there's presumably some central store of issued URLs which will
need to store origin information as well as other things?
 

The one advantage I can see is that putting the scheme into the URL
allows the *implementation* to deduce the origin by simply looking at
the URL-scheme. This avoids having to do a (potentially cross-process)
lookup to get the origin.

This could be useful for APIs which have to synchronously determine
the origin of a given URL in order to throw an exception on an
attempted cross-origin access. For example an XMLHttpRequest Level 1
implementation needs to synchronously determine if it should make a
call to .open(...) throw or not based on the origin of the passed in
URL.

However I'm not sure if this is a problem in practice or not. It's
entirely possible that the web platform is littered with situations
where you need to do synchronous communication with whichever thread
the networking code runs on.

Firefox is still in the process of going multi-process, so I'll defer
to other browsers with more experience in this area.
   

Oh, and I should add that the implementation will of course still have
to check once a url is loaded that the origin in the url matches the
origin in whatever map is used to map urls to resources. I.e. if the
implementation has handed out a url like:

filedata:sheep.org/3699b4a0-e43e-4cec-b87b-82b6f83dd752

and script changes that to:

filedata:wolf.org/3699b4a0-e43e-4cec-b87b-82b6f83dd752

then attempting to load the latter url should result in a 404 or similar.
 

Since the origin requires scheme as well as hostname/port it seems like we'll
end up with some encoding or parsing complexity by following this approach.


Upon reflection, I agree with Adrian.  Origin requires:

1. Scheme
2. Hostname
3. Port
4. Certificates, if any

This creates untenable complexity.


Robin gave good reasons for not allowing user agents to encode data into the
URL and I'm not convinced that including origin for this particular case isn't
a premature optimisation. At what point will we find other data that's
convenient to have encoded in the URL?
   


+1.

I think it makes more sense for the URL to be opaque and let user agents figure
out the optimal way of implementing origin and other checks.
   


I think it may be important to define:

* Format.  I agree that this could be something simple, but it should be 
defined.  By opaque, do you mean undefined?
* Behavior with GET.  For this, I propose using a subset of HTTP/1.1 
responses.


-- A*

RE: Updates to File API

2010-06-22 Thread Adrian Bateman

On Tuesday, June 22, 2010 3:37 PM, Arun Ranganathan wrote:
 On 6/22/10 8:44 AM, Adrian Bateman wrote:
  I think it makes more sense for the URL to be opaque and let user
  agents figure
  out the optimal way of implementing origin and other checks.
 
 I think it may be important to define:
 
 * Format.  I agree that this could be something simple, but it should be
 defined.  By opaque, do you mean undefined?
 * Behavior with GET.  For this, I propose using a subset of HTTP/1.1
 responses.

I think we agree. I actually meant well-defined but opaque to JavaScript
consumers. In other words script in a web page can't deduce any meaningful
information from the string. If we're aiming for that property then it
makes sense that the entire scheme be defined (something like
filedata:----000). We can bikeshed the scheme
name later but I'd prefer something more generic now that url is off Blob.

I agree that there should be HTTP/1.1 response codes for GET.

Cheers,

Adrian.

Re: Updates to File API

2010-06-22 Thread David Levin

On Tue, Jun 22, 2010 at 7:58 PM, Adrian Bateman adria...@microsoft.com wrote:

 On Tuesday, June 22, 2010 3:37 PM, Arun Ranganathan wrote:
  On 6/22/10 8:44 AM, Adrian Bateman wrote:
   I think it makes more sense for the URL to be opaque and let user
   agents figure
   out the optimal way of implementing origin and other checks.


  I think it may be important to define:
 
  * Format.  I agree that this could be something simple, but it should be
  defined.  By opaque, do you mean undefined?
  * Behavior with GET.  For this, I propose using a subset of HTTP/1.1
  responses.

 I think we agree. I actually meant well-defined but opaque to JavaScript
 consumers. In other words script in a web page can't deduce any meaningful
 information from the string. If we're aiming for that property then it
 makes sense that the entire scheme be defined (something like
 filedata:----000).



I agree with you Adrian that it makes sense to let the user agent figure out
the optimal way of implementing origin and other checks.

A logical step from that premise is that the choice/format of the namespace
specific string should be left up to the UA as embedding information in
there may be the optimal way for some UA's of implementing said checks, and
it sounds like other UAs may not want to do that.

dave

RE: Updates to File API

2010-06-22 Thread Adrian Bateman

On Tuesday, June 22, 2010 8:40 PM, David Levin wrote:
 I agree with you Adrian that it makes sense to let the user agent figure
 out the optimal way of implementing origin and other checks.
 
 A logical step from that premise is that the choice/format of the
 namespace specific string should be left up to the UA as embedding
 information in there may be the optimal way for some UA's of implementing
 said checks, and it sounds like other UAs may not want to do that.

Robin outlined why that would be a problem [1]. My original feeling was that 
this should be left up to UAs, as you say, but I've been convinced that doing 
so is a race to the most complex URL scheme.

Cheers,

Adrian.

[1] http://lists.w3.org/Archives/Public/public-webapps/2009OctDec/0743.html

Re: Updates to File API

2010-06-14 Thread Jonas Sicking

On Sun, Jun 13, 2010 at 10:46 PM, Mark Seaborn mseab...@chromium.org wrote:
 On Wed, Jun 2, 2010 at 5:06 PM, Jian Li jia...@chromium.org wrote:

 I have one question regarding the scheme for Blob.url. The latest spec
 says that "The proposed URL scheme is filedata:". Mozilla already ships with
 moz-filedata:. Since the URL is now part of the Blob and it could be used
 to refer to both file data blob and binary data blob, should we consider
 making the scheme blobdata: for better generalization? In addition,
 we're thinking it will probably be a good practice to encode the security
 origin in the blob URL scheme, like
 blobdata:http://example.com/33c6401f-8779-4ea2-9a9b-1b725d6cd50b. This will
 make doing the security origin check easier when a page tries to access the
 blob url that is created in another process, under multi-process
 architecture.

 Why do the filedata: URLs need to apply a same-origin check?  It seems
 like this would unnecessarily reduce composability.  In practice, the URLs
 returned by the File API would be unguessable anyway.  Why not use
 unguessability of these tokens as the security mechanism?  So if a web app
 wants to share the file with other, co-operating entities (e.g. in an iframe
 or another tab), it can do so by sharing the URL; otherwise, it can withhold
 the URL.

 When would the currently-proposed same-origin checks apply?  Would I be
 right in thinking that they only apply to XMLHttpRequests from Javascript,
 and don't apply if the URL is linked from an img element?

URLs weren't always designed to be security sensitive. One common way
they leak is through the referer (sic) header. So if the File whose
.url you were loading was an HTML file, and you loaded it using an
iframe, you could very easily leak the URL to untrusted parties.

At the very least the same-origin check applies such that if a
cross-origin file uri is used on img, this would be considered a
cross-origin load if that img was later pasted into a canvas.
Similarly, if a cross-origin video is loaded, I think some events
are withheld, though I'm less sure about that.

However, in Firefox we've taken a stricter approach. We disallow
cross-origin img loads for filedata URIs. I.e. if site A receives a
File with a url, then even if site B manages to get hold of that url,
it can't use img to load it.

This definitely needs to be specified in the spec though.

/ Jonas

Re: Updates to File API

2010-06-14 Thread Jonas Sicking

On Mon, Jun 14, 2010 at 11:35 AM, Mark Seaborn mseab...@chromium.org wrote:
 On Mon, Jun 14, 2010 at 12:40 AM, Jonas Sicking jo...@sicking.cc wrote:

 On Sun, Jun 13, 2010 at 10:46 PM, Mark Seaborn mseab...@chromium.org
 wrote:
  Why do the filedata: URLs need to apply a same-origin check?  It seems
  like this would unnecessarily reduce composability.  In practice, the
  URLs
  returned by the File API would be unguessable anyway.  Why not use
  unguessability of these tokens as the security mechanism?  So if a web
  app
  wants to share the file with other, co-operating entities (e.g. in an
  iframe
  or another tab), it can do so by sharing the URL; otherwise, it can
  withhold
  the URL.
 
  When would the currently-proposed same-origin checks apply?  Would I be
  right in thinking that they only apply to XMLHttpRequests from
  Javascript,
  and don't apply if the URL is linked from an img element?

 URLs weren't always designed to be security sensitive. One common way
 they leak is through the referer (sic) header. So if the File whose
 .url you were loading was an HTML file, and you loaded it using an
 iframe, you could very easily leak the URL to untrusted parties.

 That's true for http: URLs, but AFAIK it's not true for https: URLs.
 The browser is not supposed to disclose HTTPS URLs via the Referer header,
 and I know of at least one app (Tahoe-LAFS) that relies on that.  Since the
 File API is creating the new filedata: URL scheme, it can specify that it
 has the same property.

I'd still be extremely worried that it's much too easy to leak the
URLs. Additionally, it's always risky to pass a filedata url to
another page as the lifetime of the filedata url is bound to the
document that made the call to File.url.

Instead it's better to pass the File object around, through for
example postMessage, and let every page that needs it request the url.
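
The lifetime concern above can be illustrated with a small sketch. It assumes a hypothetical in-memory `UrlStore` in which every issued URL remembers the document that requested it. The names `create_url`, `unload_document` and `resolve` are invented for illustration and are not part of the File API.

```python
import uuid


class UrlStore:
    """Toy store in which each issued URL is bound to the document that asked for it."""

    def __init__(self):
        self._by_url = {}  # url -> (document_id, data)

    def create_url(self, document_id: str, data: bytes) -> str:
        url = f"filedata:{uuid.uuid4()}"
        self._by_url[url] = (document_id, data)
        return url

    def unload_document(self, document_id: str) -> None:
        # Every URL minted by this document dies with it.
        dead = [u for u, (doc, _) in self._by_url.items() if doc == document_id]
        for u in dead:
            del self._by_url[u]

    def resolve(self, url: str):
        entry = self._by_url.get(url)
        return None if entry is None else entry[1]
```

If page A hands its URL string to page B and then unloads, B's copy of the URL stops resolving. If B had instead received the File object and requested its own URL, that URL would be tied to B's lifetime.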

 At the very least the same-origin check applies such that if a
 cross-origin file uri is used on img, this would be considered a
 cross-origin load if that img was later pasted into a canvas.
 Similarly, if a cross-origin video is loaded, I think some events
 are withheld, though I'm less sure about that.

 However, in Firefox we've taken a stricter approach. We disallow
 cross-origin img loads for filedata URIs. I.e. if site A receives a
 File with a url, then even if site B manages to get hold of that url,
 it can't use img to load it.

 This is adding a new mechanism, isn't it, since img was previously
 considered to be normal linking and did not have a same-origin check.

 What would happen if a page has an <a href="filedata:..."> link?

That's a good question. I think the result of our implementation is
that the navigation is prevented, similar to how Firefox prevents
navigating to a link like <a href="file://some/local/fs/path">. I.e.
we don't change the DOM, however if the user clicks the link nothing
happens.

/ Jonas

Re: Updates to File API

2010-06-13 Thread timeless

On Fri, Jun 11, 2010 at 10:04 PM, Michael Nordman micha...@google.com wrote:
 Another advantage is that...
 blobdata://http_responsible_party.org:80/3699b4a0-e43e-4cec-b87b-82b6f83dd752

 ... makes it clear to the end user who the responsible party is when these
 urls are visible in the user interface. (location bar, tooltips, etc).

It doesn't, it just means yet another way for scripts to confuse the user.

Every time we provide a string whose domain is in control of a domain,
the set of evil uses increases as evil groups set up more interesting
domains and trick users for another two or three years.

With browsers targeting smaller devices, as well as users who are less
familiar with the web, or even experienced users who missed memos
about IDN, these improvements just cause more problems.

Tab: I'd like to specifically call you out for your inclusion of:
 http://www.詹姆斯.com/blog/2010/06/html5-atom-gone-wrong, a comparison
in a recent email.  .COM does not allow IDN and you should not have
used that. I know someone was being cute, but that doesn't justify
confusing users. I don't have time to construct a similarly written
domain which happens to go to my own spoof, nor am I going to invest
the ~9 USD that it would cost to do so, but it is perfectly reasonable
for someone else to do so. The time it would take is probably around
10mins including picking a similar character, registering the domain,
and posting content.

It's true that this spoof would not fool all of the people all of the
time, but it would probably fool most of the people most of the time.

Re: Updates to File API

2010-06-13 Thread Mark Seaborn

On Wed, Jun 2, 2010 at 5:06 PM, Jian Li jia...@chromium.org wrote:

 I have one question regarding the scheme for Blob.url. The latest spec says
 that "The proposed URL scheme is filedata:". Mozilla already ships with
 moz-filedata:. Since the URL is now part of the Blob and it could be used
 to refer to both file data blob and binary data blob, should we consider
 making the scheme blobdata: for better generalization? In addition,
 we're thinking it will probably be a good practice to encode the security
 origin in the blob URL scheme, like
 blobdata:http://example.com/33c6401f-8779-4ea2-9a9b-1b725d6cd50b. This will
 make doing the security origin check easier when a page tries to access the
 blob url that is created in another process, under multi-process
 architecture.


Why do the filedata: URLs need to apply a same-origin check?  It seems
like this would unnecessarily reduce composability.  In practice, the URLs
returned by the File API would be unguessable anyway.  Why not use
unguessability of these tokens as the security mechanism?  So if a web app
wants to share the file with other, co-operating entities (e.g. in an iframe
or another tab), it can do so by sharing the URL; otherwise, it can withhold
the URL.
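
As a rough sketch of the unguessable-token model described above: access is granted to whoever holds the URL, with no origin check at all. The `CapabilityStore` class and its method names are hypothetical, and the 32-byte token length is an arbitrary assumption. Only `secrets.token_urlsafe` is a real standard-library call.

```python
import secrets


class CapabilityStore:
    """Toy store where possession of the URL is the only access check."""

    def __init__(self):
        self._files = {}  # url -> data

    def share(self, data: bytes) -> str:
        # 32 random bytes from the OS CSPRNG; guessing one is infeasible.
        url = f"filedata:{secrets.token_urlsafe(32)}"
        self._files[url] = data
        return url

    def fetch(self, url: str):
        # No origin is consulted: any holder of the URL gets the data.
        return self._files.get(url)
```

A web app that wants to share the file passes the URL on. One that does not simply withholds it.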

When would the currently-proposed same-origin checks apply?  Would I be
right in thinking that they only apply to XMLHttpRequests from Javascript,
and don't apply if the URL is linked from an img element?

Regards,
Mark

RE: Updates to File API

2010-06-11 Thread Adrian Bateman

On Wednesday, June 02, 2010 5:27 PM, Arun Ranganathan wrote:
 On 6/2/10 5:06 PM, Jian Li wrote:
  Indeed, the URL scheme seems to be more sort of implementation details.
  Different browser vendors can choose the appropriate scheme, like Mozilla
  ships with moz-filedata. What do you think?
 
 Actually, I'm against leaving it totally up to implementations.  Sure,
 the spec. could simply state how the URL behaves without mentioning
 format much, but we identified in the past [1] that it was wise to
 specify things reliably, so that developers didn't rely on arbitrary
 behavior in one implementation and expect something similar in another.
 It's precisely that genre of underspecified behavior that got us in
 trouble before ;-)
 
 -- A*
 [1] http://lists.w3.org/Archives/Public/public-webapps/2009OctDec/0743.html

Do you think the URL scheme should be specified for each use of Blob or more 
broadly? For example, Blob is used in the File Reader API but also possibly in 
the Capture API in a different way. It might be useful to be able to use a 
different scheme for these different purposes to help the user agent route 
requests to the appropriate handler.

Adrian.

RE: Updates to File API

2010-06-11 Thread Adrian Bateman

On Wednesday, June 02, 2010 5:35 PM, Jonas Sicking wrote:
 On Wed, Jun 2, 2010 at 5:26 PM, Arun Ranganathan a...@mozilla.com wrote:
  On 6/2/10 5:06 PM, Jian Li wrote:
  In addition,
  we're thinking it will probably be a good practice to encode the security
  origin in the blob URL scheme, like blobdata:
  http://example.com/33c6401f-8779-4ea2-9a9b-1b725d6cd50b. This will make
  doing the security origin check easier when a page tries to access the
  blob
  url that is created in another process, under multi-process architecture.
 
  This is a good suggestion.  I particularly like the idea of encoding the
  origin as part of the scheme.
 
 Though we want to avoid introducing the concept of nested schemes to
 the web. While mozilla already uses nested schemes (jar:http://...
 and  view-source:http://...) I know others, in particular Apple, have
 expressed a dislike for this in the past. And with good reason, it's
 not easy to implement and has been a source of numerous security bugs.
 That said, it's certainly possible.

It's not clear to me the benefit of encoding the origin into the URL. Do we 
expect script to parse out the origin and use it? Even in a multi-process 
architecture there's presumably some central store of issued URLs which will 
need to store origin information as well as other things?

Cheers,

Adrian

Re: Updates to File API

2010-06-11 Thread Jian Li

One benefit of using the encoded origin is to do the security origin check
in place, instead of resorting to a centralized authority, esp. under
multi-process architecture. Consider getting and checking the origin
before hitting the cache for the blob.url item.


On Fri, Jun 11, 2010 at 9:09 AM, Adrian Bateman adria...@microsoft.com wrote:

 On Wednesday, June 02, 2010 5:35 PM, Jonas Sicking wrote:
  On Wed, Jun 2, 2010 at 5:26 PM, Arun Ranganathan a...@mozilla.com wrote:
   On 6/2/10 5:06 PM, Jian Li wrote:
   In addition,
   we're thinking it will probably be a good practice to encode the security
   origin in the blob URL scheme, like
   blobdata:http://example.com/33c6401f-8779-4ea2-9a9b-1b725d6cd50b. This will
   make doing the security origin check easier when a page tries to access the
   blob url that is created in another process, under multi-process
   architecture.
  
   This is a good suggestion.  I particularly like the idea of encoding the
   origin as part of the scheme.
 
  Though we want to avoid introducing the concept of nested schemes to
  the web. While mozilla already uses nested schemes (jar:http://...
  and  view-source:http://...) I know others, in particular Apple, have
  expressed a dislike for this in the past. And with good reason, it's
  not easy to implement and has been a source of numerous security bugs.
  That said, it's certainly possible.

 It's not clear to me the benefit of encoding the origin into the URL. Do we
 expect script to parse out the origin and use it? Even in a multi-process
 architecture there's presumably some central store of issued URLs which will
 need to store origin information as well as other things?

 Cheers,

 Adrian

Re: Updates to File API

2010-06-11 Thread Jonas Sicking

On Fri, Jun 11, 2010 at 9:09 AM, Adrian Bateman adria...@microsoft.com wrote:
 On Wednesday, June 02, 2010 5:35 PM, Jonas Sicking wrote:
 On Wed, Jun 2, 2010 at 5:26 PM, Arun Ranganathan a...@mozilla.com wrote:
  On 6/2/10 5:06 PM, Jian Li wrote:
  In addition,
  we're thinking it will probably be a good practice to encode the security
  origin in the blob URL scheme, like blobdata:
  http://example.com/33c6401f-8779-4ea2-9a9b-1b725d6cd50b. This will make
  doing the security origin check easier when a page tries to access the
  blob
  url that is created in another process, under multi-process architecture.
 
  This is a good suggestion.  I particularly like the idea of encoding the
  origin as part of the scheme.

 Though we want to avoid introducing the concept of nested schemes to
 the web. While mozilla already uses nested schemes (jar:http://...
 and  view-source:http://...) I know others, in particular Apple, have
 expressed a dislike for this in the past. And with good reason, it's
 not easy to implement and has been a source of numerous security bugs.
 That said, it's certainly possible.

 It's not clear to me the benefit of encoding the origin into the URL. Do we 
 expect script to parse out the origin and use it? Even in a multi-process 
 architecture there's presumably some central store of issued URLs which will 
 need to store origin information as well as other things?

The one advantage I can see is that putting the scheme into the URL
allows the *implementation* to deduce the origin by simply looking at
the URL-scheme. This avoids having to do a (potentially cross-process)
lookup to get the origin.

This could be useful for APIs which have to synchronously determine
the origin of a given URL in order to throw an exception on an
attempted cross-origin access. For example an XMLHttpRequest Level 1
implementation needs to synchronously determine if it should make a
call to .open(...) throw or not based on the origin of the passed in
URL.
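
A sketch of what "deduce the origin by simply looking at the URL" might mean, assuming the nested form `blobdata:http://example.com/<uuid>` suggested earlier in the thread. The helpers `origin_from_blob_url` and `check_open` are hypothetical, and the default-port table is an assumption covering only http and https.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}  # assumed: only these inner schemes
PREFIX = "blobdata:"


def origin_from_blob_url(blob_url: str):
    """Return (scheme, host, port) parsed out of the URL itself, with no lookup."""
    if not blob_url.startswith(PREFIX):
        raise ValueError("not a blobdata URL")
    inner = urlsplit(blob_url[len(PREFIX):])
    if inner.scheme not in DEFAULT_PORTS or inner.hostname is None:
        raise ValueError("no usable origin in URL")
    return (inner.scheme, inner.hostname, inner.port or DEFAULT_PORTS[inner.scheme])


def check_open(caller_origin, blob_url: str) -> None:
    """Synchronous check, in the spirit of XMLHttpRequest.open() throwing."""
    if origin_from_blob_url(blob_url) != caller_origin:
        raise PermissionError("cross-origin access to blob URL")
```

The check needs no round trip to another process, which is the advantage claimed. It also shows the parsing cost of the nested scheme that others in the thread object to.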

However I'm not sure if this is a problem in practice or not. It's
entirely possible that the web platform is littered with situations
where you need to do synchronous communication with whichever thread
the networking code runs on.

Firefox is still in the process of going multi-process, so I'll defer
to other browsers with more experience in this area.

/ Jonas

Re: Updates to File API

2010-06-11 Thread Jonas Sicking

On Fri, Jun 11, 2010 at 11:11 AM, Jonas Sicking jo...@sicking.cc wrote:
 On Fri, Jun 11, 2010 at 9:09 AM, Adrian Bateman adria...@microsoft.com 
 wrote:
 On Wednesday, June 02, 2010 5:35 PM, Jonas Sicking wrote:
 On Wed, Jun 2, 2010 at 5:26 PM, Arun Ranganathan a...@mozilla.com wrote:
  On 6/2/10 5:06 PM, Jian Li wrote:
  In addition,
  we're thinking it will probably be a good practice to encode the security
  origin in the blob URL scheme, like blobdata:
  http://example.com/33c6401f-8779-4ea2-9a9b-1b725d6cd50b. This will make
  doing the security origin check easier when a page tries to access the
  blob
  url that is created in another process, under multi-process architecture.
 
  This is a good suggestion.  I particularly like the idea of encoding the
  origin as part of the scheme.

 Though we want to avoid introducing the concept of nested schemes to
 the web. While mozilla already uses nested schemes (jar:http://...
 and  view-source:http://...) I know others, in particular Apple, have
 expressed a dislike for this in the past. And with good reason, it's
 not easy to implement and has been a source of numerous security bugs.
 That said, it's certainly possible.

 It's not clear to me the benefit of encoding the origin into the URL. Do we 
 expect script to parse out the origin and use it? Even in a multi-process 
 architecture there's presumably some central store of issued URLs which will 
 need to store origin information as well as other things?

 The one advantage I can see is that putting the origin into the URL
 allows the *implementation* to deduce the origin by simply looking at
 the URL-scheme. This avoids having to do a (potentially cross-process)
 lookup to get the origin.

 This could be useful for APIs which have to synchronously determine
 the origin of a given URL in order to throw an exception on an
 attempted cross-origin access. For example an XMLHttpRequest Level 1
 implementation needs to synchronously determine if it should make a
 call to .open(...) throw or not based on the origin of the passed in
 URL.

 However I'm not sure if this is a problem in practice or not. It's
 entirely possible that the web platform is littered with situations
 where you need to do synchronous communication with whichever thread
 the networking code runs on.

 Firefox is still in the process of going multi-process, so I'll defer
 to other browsers with more experience in this area.

Oh, and I should add that the implementation will of course still have
to check once a url is loaded that the origin in the url matches the
origin in whatever map is used to map urls to resources. I.e. if the
implementation has handed out a url like:

filedata:sheep.org/3699b4a0-e43e-4cec-b87b-82b6f83dd752

and script changes that to:

filedata:wolf.org/3699b4a0-e43e-4cec-b87b-82b6f83dd752

then attempting to load the latter url should result in a 404 or similar.
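
The check described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the filedata:<host>/<uuid> layout is taken from the example URLs in this message, while `issued`, `originFromUrl`, and `load` are hypothetical names and not part of any draft or implementation.

```javascript
// Hypothetical UA-side table: each issued identifier maps to the origin that created it.
const issued = new Map([
  ["3699b4a0-e43e-4cec-b87b-82b6f83dd752", "sheep.org"],
]);

// Cheap, synchronous read of the origin embedded in the URL (assumed layout).
function originFromUrl(url) {
  const match = /^filedata:([^/]+)\/(.+)$/.exec(url);
  return match ? { origin: match[1], id: match[2] } : null;
}

// At load time the embedded origin must still agree with the table,
// otherwise the load fails with a 404-like result.
function load(url) {
  const parsed = originFromUrl(url);
  if (!parsed) return 404;
  return issued.get(parsed.id) === parsed.origin ? 200 : 404;
}

console.log(load("filedata:sheep.org/3699b4a0-e43e-4cec-b87b-82b6f83dd752"));
console.log(load("filedata:wolf.org/3699b4a0-e43e-4cec-b87b-82b6f83dd752"));
```

The embedded origin is treated as a hint that lets the caller decide quickly; the table lookup remains the authority when the resource is actually loaded.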

/ Jonas

Re: Updates to File API

2010-06-11 Thread Michael Nordman

Another advantage is that...

blobdata://
http_responsible_party.org:80/3699b4a0-e43e-4cec-b87b-82b6f83dd752

... makes it clear to the end user who the responsible party is when these
urls are visible in the user interface. (location bar, tooltips, etc).

On Fri, Jun 11, 2010 at 11:11 AM, Jonas Sicking jo...@sicking.cc wrote:

 On Fri, Jun 11, 2010 at 9:09 AM, Adrian Bateman adria...@microsoft.com
 wrote:
  On Wednesday, June 02, 2010 5:35 PM, Jonas Sicking wrote:
  On Wed, Jun 2, 2010 at 5:26 PM, Arun Ranganathan a...@mozilla.com
 wrote:
   On 6/2/10 5:06 PM, Jian Li wrote:
   In addition,
   we're thinking it will probably be a good practice to encode the
 security
   origin in the blob URL scheme, like blobdata:
   http://example.com/33c6401f-8779-4ea2-9a9b-1b725d6cd50b. This will
 make
   doing the security origin check easier when a page tries to access
 the
   blob
   url that is created in another process, under multi-process
 architecture.
  
   This is a good suggestion.  I particularly like the idea of encoding
 the
   origin as part of the scheme.
 
  Though we want to avoid introducing the concept of nested schemes to
  the web. While mozilla already uses nested schemes (jar:http://...
  and  view-source:http://...) I know others, in particular Apple, have
  expressed a dislike for this in the past. And with good reason, it's
  not easy to implement and has been a source of numerous security bugs.
  That said, it's certainly possible.
 
  It's not clear to me the benefit of encoding the origin into the URL. Do
 we expect script to parse out the origin and use it? Even in a multi-process
 architecture there's presumably some central store of issued URLs which will
 need to store origin information as well as other things?

 The one advantage I can see is that putting the origin into the URL
 allows the *implementation* to deduce the origin by simply looking at
 the URL-scheme. This avoids having to do a (potentially cross-process)
 lookup to get the origin.

 This could be useful for APIs which have to synchronously determine
 the origin of a given URL in order to throw an exception on an
 attempted cross-origin access. For example an XMLHttpRequest Level 1
 implementation needs to synchronously determine if it should make a
 call to .open(...) throw or not based on the origin of the passed in
 URL.

 However I'm not sure if this is a problem in practice or not. It's
 entirely possible that the web platform is littered with situations
 where you need to do synchronous communication with whichever thread
 the networking code runs on.

 Firefox is still in the process of going multi-process, so I'll defer
 to other browsers with more experience in this area.

 / Jonas

Re: Updates to File API

2010-06-02 Thread Eric Uhrhane

Arun:

In the latest version of the spec I see that readAsDataURL, alone
among the readAs* methods, still takes a File rather than a Blob. Is
that just an oversight, or is that an intentional restriction?

Eric

On Thu, May 13, 2010 at 5:27 AM, Arun Ranganathan a...@mozilla.com wrote:
Greetings WebApps WG,

I have updated the editor's draft of the File API to reflect changes that
have been in discussion.

http://dev.w3.org/2006/webapi/FileAPI

Notably:

1. Blobs now allow further binary data operations by exposing an ArrayBuffer
property that represents the Blob. ArrayBuffers, and affiliated Typed
Array views of data, are specified in a working draft as a part of the
WebGL work [1]. This work has been proposed to ECMA's TC-39 WG as well. We
intend to implement some of this in the Firefox 4 timeframe, and have reason
to believe other browsers will as well. I have thus cited the work as a
normative reference [1]. Eventually, we ought to consider further read
operations given ArrayBuffers, but for now, I believe exposing Blobs in this
way is sufficient.

2. url and type properties have been moved to the underlying Blob
interface. Notably, the property is now called 'url' and not 'urn.' Use
cases for triggering 'save as' behavior with Content-Disposition have not
been addressed[2], although I believe that with FileWriter and
BlobBuilder[3] they may be addressed differently. This change reflects
lengthy discussion (e.g. start here[4])

3. The renaming of the property to 'url' also suggests that we should cease
to consider an urn:uuid scheme. I solicited implementer feedback about URLs
vs. URNs in general. There was a general preference for URLs [5], though this
wasn't a strong preference. Moreover, Mozilla's implementation currently
uses moz-filedata: . The current draft has an editor's note about the use
of HTTP semantics, and origin issues in the context of shared workers. This
is work in progress; I have removed the section specifying urn:uuid and hope
to have an update with a section covering the filedata: scheme (with
filedata:uuid as a suggestion). I welcome discussion about this. I'll
point out that we are coining a new scheme, which we originally sought to
avoid :-)

4. I have changed event order; loadend now fires after an error event [6].

-- A*

[1]
https://cvs.khronos.org/svn/repos/registry/trunk/public/webgl/doc/spec/TypedArray-spec.html
[2] http://www.mail-archive.com/public-webapps@w3.org/msg06137.html
[3] http://dev.w3.org/2009/dap/file-system/file-writer.html
[4] http://lists.w3.org/Archives/Public/public-webapps/2010JanMar/0910.html
[5] http://lists.w3.org/Archives/Public/public-webapps/2009OctDec/0462.html
[6] http://lists.w3.org/Archives/Public/public-webapps/2010AprJun/0062.html

Re: Updates to File API

2010-06-02 Thread Arun Ranganathan


On 6/2/10 3:42 PM, Eric Uhrhane wrote:

Arun:

In the latest version of the spec I see that readAsDataURL, alone
among the readAs* methods, still takes a File rather than a Blob.  Is
that just an oversight, or is that an intentional restriction?
   


That's intentional; readAsDataURL was cited as useful only in the 
context of File objects.  Do you think it makes sense in the context of 
random Blob objects?  Does it make sense on slice calls on a Blob, for 
example?


-- A*

Re: Updates to File API

2010-06-02 Thread Eric Uhrhane

On Wed, Jun 2, 2010 at 3:44 PM, Arun Ranganathan a...@mozilla.com wrote:
 On 6/2/10 3:42 PM, Eric Uhrhane wrote:

 Arun:

 In the latest version of the spec I see that readAsDataURL, alone
 among the readAs* methods, still takes a File rather than a Blob.  Is
 that just an oversight, or is that an intentional restriction?


 That's intentional; readAsDataURL was cited as useful only in the context of
 File objects.  Do you think it makes sense in the context of random Blob
 objects?  Does it make sense on slice calls on a Blob, for example?

Sure, why not?  Why would this be limited to File objects?

A File is supposed to refer to an actual file on the local hard drive.
 A Blob is a big bunch of data that you might want to do something
with.  There's nothing special about a File when it comes to what
you're doing with the data.

Just as we moved File.url up to Blob, I think File.readAsDataURL
belongs there too.

Re: Updates to File API

2010-06-02 Thread Arun Ranganathan


On 6/2/10 3:48 PM, Eric Uhrhane wrote:

Sure, why not?  Why would this be limited to File objects?

A File is supposed to refer to an actual file on the local hard drive.
  A Blob is a big bunch of data that you might want to do something
with.  There's nothing special about a File when it comes to what
you're doing with the data.

Just as we moved File.url up to Blob, I think File.readAsDataURL
belongs there too.
   


Fair enough; I'm amenable to moving it.  So specifically, you're okay 
with a DataURL on a Blob?  It might not be anything useful; with a File, 
you at least have the possibility of a whole unsliced image file.  Could 
you give me a use case where this is really useful for Blob objects?


Also, above you probably mean specifying that readAsDataURL (a method on 
FileReader) works on Blob objects, not File.readAsDataURL ;-)


-- A*

Re: Updates to File API

2010-06-02 Thread Jian Li

On Wed, Jun 2, 2010 at 3:48 PM, Eric Uhrhane er...@google.com wrote:

 On Wed, Jun 2, 2010 at 3:44 PM, Arun Ranganathan a...@mozilla.com wrote:
  On 6/2/10 3:42 PM, Eric Uhrhane wrote:
 
  Arun:
 
  In the latest version of the spec I see that readAsDataURL, alone
  among the readAs* methods, still takes a File rather than a Blob.  Is
  that just an oversight, or is that an intentional restriction?
 
 
  That's intentional; readAsDataURL was cited as useful only in the context
 of
  File objects.  Do you think it makes sense in the context of random Blob
  objects?  Does it make sense on slice calls on a Blob, for example?

 Sure, why not?  Why would this be limited to File objects?

 A File is supposed to refer to an actual file on the local hard drive.
  A Blob is a big bunch of data that you might want to do something
 with.  There's nothing special about a File when it comes to what
 you're doing with the data.

 Just as we moved File.url up to Blob, I think File.readAsDataURL
 belongs there too.


And we moved type from File to Blob.

Re: Updates to File API

2010-06-02 Thread Eric Uhrhane

On Wed, Jun 2, 2010 at 3:57 PM, Arun Ranganathan a...@mozilla.com wrote:
 On 6/2/10 3:48 PM, Eric Uhrhane wrote:

 Sure, why not?  Why would this be limited to File objects?

 A File is supposed to refer to an actual file on the local hard drive.
  A Blob is a big bunch of data that you might want to do something
 with.  There's nothing special about a File when it comes to what
 you're doing with the data.

 Just as we moved File.url up to Blob, I think File.readAsDataURL
 belongs there too.


 Fair enough; I'm amenable to moving it.  So specifically, you're okay with a
 DataURL on a Blob?  It might not be anything useful; with a File, you at
 least have the possibility of a whole unsliced image file.  Could you give
 me a use case where this is really useful for Blob objects?

One that's come up for Blob.url is a packed file of image thumbnails:
you can do one big download, then slice and display the pieces.  If
you're doing any display by data URLs, that would work there too. To
be honest, I think a lot of the data URL use cases are better served
by Blob.url anyway, so I'm not sure how many will remain once this
spec is fully implemented, but can you think of a data URL use case
that really depends on the data coming from a File on disk instead of
a  Blob?
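
The packed-thumbnails case can be sketched as below. This is a minimal illustration that assumes the Blob API as it exists in current runtimes (slice(start, end, contentType)), which differs from the 2010 draft under discussion; the byte offsets and the `index` table are made up for the example.

```javascript
// Stand-in for one big download: three thumbnails packed back to back.
const packed = new Blob([new Uint8Array(60)]);

// Hypothetical index describing where each thumbnail lives in the pack.
const index = [
  { start: 0, end: 10 },
  { start: 10, end: 35 },
  { start: 35, end: 60 },
];

// Each slice is itself a Blob, so it can be displayed or read on its own.
const thumbnails = index.map(({ start, end }) =>
  packed.slice(start, end, "image/png")
);

console.log(thumbnails.map((t) => `${t.type}: ${t.size} bytes`));
```

Because every slice is a Blob rather than a File, any reader method restricted to File objects would be unusable on these pieces.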

 Also, above you probably mean specifying that readAsDataURL (a method on
 FileReader) works on Blob objects, not File.readAsDataURL ;-)

Yeah.  Brain-o.

Re: Updates to File API

2010-06-02 Thread Jian Li

Hi, Arun,

I have one question regarding the scheme for Blob.url. The latest spec says
that "The proposed URL scheme is filedata:". Mozilla already ships with
moz-filedata:. Since the URL is now part of the Blob and it could be used
to refer to both file data blobs and binary data blobs, should we consider
making the scheme blobdata: for better generalization? In addition,
we're thinking it would probably be good practice to encode the security
origin in the blob URL scheme, like
blobdata:http://example.com/33c6401f-8779-4ea2-9a9b-1b725d6cd50b. This will
make the security origin check easier when a page tries to access a blob
URL that was created in another process, under a multi-process architecture.

Indeed, the URL scheme seems to be more of an implementation detail.
Different browser vendors could choose an appropriate scheme, as Mozilla
does with moz-filedata. What do you think?

Jian

On Thu, May 13, 2010 at 5:27 AM, Arun Ranganathan a...@mozilla.com wrote:

Greetings WebApps WG,

I have updated the editor's draft of the File API to reflect changes that
have been in discussion.

http://dev.w3.org/2006/webapi/FileAPI

Notably:

1. Blobs now allow further binary data operations by exposing an
ArrayBuffer property that represents the Blob. ArrayBuffers, and affiliated
Typed Array views of data, are specified in a working draft as a part of
the WebGL work [1]. This work has been proposed to ECMA's TC-39 WG as well.
We intend to implement some of this in the Firefox 4 timeframe, and have
reason to believe other browsers will as well. I have thus cited the work
as a normative reference [1]. Eventually, we ought to consider further read
operations given ArrayBuffers, but for now, I believe exposing Blobs in this
way is sufficient.

4. I have changed event order; loadend now fires after an error event [6].

-- A*

[1]
https://cvs.khronos.org/svn/repos/registry/trunk/public/webgl/doc/spec/TypedArray-spec.html
[2] http://www.mail-archive.com/public-webapps@w3.org/msg06137.html
[3] http://dev.w3.org/2009/dap/file-system/file-writer.html
[4]
http://lists.w3.org/Archives/Public/public-webapps/2010JanMar/0910.html
[5]
http://lists.w3.org/Archives/Public/public-webapps/2009OctDec/0462.html
[6]
http://lists.w3.org/Archives/Public/public-webapps/2010AprJun/0062.html

Re: Updates to File API

2010-06-02 Thread Jonas Sicking

On Wed, Jun 2, 2010 at 3:42 PM, Eric Uhrhane er...@google.com wrote:
 Arun:

 In the latest version of the spec I see that readAsDataURL, alone
 among the readAs* methods, still takes a File rather than a Blob.  Is
 that just an oversight, or is that an intentional restriction?

Having readAsDataURL take a File made sense when .url and .type lived
on File rather than Blob. Now that Blobs have .types I agree that
readAsDataURL should be able to read from a Blob.

But, as you say, I think .url solves most of the use cases. One
use case I can still think of is a web-based HTML editor which allows
inserting images. These images could be included inline in the main
document using data-urls which allows the document to be saved/sent as
a single document, rather than HTML + a pile of images.
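
A rough sketch of that editor use case follows. It is an illustration only: Node's `Buffer` stands in for what readAsDataURL would produce in a browser, and `toDataUrl` and `inlineImage` are hypothetical helper names.

```javascript
// Build a base64 data URL from a media type and raw bytes.
function toDataUrl(type, bytes) {
  return `data:${type};base64,${Buffer.from(bytes).toString("base64")}`;
}

// Embed the image directly in the markup so the document stays a single file.
function inlineImage(type, bytes) {
  return `<img src="${toDataUrl(type, bytes)}">`;
}

// The two bytes of the text "hi", used here as stand-in content.
console.log(toDataUrl("text/plain", [104, 105]));
console.log(inlineImage("text/plain", [104, 105]));
```

The media type in the data URL is exactly the information that became available once .type moved from File to Blob.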

/ Jonas

Re: Updates to File API

2010-06-02 Thread Arun Ranganathan


On 6/2/10 5:06 PM, Jian Li wrote:

Hi, Arun,

I have one question regarding the scheme for Blob.url. The latest spec says
that "The proposed URL scheme is filedata:". Mozilla already ships with
moz-filedata:. Since the URL is now part of the Blob and it could be used
to refer to both file data blobs and binary data blobs, should we consider
making the scheme blobdata: for better generalization? In addition,
we're thinking it would probably be good practice to encode the security
origin in the blob URL scheme, like
blobdata:http://example.com/33c6401f-8779-4ea2-9a9b-1b725d6cd50b. This will
make the security origin check easier when a page tries to access a blob
URL that was created in another process, under a multi-process architecture.
   


This is a good suggestion.  I particularly like the idea of encoding the 
origin as part of the scheme.

Indeed, the URL scheme seems to be more of an implementation detail.
Different browser vendors could choose an appropriate scheme, as Mozilla
does with moz-filedata. What do you think?
   


Actually, I'm against leaving it totally up to implementations.  Sure, 
the spec. could simply state how the URL behaves without mentioning 
format much, but we identified in the past [1] that it was wise to 
specify things reliably, so that developers didn't rely on arbitrary 
behavior in one implementation and expect something similar in another.  
It's precisely that genre of underspecified behavior that got us in 
trouble before ;-)


-- A*
[1] http://lists.w3.org/Archives/Public/public-webapps/2009OctDec/0743.html

Re: Updates to File API

2010-06-02 Thread Jian Li

I see what you mean. Thanks for clarifying it.

Do you plan to add the origin encoding to the spec? How about using the more
generic scheme name blobdata:?

Jian


On Wed, Jun 2, 2010 at 5:26 PM, Arun Ranganathan a...@mozilla.com wrote:

 On 6/2/10 5:06 PM, Jian Li wrote:

 Hi, Arun,

 I have one question regarding the scheme for Blob.url. The latest spec says
 that "The proposed URL scheme is filedata:". Mozilla already ships with
 moz-filedata:. Since the URL is now part of the Blob and it could be used
 to refer to both file data blobs and binary data blobs, should we consider
 making the scheme blobdata: for better generalization? In addition,
 we're thinking it would probably be good practice to encode the security
 origin in the blob URL scheme, like
 blobdata:http://example.com/33c6401f-8779-4ea2-9a9b-1b725d6cd50b. This will
 make the security origin check easier when a page tries to access a blob
 URL that was created in another process, under a multi-process architecture.



 This is a good suggestion.  I particularly like the idea of encoding the
 origin as part of the scheme.

  Indeed, the URL scheme seems to be more of an implementation detail.
 Different browser vendors could choose an appropriate scheme, as Mozilla
 does with moz-filedata. What do you think?



 Actually, I'm against leaving it totally up to implementations.  Sure, the
 spec. could simply state how the URL behaves without mentioning format much,
 but we identified in the past [1] that it was wise to specify things
 reliably, so that developers didn't rely on arbitrary behavior in one
 implementation and expect something similar in another.  It's precisely that
 genre of underspecified behavior that got us in trouble before ;-)

 -- A*
 [1]
 http://lists.w3.org/Archives/Public/public-webapps/2009OctDec/0743.html

Re: Updates to File API

2010-06-02 Thread Jonas Sicking

On Wed, Jun 2, 2010 at 5:26 PM, Arun Ranganathan a...@mozilla.com wrote:
 On 6/2/10 5:06 PM, Jian Li wrote:

 Hi, Arun,

 I have one question regarding the scheme for Blob.url. The latest spec says
 that "The proposed URL scheme is filedata:". Mozilla already ships with
 moz-filedata:. Since the URL is now part of the Blob and it could be used
 to refer to both file data blobs and binary data blobs, should we consider
 making the scheme blobdata: for better generalization? In addition,
 we're thinking it would probably be good practice to encode the security
 origin in the blob URL scheme, like
 blobdata:http://example.com/33c6401f-8779-4ea2-9a9b-1b725d6cd50b. This will
 make the security origin check easier when a page tries to access a blob
 URL that was created in another process, under a multi-process architecture.


 This is a good suggestion.  I particularly like the idea of encoding the
 origin as part of the scheme.

Though we want to avoid introducing the concept of nested schemes to
the web. While mozilla already uses nested schemes (jar:http://...
and  view-source:http://...) I know others, in particular Apple, have
expressed a dislike for this in the past. And with good reason, it's
not easy to implement and has been a source of numerous security bugs.
That said, it's certainly possible.

/ Jonas

Re: Updates to File API

2010-05-21 Thread Robin Berjon

On May 21, 2010, at 00:41 , Jonas Sicking wrote:
 On Thu, May 20, 2010 at 2:53 PM, Nathan nat...@webr3.org wrote:
 If the scope of the identifiers is limited to a single ua, on a single
 machine, and specific to that single ua (as in I can't expect to request the
 identifier outside of the ua that provided it on x machine and get the same
 results) then I (personally) can't see why there's a need for anything more
 than a simple unique identifier (sha1 or suchlike)
 Note that the important point of these URNs isn't that they are
 identifiers, but rather that you can point an iframe.src, or an
 img.src, or a #myElement { background-url: url(...) } at them.

Right, and to further Jonas's explanation, imagine .url (or .id, or whatever) 
returned a simple identifier, say some opaque hex string of sorts like 
DEADBEEF. Now you want to get that image file and assign it as the source of 
an img (which is the whole point):

  img.src = file.url;

If your document is at http://deadbff.org/foo/ you've essentially made your 
image element link to http://deadbff.org/foo/DEADBEEF. That's not what you 
wanted.

Using a syntax (be it URI scheme or URN) that can naturally disambiguate 
between relative URI references and these magic references is, alas, needed.
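
The difference can be seen with a standard URL parser. The sketch below uses the `URL` constructor available in current browsers and Node; the filedata: scheme is simply the one proposed in this thread, used here for illustration.

```javascript
const base = "http://deadbff.org/foo/";

// A bare identifier is treated as a relative reference and resolved against the document.
const bare = new URL("DEADBEEF", base).href;

// A string that carries its own scheme is absolute, so the base is ignored.
const schemed = new URL("filedata:DEADBEEF", base).href;

console.log(bare);    // http://deadbff.org/foo/DEADBEEF
console.log(schemed); // filedata:DEADBEEF
```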

-- 
Robin Berjon - http://berjon.com/

Re: Updates to File API

2010-05-20 Thread Nathan


Jonas Sicking wrote:

On Wed, May 19, 2010 at 1:09 PM, Arun Ranganathan a...@mozilla.com wrote:

3. The renaming of the property to 'url' also suggests that we should
cease to consider an urn:uuid scheme.


I'm not sure that one follows from the other. The property's called 'url'
because that's what will be familiar to authors, but the magic string that
goes inside of it could still be a URN.


FWIW, I'm a developer and sticking a URN in a .url property really 
doesn't seem familiar at all - even a '.id' property with an id that was 
consistently generated would be much better.


If the scope of the identifiers is limited to a single ua, on a single 
machine, and specific to that single ua (as in I can't expect to request 
the identifier outside of the ua that provided it on x machine and get 
the same results) then I (personally) can't see why there's a need for 
anything more than a simple unique identifier (sha1 or suchlike)


And if the above is true, then surely this would negate the need for 
.url, registering a new URI scheme, or URN namespace - and all in all save 
you all from lots of headaches & time wasted, close the issue, and save 
the developer community from years of further confusion (or should I say 
conflated understanding of what a URL is), and benefit the entire web by 
saving us from yet another (predominantly unneeded) URN namespace or URL 
scheme.


Best & leave this in your capable hands.

Nathan

Re: Updates to File API

2010-05-20 Thread Jonas Sicking

On Thu, May 20, 2010 at 2:53 PM, Nathan nat...@webr3.org wrote:
 Jonas Sicking wrote:

 On Wed, May 19, 2010 at 1:09 PM, Arun Ranganathan a...@mozilla.com
 wrote:

 3. The renaming of the property to 'url' also suggests that we should
 cease to consider an urn:uuid scheme.

 I'm not sure that one follows from the other. The property's called
 'url'
 because that's what will be familiar to authors, but the magic string
 that
 goes inside of it could still be a URN.

 FWIW, I'm a developer and sticking a URN in a .url property really doesn't
 seem familiar at all - even a '.id' property with an id that was
 consistently generated would be much better.

 If the scope of the identifiers is limited to a single ua, on a single
 machine, and specific to that single ua (as in I can't expect to request the
 identifier outside of the ua that provided it on x machine and get the same
 results) then I (personally) can't see why there's a need for anything more
 than a simple unique identifier (sha1 or suchlike)

 And if the above is true, then surely this would negate the need for .url,
 registering a new URI scheme, or URN namespace - and all in all save you all
 from lots of headaches & time wasted, close the issue, and save the
 developer community from years of further confusion (or should I say
 conflated understanding of what a URL is), and benefit the entire web by
 saving us from yet another (predominantly unneeded) URN namespace or URL
 scheme.

Note that the important point of these URNs isn't that they are
identifiers, but rather that you can point an iframe.src, or an
img.src, or a #myElement { background-url: url(...) } at them. In
all useful use cases brought up so far, the website author will never
look at the actual string to see what it contains, but rather just
treat it as a url and load data from it.

The intended use for it is things like:

<img id="preview">
<input type="file" onchange="document.getElementById('preview').src =
this.files[0].url">

In this context, calling the string an identifier misses the point IMHO.

(btw, the above example should work fine in nightly firefox builds)

/ Jonas

Re: Updates to File API

2010-05-19 Thread Darin Fisher

On Tue, May 18, 2010 at 2:56 PM, Arun Ranganathan a...@mozilla.com wrote:

 On 5/18/10 2:35 PM, Eric Uhrhane wrote:

 On Mon, May 17, 2010 at 3:37 PM, Dmitry Titov dim...@chromium.org
  wrote:


 I have a couple of questions, mostly clarifications I think:
 1. FileReader takes Blob but there are multiple hints that the blob
 should
 be actually a 'file'. As we see Blob concept grows in popularity with
 such
 specs as FileWriter that defines BlobBuilder. Other proposals include
 Image
 Resizing that returns a Blob with a compressed image data. Can all types
 of
 Blobs be 'read' using FileReader? If not, then it would be logical to
 only
 accept File parameter. If any type of Blob can be read (as I think the
 spirit of the spec assumes) then would it be less confusing to change the
 name to BlobReader?


 I'd support that.  I think we always want to allow reading of any type
 of Blob--it's the interchange container--so calling it BlobReader
 makes sense.  Arun, how do you feel about that?



 The FileReader object accepts File objects for DataURL-reads, and Blob
 objects for binary string, text, and binary reads.  I agree that having a
 name like FileReader is generally a bit confusing, given that we do allow
 Blobs to be read, including Blobs which aren't directly coined from files.
  Blob itself isn't a great name, though it's a stand-in for Binary Large
 Object.

 Aside from the slight bikeshed-ish nature of this discussion, there are
 implementations in circulation that already use the name FileReader (e.g.
 Firefox 3.6.3).  This doesn't mean I'm against changing it, but I do wish
 the name change suggestion came earlier.  Also, I'm keen that the main
 object name addresses the initial use case -- reading file objects.  Perhaps
 in the future Blobs that are not files will be the norm; maybe then, Blob
 APIs will evolve, including implementations with ArrayBuffer and potential
 streaming use cases getting addressed better.

 Perhaps it is late to have a name change, and we've added to
 less-than-adequate naming on the Web (example: XMLHttpRequest).


It doesn't seem too late to change the name.  FF could support both
FileReader and BlobReader.  One could just be an alias for the other.  It
seems like we have situations like this frequently when it comes to new web
platform APIs.  A name only becomes immutable once there is a lot of content
using it since user agents would be compelled to support the existing name
for compat with existing content ;-)

-Darin




  Would FileWriter ever be used to write anything other than a File?  I
 think not, so it should probably stay as it is, despite the lack of
 symmetry.



 2. The FileReader.result is a string.



 Actually, in my next draft, I will have FileReader.result be of type 'any'
 (WebIDL's 'any') since it could also be an ArrayBuffer (using the
 readAsBinary method, which will function like the other asynchronous read
 methods, but read into ArrayBuffers across the ProgressEvent spectrum).

 -- A*

Re: Updates to File API

2010-05-19 Thread J Ross Nicoll


On 19/05/10 08:00, Darin Fisher wrote:


It doesn't seem too late to change the name.  FF could support both
FileReader and BlobReader.  One could just be an alias for the other.
  It seems like we have situations like this frequently when it comes to
new web platform APIs.  A name only becomes immutable once there is a
lot of content using it since user agents would be compelled to support
the existing name for compat with existing content ;-)


I would agree with this; the name in the specification should be 
whatever makes most sense, not the name that has been used in a draft 
implementation. I also think it's perfectly reasonable to expect content 
using the draft API to be updated over a relatively short period of time 
(meaning I would not expect Firefox to handle both names for more than a 
year or two).

Re: Updates to File API

2010-05-19 Thread Robin Berjon

Hi Arun,

On May 13, 2010, at 14:27 , Arun Ranganathan wrote:
 I have updated the editor's draft of the File API to reflect changes that 
 have been in discussion.

Cool, thanks!

 ArrayBuffers, and affiliated Typed Array views of data, are specified in a 
 working draft as a part of the WebGL work [1].  This work has been proposed 
 to ECMA's TC-39 WG as well.  We intend to implement some of this in the 
 Firefox 4 timeframe, and have reason to believe other browsers will as well.  
 I have thus cited the work as a normative reference [1]

The TA draft doesn't include any copyright or licensing information. I take it 
that the plan is to eventually have it at some stable URL accessible to all and 
under an RF license?

 3. The renaming of the property to 'url' also suggests that we should cease 
 to consider an urn:uuid scheme.

I'm not sure that one follows from the other. The property's called 'url' 
because that's what will be familiar to authors, but the magic string that goes 
inside of it could still be a URN.

  I solicited implementer feedback about URLs vs. URNs in general.  There was 
 a general preference for URLs [5], though this wasn't a strong preference.   
 Moreover, Mozilla's implementation currently uses moz-filedata: .  The 
 current draft has an editor's note about the use of HTTP semantics, and 
 origin issues in the context of shared workers.  This is work in progress; I 
 have removed the section specifying urn:uuid and hope to have an update with 
 a section covering the filedata: scheme (with filedata:uuid as a suggestion). 
  I welcome discussion about this.  I'll point out that we are coining a new 
 scheme, which we originally sought to avoid :-)

I don't really have a strong preference, but I believe that registering a URN 
namespace (in the case where we would go for urn:file-data: instead of 
urn:uuid:) is easier than registering a URI scheme. Since I have a strong 
feeling that you'll be the one who'll end up doing that work, you might want to 
take that into consideration ;-) Implementation-wise I can see how some might 
have the plumbing in place to dispatch depending on URI schemes but not for 
URNs.

Unless someone has a strong feeling (i.e. not bikeshedding) on this I would 
suggest closing this issue and leaving it up to the editor.

 Is using a subset of HTTP response codes acceptable practice, or should we 
 forgo response codes in this specification?


That seems to risk getting you close to specifying the behaviour of file: :) 
The problem of forgoing response codes is that it breaks a number of libraries. 
For instance (IIRC), the following never calls you back: 
$.get("file:///foo.html", cb) because jQuery never detects a successful fetch 
of the file (even though the underlying XHR may have succeeded) — the same 
would apply to filedata:. A subset of HTTP has the downside that it should 
ideally be consistent. Maybe that can be done with brutality? Reject any method 
other than GET with a 405, return 400 for any header the author sets that 
involves conditional or negotiated responses, 404 if the URI doesn't exist, and 
200 if it does. Only set the response headers that match information already 
exposed on the Blob.

 Editorial note
 Issue: if it is determined that the type attribute is one of text/html, 
 text/xml, or application/xml then the specification should allow HTML5 
 [HTML5] parsing (creation of Document) or XML parsing specified in XML 
 specifications. Should there be normative text for this?


I'm not sure I follow the intent exactly here, do you mean adding something 
like readAsDocument()? That sounds nice (and ought to work for +xml types as 
well) but is it essential enough?

-- 
Robin Berjon - http://berjon.com/

Re: Updates to File API

2010-05-19 Thread Arun Ranganathan


Robin,

ArrayBuffers, and affiliated Typed Array views of data, are specified in a 
working draft as a part of the WebGL work [1].  This work has been proposed to ECMA's 
TC-39 WG as well.  We intend to implement some of this in the Firefox 4 timeframe, and 
have reason to believe other browsers will as well.  I have thus cited the work as a 
normative reference [1]
 

The TA draft doesn't include any copyright or licensing information. I take it 
that the plan is to eventually have it at some stable URL accessible to all and 
under an RF license?
   


Yes!  Technical details are currently being hashed out on 
es-disc...@mozilla.org (the general ECMAScript discussion forum).  I 
expect it to have a more formal home, and of course, an RF license.  
This is something we should fix in the short term as well, and I'll 
raise this through the WebGL WG.
   

3. The renaming of the property to 'url' also suggests that we should cease to 
consider an urn:uuid scheme.
 

I'm not sure that one follows from the other. The property's called 'url' 
because that's what will be familiar to authors, but the magic string that goes 
inside of it could still be a URN.
   


I agree that this is probably workable.  (And thanks for commenting on 
this issue :-) )


I don't really have a strong preference, but I believe that registering a URN 
namespace (in the case where we would go for urn:file-data: instead of 
urn:uuid:) is easier than registering a URI scheme. Since I have a strong 
feeling that you'll be the one who'll end up doing that work, you might want to 
take that into consideration ;-)


If we do go with a URN for the .url property, then I'm not sure what 
benefit is gained from registering a new URN namespace (since we could 
use urn:uuid:).  One advantage of using urn:uuid was that the new 
technology overhead was low.  At the moment, I'm torn on this, but I'll 
note that implementations are proceeding with what looks like a new 
scheme (or at least what could be a new URN namespace).


Again, implementor feedback is welcome, but the point you make below is 
what I think is true for other implementations (but not necessarily 
Firefox):



Implementation-wise I can see how some might have the plumbing in place to 
dispatch depending on URI schemes but not for URNs.
   


+1 (again, not true of Firefox, where it doesn't really make a difference).


Unless someone has a strong feeling (i.e. not bikeshedding) on this I would 
suggest closing this issue and leaving it up to the editor.

   


Thanks :)

Is using a subset of HTTP response codes acceptable practice, or should we 
forgo response codes in this specification?
 


That seems to risk getting you close to specifying the behaviour of file: :) The problem 
of forgoing response codes is that it breaks a number of libraries. For instance (IIRC), 
the following never calls you back: $.get("file:///foo.html", cb) because 
jQuery never detects a successful fetch of the file (even though the underlying XHR may 
have succeeded) — the same would apply to filedata:. A subset of HTTP has the downside 
that it should ideally be consistent. Maybe that can be done with brutality? Reject any 
method other than GET with a 405, return 400 for any header the author sets that involves 
conditional or negotiated responses, 404 if the URI doesn't exist, and 200 if it does. 
Only set the response headers that match information already exposed on the Blob.
   


All good suggestions.  I think the subset can be determined by 
researching what XHR is used for within file:///, which is how I'm 
currently proceeding.  I agree with GET + strict subset of responses.  
Information set on Blob is likely to only include Content-Type (for now).
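
A minimal sketch of the "GET + strict subset of responses" idea discussed above, written as a plain function rather than as anything a UA actually implements. The registry contents, the function name respond, and the exact list of rejected header names are assumptions made for illustration; the thread only fixes the status codes (405, 400, 404, 200) and suggests Content-Type as the one response header.

```javascript
// Hypothetical registry mapping live Blob URLs to information already
// exposed on the Blob. The URL below is an illustrative value only.
const blobRegistry = new Map([
  ["filedata:550e8400-e29b-41d4-a716-446655440000", { type: "image/png" }],
]);

// Author-set request headers that ask for conditional or negotiated
// responses. This particular list is an assumption, not taken from a spec.
const REJECTED_HEADERS = new Set([
  "if-match", "if-none-match", "if-modified-since", "if-unmodified-since",
  "if-range", "accept", "accept-charset", "accept-encoding", "accept-language",
]);

// Returns the status code and response headers for a request to a Blob URL.
function respond(method, url, authorHeaders = {}) {
  // Any method other than GET is rejected outright.
  if (method.toUpperCase() !== "GET") return { status: 405, headers: {} };
  // Conditional or negotiated requests are not supported.
  for (const name of Object.keys(authorHeaders)) {
    if (REJECTED_HEADERS.has(name.toLowerCase())) {
      return { status: 400, headers: {} };
    }
  }
  const blob = blobRegistry.get(url);
  if (!blob) return { status: 404, headers: {} };
  // Only expose what the Blob already exposes: its type.
  return { status: 200, headers: { "Content-Type": blob.type } };
}
```

With this shape, a library that checks for a 200 status (as jQuery does) would see a successful fetch for a live URL, and anything outside the subset fails loudly instead of silently.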
   

Editorial note
Issue: if it is determined that the type attribute is one of text/html, 
text/xml, or application/xml then the specification should allow HTML5 [HTML5] 
parsing (creation of Document) or XML parsing specified in XML specifications. 
Should there be normative text for this?
 


I'm not sure I follow the intent exactly here, do you mean adding something 
like readAsDocument()? That sounds nice (and ought to work for +xml types as 
well) but is it essential enough?
   


User-agents do type determination on files, and if it is discovered that 
the file in question (or Blob) is an HTML file or an XML file, we should 
probably follow those rules.  I don't think we need a readAsDocument( ), 
since readAsText, which gives you a string, might be enough.


-- A*

Re: Updates to File API

2010-05-19 Thread Jonas Sicking

On Wed, May 19, 2010 at 1:09 PM, Arun Ranganathan a...@mozilla.com wrote:
 3. The renaming of the property to 'url' also suggests that we should
 cease to consider an urn:uuid scheme.


 I'm not sure that one follows from the other. The property's called 'url'
 because that's what will be familiar to authors, but the magic string that
 goes inside of it could still be a URN.


 I agree that this is probably workable.  (And thanks for commenting on this
 issue :-) )

I agree with Robin. We should definitely not get into defining things
with paths and stuff. I don't have a strong opinion about what the
scheme should be, but we definitely want it to be some sort of unique
identifier plus a prefix.
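
As a rough illustration of "some sort of unique identifier plus a prefix", the sketch below mints such a string. The function names are hypothetical, the default prefix filedata: is only the suggestion floated in this thread, and the fallback generator uses Math.random, which is not cryptographically strong; a real UA would presumably use its own UUID source.

```javascript
// Fallback v4-style identifier for environments without crypto.randomUUID.
// Math.random is used only to keep the sketch self-contained.
function fallbackUuid() {
  return "xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx".replace(/[xy]/g, (c) => {
    const r = Math.floor(Math.random() * 16);
    // "y" positions carry the variant bits (8, 9, a or b).
    return (c === "x" ? r : (r & 0x3) | 0x8).toString(16);
  });
}

// Builds an opaque identifier: prefix + UUID, with no path or hierarchy.
function mintBlobUrl(prefix = "filedata:") {
  const hasNative =
    typeof crypto !== "undefined" && typeof crypto.randomUUID === "function";
  const id = hasNative ? crypto.randomUUID() : fallbackUuid();
  return prefix + id;
}
```

Calling mintBlobUrl("urn:uuid:") instead gives the URN flavour discussed earlier, which shows that the choice between the two is mostly about the prefix and its registration, not about the identifier itself.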

 I don't really have a strong preference, but I believe that registering a
 URN namespace (in the case where we would go for urn:file-data: instead of
 urn:uuid:) is easier than registering a URI scheme. Since I have a strong
 feeling that you'll be the one who'll end up doing that work, you might want
 to take that into consideration ;-)

 If we do go with a URN for the .url property, then I'm not sure what benefit
 is gained from registering a new URN namespace (since we could use
 urn:uuid:).  One advantage of using urn:uuid was that the new technology
 overhead was low.  At the moment, I'm torn on this, but I'll note that
 implementations are proceeding with what looks like a new scheme (or at
 least what could be a new URN namespace).

 Again, implementor feedback is welcome, but the point you make below is what
 I think is true for other implementations (but not necessarily Firefox):

For what it's worth, implementing a new scheme would be easier in
firefox too. However I don't care strongly as either solution is still
implementable.

 Implementation-wise I can see how some might have the plumbing in place to
 dispatch depending on URI schemes but not for URNs.


 +1 (again, not true of Firefox, where it doesn't really make a difference).

See above.

 Unless someone has a strong feeling (i.e. not bikeshedding) on this I
 would suggest closing this issue and leaving it up to the editor.

I agree.

/ Jonas

Re: Updates to File API

2010-05-18 Thread Eric Uhrhane

On Mon, May 17, 2010 at 3:37 PM, Dmitry Titov dim...@chromium.org wrote:
 I have a couple of questions, mostly clarifications I think:
 1. FileReader takes a Blob, but there are multiple hints that the blob should
 actually be a 'file'. The Blob concept is growing in popularity, with specs
 such as FileWriter defining BlobBuilder. Other proposals include Image
 Resizing, which returns a Blob with compressed image data. Can all types of
 Blobs be 'read' using FileReader? If not, then it would be logical to only
 accept a File parameter. If any type of Blob can be read (as I think the
 spirit of the spec assumes), then would it be less confusing to change the
 name to BlobReader?

I'd support that.  I think we always want to allow reading of any type
of Blob--it's the interchange container--so calling it BlobReader
makes sense.  Arun, how do you feel about that?

Would FileWriter ever be used to write anything other than a File?  I
think not, so it should probably stay as it is, despite the lack of
symmetry.

 2. The FileReader.result is a string. There are cases where it would be
 useful to read the data as an ArrayBuffer, for example if a page tries to
 parse a JPG file to extract the EXIF metadata. Maybe returning a Blob that
 can later be asked for an ArrayBuffer would be as good.

You're going to give a Blob to a FileReader, and get the same Blob back?

 Dmitry
 On Fri, May 14, 2010 at 11:52 AM, Arun Ranganathan a...@mozilla.com wrote:

 On 5/13/10 9:32 PM, Darin Fisher wrote:

 Glad to hear that you didn't intend sync access :-)


 I have thoughts on Blob and how it should behave (and about the
 inheritance relationship between Blob and File), which is why I left the
 unfortunate error in the editor's draft for now (commented out and
 caveated).  This is the subject of a separate email thread (but don't worry
 -- while my thoughts on Blob and ArrayBuffer may be in some flux, sync
 access to File objects is *always* going to be a no-no, I promise :-) ).

 Now aside from the Blob - ArrayBuffer relationship, which I introduced,
 the rest of the changes are in keeping with threads discussing the File API.

 Can you define the contentType parameter to slice better?  Is that intended
 to correspond to the value of a HTTP Content-Type response header?  For
 example, can the contentType value include a charset attribute?  It might be
 useful to indicate that a slice of a file should be treated as text/html
 with a specific encoding.



 I'm happy to define it better in terms of what it *should* be, but web
 developers are likely to use it in ways that we can't predict, which is why
 forcing Content-Types is useful, but weird.  What exactly do you mean when
 you say that a slice of a file should be treated as text/html with a
 specific encoding?  Can you give me a use case that illustrates why this is
 a good way to define this?

 I'm also a fan of providing a way to specify optional Content-Disposition
 parameters in the slice call.

 So I'm really not a Content-Disposition fan, since all the use cases I've
 seen so far seem to be to force download behavior (or trigger Download
 Manager).  Is there something I'm missing -- e.g. is there something here
 that FileWriter or BlobBuilder do *not* address, that putting
 Content-Disposition on Blob URLs *does* address?  Sorry if I'm missing
 something obvious.

 -- A*

Re: Updates to File API

2010-05-18 Thread Eric Uhrhane

On Fri, May 14, 2010 at 11:52 AM, Arun Ranganathan a...@mozilla.com wrote:
 On 5/13/10 9:32 PM, Darin Fisher wrote:

 Glad to hear that you didn't intend sync access :-)


 I have thoughts on Blob and how it should behave (and about the inheritance
 relationship between Blob and File), which is why I left the unfortunate
 error in the editor's draft for now (commented out and caveated).  This is
 the subject of a separate email thread (but don't worry -- while my thoughts
 on Blob and ArrayBuffer may be in some flux, sync access to File objects is
 *always* going to be a no-no, I promise :-) ).

 Now aside from the Blob - ArrayBuffer relationship, which I introduced, the
 rest of the changes are in keeping with threads discussing the File API.

 Can you define the contentType parameter to slice better?  Is that intended
 to correspond to the value of a HTTP Content-Type response header?  For
 example, can the contentType value include a charset attribute?  It might be
 useful to indicate that a slice of a file should be treated as text/html
 with a specific encoding.



 I'm happy to define it better in terms of what it *should* be, but web
 developers are likely to use it in ways that we can't predict, which is why
 forcing Content-Types is useful, but weird.  What exactly do you mean when
 you say that a slice of a file should be treated as text/html with a
 specific encoding?  Can you give me a use case that illustrates why this is
 a good way to define this?

I can't speak for Darin, but I'd think the same reasoning that applies
whenever a server adds those headers via HTTP should apply whenever a
client-side app wants to add them to a Blob.url.

 I'm also a fan of providing a way to specify optional Content-Disposition
 parameters in the slice call.

 So I'm really not a Content-Disposition fan, since all the use cases I've
 seen so far seem to be to force download behavior (or trigger Download
 Manager).  Is there something I'm missing -- e.g. is there something here
 that FileWriter or BlobBuilder do *not* address, that putting
 Content-Disposition on Blob URLs *does* address?  Sorry if I'm missing
 something obvious.

It is indeed generally intended to trigger Download Manager.  If you
take a look at my use case at [1], the idea is to give web developers
a facility that's just like the one they're already using, so that
anything they do with URLs for files online they can also do with URLs
for Blobs offline/client-side.

The FileWriter spec's a bit up in the air over the same issue; I
haven't yet specced a good way for FileWriter to solve this problem,
so it's hard to say it's going to handle it better.

 Eric

[1] http://lists.w3.org/Archives/Public/public-webapps/2010AprJun/0412.html

Re: Updates to File API

2010-05-18 Thread Arun Ranganathan


On 5/18/10 2:35 PM, Eric Uhrhane wrote:

On Mon, May 17, 2010 at 3:37 PM, Dmitry Titov dim...@chromium.org wrote:
   

I have a couple of questions, mostly clarifications I think:
1. FileReader takes a Blob, but there are multiple hints that the blob should
actually be a 'file'. The Blob concept is growing in popularity, with specs
such as FileWriter defining BlobBuilder. Other proposals include Image
Resizing, which returns a Blob with compressed image data. Can all types of
Blobs be 'read' using FileReader? If not, then it would be logical to only
accept a File parameter. If any type of Blob can be read (as I think the
spirit of the spec assumes), then would it be less confusing to change the
name to BlobReader?
 

I'd support that.  I think we always want to allow reading of any type
of Blob--it's the interchange container--so calling it BlobReader
makes sense.  Arun, how do you feel about that?
   


The FileReader object accepts File objects for DataURL-reads, and Blob 
objects for binary string, text, and binary reads.  I agree that having 
a name like FileReader is generally a bit confusing, given that we do 
allow Blobs to be read, including Blobs which aren't directly coined 
from files.  Blob itself isn't a great name, though it's a stand-in for 
Binary Large Object.


Aside from the slight bikeshed-ish nature of this discussion, there are 
implementations in circulation that already use the name FileReader 
(e.g. Firefox 3.6.3).  This doesn't mean I'm against changing it, but I 
do wish the name change suggestion came earlier.  Also, I'm keen that 
the main object name addresses the initial use case -- reading file 
objects.  Perhaps in the future Blobs that are not files will be the 
norm; maybe then, Blob APIs will evolve, including implementations with 
ArrayBuffer and potential streaming use cases getting addressed better.


Perhaps it is late to have a name change, and we've added to 
less-than-adequate naming on the Web (example: XMLHttpRequest).



Would FileWriter ever be used to write anything other than a File?  I
think not, so it should probably stay as it is, despite the lack of
symmetry.

   

2. The FileReader.result is a string.
 


Actually, in my next draft, I will have FileReader.result be of type 
'any' (WebIDL's 'any') since it could also be an ArrayBuffer (using the 
readAsBinary method, which will function like the other asynchronous read 
methods, but read into ArrayBuffers across the ProgressEvent spectrum).
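
To make the EXIF use case concrete, here is a hedged sketch of what a page could do once a read hands it an ArrayBuffer. It deliberately takes only the buffer, because the name and shape of the read method (readAsBinary, readAsArrayBuffer) are still open in this thread. The function name hasExifSegment is hypothetical, and the check is intentionally shallow: it looks for a JPEG start marker and an APP1 segment whose payload begins with "Exif", nothing more.

```javascript
// Returns true if the buffer starts like a JPEG and contains an APP1
// segment whose payload begins with the ASCII bytes "Exif".
function hasExifSegment(buffer) {
  const view = new DataView(buffer);
  // A JPEG begins with the SOI marker 0xFFD8 (DataView reads big-endian
  // by default, which matches the JPEG byte order).
  if (view.byteLength < 4 || view.getUint16(0) !== 0xffd8) return false;
  let offset = 2;
  while (offset + 4 <= view.byteLength) {
    const marker = view.getUint16(offset);
    if ((marker & 0xff00) !== 0xff00) return false; // not a marker, give up
    const length = view.getUint16(offset + 2); // counts the 2 length bytes
    if (marker === 0xffe1 && offset + 8 <= view.byteLength) {
      // 0x45786966 is "Exif" in ASCII.
      if (view.getUint32(offset + 4) === 0x45786966) return true;
    }
    if (marker === 0xffda) return false; // start of scan, metadata is over
    if (length < 2) return false; // malformed segment length
    offset += 2 + length;
  }
  return false;
}
```

The same check against a binary string would need charCodeAt and manual byte arithmetic, which is roughly the argument for letting result be an ArrayBuffer.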


-- A*

Re: Updates to File API

2010-05-18 Thread Dmitry Titov

On Tue, May 18, 2010 at 2:56 PM, Arun Ranganathan a...@mozilla.com wrote:

 On 5/18/10 2:35 PM, Eric Uhrhane wrote:

 On Mon, May 17, 2010 at 3:37 PM, Dmitry Titov dim...@chromium.org wrote:


 I have a couple of questions, mostly clarifications I think:
 1. FileReader takes a Blob, but there are multiple hints that the blob should
 actually be a 'file'. The Blob concept is growing in popularity, with specs
 such as FileWriter defining BlobBuilder. Other proposals include Image
 Resizing, which returns a Blob with compressed image data. Can all types of
 Blobs be 'read' using FileReader? If not, then it would be logical to only
 accept a File parameter. If any type of Blob can be read (as I think the
 spirit of the spec assumes), then would it be less confusing to change the
 name to BlobReader?


 I'd support that.  I think we always want to allow reading of any type
 of Blob--it's the interchange container--so calling it BlobReader
 makes sense.  Arun, how do you feel about that?



 The FileReader object accepts File objects for DataURL-reads, and Blob
 objects for binary string, text, and binary reads.  I agree that having a
 name like FileReader is generally a bit confusing, given that we do allow
 Blobs to be read, including Blobs which aren't directly coined from files.
  Blob itself isn't a great name, though it's a stand-in for Binary Large
 Object.

 Aside from the slight bikeshed-ish nature of this discussion, there are
 implementations in circulation that already use the name FileReader (e.g.
 Firefox 3.6.3).  This doesn't mean I'm against changing it, but I do wish
 the name change suggestion came earlier.  Also, I'm keen that the main
 object name addresses the initial use case -- reading file objects.  Perhaps
 in the future Blobs that are not files will be the norm; maybe then, Blob
 APIs will evolve, including implementations with ArrayBuffer and potential
 streaming use cases getting addressed better.

 Perhaps it is late to have a name change, and we've added to
 less-than-adequate naming on the Web (example: XMLHttpRequest).


Ok, I can see how it can be late if FF already shipped it... Perhaps the
spec could at least avoid using 'fileBlob' as names of arguments, since the
naming currently may be interpreted as  if only file-backed blobs are
welcome :-)




  Would FileWriter ever be used to write anything other than a File?  I
 think not, so it should probably stay as it is, despite the lack of
 symmetry.



 2. The FileReader.result is a string.



 Actually, in my next draft, I will have FileReader.result be of type 'any'
 (WebIDL's 'any') since it could also be an ArrayBuffer (using the
 readAsBinary method, which will function like the other asynchronous read
 methods, but read into ArrayBuffers across the ProgressEvent spectrum).


Getting an ArrayBuffer on each ProgressEvent could be a cool idea indeed. I
guess when we have ArrayBuffers we'll be able to use them in BlobBuilder as
well.



 -- A*

Re: Updates to File API

2010-05-14 Thread Arun Ranganathan


On 5/13/10 9:32 PM, Darin Fisher wrote:

Glad to hear that you didn't intend sync access :-)
   


I have thoughts on Blob and how it should behave (and about the 
inheritance relationship between Blob and File), which is why I left the 
unfortunate error in the editor's draft for now (commented out and 
caveated).  This is the subject of a separate email thread (but don't 
worry -- while my thoughts on Blob and ArrayBuffer may be in some flux, 
sync access to File objects is *always* going to be a no-no, I promise 
:-) ).


Now aside from the Blob - ArrayBuffer relationship, which I introduced, 
the rest of the changes are in keeping with threads discussing the File API.



Can you define the contentType parameter to slice better?  Is that intended
to correspond to the value of a HTTP Content-Type response header?  For
example, can the contentType value include a charset attribute?  It might be
useful to indicate that a slice of a file should be treated as text/html
with a specific encoding.

   


I'm happy to define it better in terms of what it *should* be, but web 
developers are likely to use it in ways that we can't predict, which is 
why forcing Content-Types is useful, but weird.  What exactly do you 
mean when you say that a slice of a file should be treated as text/html 
with a specific encoding?  Can you give me a use case that illustrates 
why this is a good way to define this?



I'm also a fan of providing a way to specify optional Content-Disposition
parameters in the slice call.


So I'm really not a Content-Disposition fan, since all the use cases 
I've seen so far seem to be to force download behavior (or trigger 
Download Manager).  Is there something I'm missing -- e.g. is there 
something here that FileWriter or BlobBuilder do *not* address, that 
putting Content-Disposition on Blob URLs *does* address?  Sorry if I'm 
missing something obvious.


-- A*

Updates to File API

2010-05-13 Thread Arun Ranganathan

Greetings WebApps WG,

I have updated the editor's draft of the File API to reflect changes
that have been in discussion.

http://dev.w3.org/2006/webapi/FileAPI

Notably:

1. Blobs now allow further binary data operations by exposing an
ArrayBuffer property that represents the Blob. ArrayBuffers, and
affiliated Typed Array views of data, are specified in a working draft
as a part of the WebGL work [1]. This work has been proposed to ECMA's
TC-39 WG as well. We intend to implement some of this in the Firefox 4
timeframe, and have reason to believe other browsers will as well. I
have thus cited the work as a normative reference [1]. Eventually, we
ought to consider further read operations given ArrayBuffers, but for
now, I believe exposing Blobs in this way is sufficient.

2. url and type properties have been moved to the underlying Blob
interface. Notably, the property is now called 'url' and not 'urn.'
Use cases for triggering 'save as' behavior with Content-Disposition
have not been addressed[2], although I believe that with FileWriter and
BlobBuilder[3] they may be addressed differently. This change reflects
lengthy discussion (e.g. start here[4])

3. The renaming of the property to 'url' also suggests that we should
cease to consider an urn:uuid scheme. I solicited implementer feedback
about URLs vs. URNs in general. There was a general preference for
URLs[5], though this wasn't a strong preference. Moreover, Mozilla's
implementation currently uses moz-filedata: . The current draft has an
editor's note about the use of HTTP semantics, and origin issues in the
context of shared workers. This is work in progress; I have removed the
section specifying urn:uuid and hope to have an update with a section
covering the filedata: scheme (with filedata:uuid as a suggestion). I
welcome discussion about this. I'll point out that we are coining a new
scheme, which we originally sought to avoid :-)

4. I have changed event order; loadend now fires after an error event [6].

-- A*

[1]
https://cvs.khronos.org/svn/repos/registry/trunk/public/webgl/doc/spec/TypedArray-spec.html

[2] http://www.mail-archive.com/public-webapps@w3.org/msg06137.html
[3] http://dev.w3.org/2009/dap/file-system/file-writer.html
[4] http://lists.w3.org/Archives/Public/public-webapps/2010JanMar/0910.html
[5] http://lists.w3.org/Archives/Public/public-webapps/2009OctDec/0462.html
[6] http://lists.w3.org/Archives/Public/public-webapps/2010AprJun/0062.html

Re: Updates to File API

2010-05-13 Thread Arun Ranganathan


On 5/13/10 7:37 AM, David Levin wrote:

On Thu, May 13, 2010 at 5:27 AM, Arun Ranganathan a...@mozilla.com wrote:

   

Greetings WebApps WG,

I have updated the editor's draft of the File API to reflect changes that
have been in discussion.

http://dev.w3.org/2006/webapi/FileAPI

Notably:

1. Blobs now allow further binary data operations by exposing an
ArrayBuffer property that represents the Blob.
 


Does this imply *sync* access to the blob data?
new DataArray(blob.blobBuffer).getInt8(0);
   


Sync. access to a Blob shouldn't be allowed; this is a *big* oversight 
on my part, and I think how the property is exposed should be considered 
better.

Also, does it imply the ability to modify the blob contents? (If so, what
does this mean when there is a file backing it?)
new DataArray(blob.blobBuffer).setInt8(0, 0);
   


This is part of the same oversight (evident in the editor's draft).

I think this aspect of things should be left to BlobBuilder or FileWriter.

-- A*


Thanks, dave

Re: Updates to File API

2010-05-13 Thread Arun Ranganathan


On 5/13/10 7:37 AM, David Levin wrote:

On Thu, May 13, 2010 at 5:27 AM, Arun Ranganathan a...@mozilla.com wrote:

   

Greetings WebApps WG,

I have updated the editor's draft of the File API to reflect changes that
have been in discussion.

http://dev.w3.org/2006/webapi/FileAPI

Notably:

1. Blobs now allow further binary data operations by exposing an
ArrayBuffer property that represents the Blob.
 


Does this imply *sync* access to the blob data?
new DataArray(blob.blobBuffer).getInt8(0);
   


A more sensible way is an additional asynchronous read method on 
FileReader, which is what I should have done in the first place.  Here, 
partial data is going to be an interesting question.  While partial 
strings make sense (for readAsBinaryString and readAsText), partial 
ArrayBuffers get us into a different area altogether.  Any thoughts on 
partial reads here?


For now, I've caveated my (pretty major) mistake with an editor's note.  
I'll update later today with a better way to expose this, but I'm 
thinking something like readAsArrayBuffer on FileReader (with an open 
question on partial reads).

Also, does it imply the ability to modify the blob contents? (If so, what
does this mean when there is a file backing it?)
new DataArray(blob.blobBuffer).setInt8(0, 0);
   


I'll let Eric speak to what BlobBuilder might want to do, but I'll 
strongly disallow it in my draft :)


-- A*

Thanks, dave

Re: Updates to File API

2010-05-13 Thread J Ross Nicoll

On 13 May 2010, at 13:27, Arun Ranganathan wrote:

 Greetings WebApps WG,
 
 I have updated the editor's draft of the File API to reflect changes that 
 have been in discussion.
 
 http://dev.w3.org/2006/webapi/FileAPI
 
 Notably:
 
 1. Blobs now allow further binary data operations by exposing an ArrayBuffer 
 property that represents the Blob.  ArrayBuffers, and affiliated Typed 
 Array views of data, are specified in a working draft as a part of the WebGL 
 work [1].  This work has been proposed to ECMA's TC-39 WG as well.  We intend 
 to implement some of this in the Firefox 4 timeframe, and have reason to 
 believe other browsers will as well.  I have thus cited the work as a 
 normative reference [1].  Eventually, we ought to consider further read 
 operations given ArrayBuffers, but for now, I believe exposing Blobs in this 
 way is sufficient.

Why remove the 'type' attribute from the File? Specifically, is there a real 
issue with duplicating the information in both the File and the Blob? Two main 
concerns:

Without 'type' in the File attribute, you have to read the file to understand 
what's in it. This means that if you want to, for example, produce a 
confirmation dialogue for the user before reading a file, it's very limited in 
how much information it can show (I also think it would be a good idea to have 
'size' as an attribute on the File, for related reasons).

At the moment, if a directory is dragged and dropped into Firefox, the 
only way to spot this appears to be the 'type' attribute (which is empty). 
Unless I'm missing something, as written this would appear to mean the JS has 
to try reading a directory, get a Blob back (which Firefox does do, at least) 
and then it can find out it didn't actually read a file.



Looking at synchronized file reading... would it perhaps make more sense to 
have readAsBinaryString(), readAsText() and readAsDataURL() as methods on the 
File, rather than a specific separate interface (FileReaderSync)?

Re: Updates to File API

2010-05-13 Thread Jonas Sicking

On Thu, May 13, 2010 at 1:50 PM, J Ross Nicoll j...@jrn.me.uk wrote:
 On 13 May 2010, at 13:27, Arun Ranganathan wrote:

 Greetings WebApps WG,

 I have updated the editor's draft of the File API to reflect changes that 
 have been in discussion.

 http://dev.w3.org/2006/webapi/FileAPI

 Notably:

 1. Blobs now allow further binary data operations by exposing an ArrayBuffer 
 property that represents the Blob.  ArrayBuffers, and affiliated Typed 
 Array views of data, are specified in a working draft as a part of the 
 WebGL work [1].  This work has been proposed to ECMA's TC-39 WG as well.  We 
 intend to implement some of this in the Firefox 4 timeframe, and have reason 
 to believe other browsers will as well.  I have thus cited the work as a 
 normative reference [1].  Eventually, we ought to consider further read 
 operations given ArrayBuffers, but for now, I believe exposing Blobs in this 
 way is sufficient.

 Why remove the 'type' attribute from the File? Specifically, is there a real 
 issue with duplicating the information in both the File and the Blob? Two 
 main concerns:

File inherits Blob, so everything that is available on Blob is
available on File. This is similar to how HTMLElement inherits
Element. getAttribute is available on HTMLElement, despite being
defined on Element.
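
The inheritance point can be shown with two stand-in classes. SketchBlob and SketchFile are hypothetical names used so that the example does not depend on any browser's actual Blob or File; the only thing it is meant to show is that attributes defined on the base interface remain readable on the derived one, without reading any file contents.

```javascript
// Stand-in for the Blob interface: carries size and type.
class SketchBlob {
  constructor(size, type) {
    this.size = size;
    this.type = type;
  }
}

// Stand-in for File: adds a name, inherits size and type.
class SketchFile extends SketchBlob {
  constructor(name, size, type) {
    super(size, type);
    this.name = name;
  }
}

// Builds the text for a confirmation dialogue from metadata alone.
function describe(file) {
  return `${file.name} (${file.type || "unknown type"}, ${file.size} bytes)`;
}
```

This also covers the confirmation-dialogue concern raised above: describe() needs only name, type and size, all of which are available on the File before any read happens.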

/ Jonas

Re: Updates to File API

2010-05-13 Thread Arun Ranganathan


On 5/13/10 1:50 PM, J Ross Nicoll wrote:

On 13 May 2010, at 13:27, Arun Ranganathan wrote:

   

Greetings WebApps WG,

I have updated the editor's draft of the File API to reflect changes that have 
been in discussion.

http://dev.w3.org/2006/webapi/FileAPI

Notably:

1. Blobs now allow further binary data operations by exposing an ArrayBuffer property 
that represents the Blob.  ArrayBuffers, and affiliated Typed Array views of 
data, are specified in a working draft as a part of the WebGL work [1].  This work has 
been proposed to ECMA's TC-39 WG as well.  We intend to implement some of this in the 
Firefox 4 timeframe, and have reason to believe other browsers will as well.  I have thus 
cited the work as a normative reference [1].  Eventually, we ought to consider further 
read operations given ArrayBuffers, but for now, I believe exposing Blobs in this way is 
sufficient.
 

Why remove the 'type' attribute from the File? Specifically, is there a real 
issue with duplicating the information in both the File and the Blob? Two main 
concerns:

Without 'type' in the File attribute, you have to read the file to understand 
what's in it. This means that if you want to, for example, produce a 
confirmation dialogue for the user before reading a file, it's very limited in 
how much information it can show (I also think it would be a good idea to have 
'size' as an attribute on the File, for related reasons).

At the moment, if a directory is dragged and dropped into Firefox, the 
only way to spot this appears to be the 'type' attribute (which is empty). 
Unless I'm missing something, as written this would appear to mean the JS has 
to try reading a directory, get a Blob back (which Firefox does do, at least) 
and then it can find out it didn't actually read a file.
   


Currently, File inherits from Blob.



Looking at synchronous file reading... would it perhaps make more sense to 
have readAsBinaryString(), readAsText() and readAsDataURL() as methods on the 
File, rather than on a separate interface (FileReaderSync)?

   


FileReader is for asynchronous reads on the main thread.  FileReaderSync 
is for synchronous reads on worker threads.  We want to:


1. Decouple Files from the objects that read from them and
2. Disallow any synchronous File I/O on the main thread.

-- A*
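
As a rough illustration of the asynchronous side of the split described above, here is a minimal sketch of a non-blocking text read. The helper name readBlobAsText is hypothetical, and the fallback to Blob.text() is an assumption made only so the sketch also runs where FileReader is not defined (for example outside a browser); it is not something the draft specifies.

```javascript
// Sketch: asynchronous text read in the style of FileReader.
// readBlobAsText is a hypothetical helper name, not part of the File API.
function readBlobAsText(blob) {
  // Assumption: outside a browser FileReader may be missing, so fall back
  // to the promise-based Blob.text(), which is also non-blocking.
  if (typeof FileReader === "undefined") {
    return blob.text();
  }
  return new Promise((resolve, reject) => {
    const reader = new FileReader();
    reader.onload = () => resolve(reader.result);
    reader.onerror = () => reject(reader.error);
    reader.readAsText(blob); // returns immediately; the result arrives via onload
  });
}

readBlobAsText(new Blob(["hello, File API"])).then((text) => {
  console.log(text);
});
```

The reading object is separate from the data it reads, and nothing here blocks the calling thread. In a worker, the synchronous counterpart would be a call such as new FileReaderSync().readAsText(blob).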

Re: Updates to File API

2010-05-13 Thread Darin Fisher

Glad to hear that you didn't intend sync access :-)

Can you define the contentType parameter to slice better?  Is that intended
to correspond to the value of a HTTP Content-Type response header?  For
example, can the contentType value include a charset attribute?  It might be
useful to indicate that a slice of a file should be treated as text/html
with a specific encoding.
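
To make the question concrete, here is a small sketch of relabelling a slice as text/html with a charset parameter. It assumes Blob.slice(start, end, contentType) as found in current runtimes; the draft under discussion may define the arguments differently, and whether a charset parameter is meant to be allowed is exactly the open question. The variable names are illustrative only.

```javascript
// Sketch: attach a media type, including a charset parameter, to a slice.
// Assumes Blob.slice(start, end, contentType) as found in current runtimes;
// the draft under discussion may define the arguments differently.
const page = new Blob(["<p>hello</p><p>world</p>"], { type: "text/plain" });

// Take the first 12 bytes ("<p>hello</p>") and relabel them as UTF-8 HTML.
const firstParagraph = page.slice(0, 12, "text/html; charset=utf-8");

console.log(firstParagraph.size);
console.log(firstParagraph.type);
```

The original Blob keeps its own type; only the slice carries the new label.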

I'm also a fan of providing a way to specify optional Content-Disposition
parameters in the slice call.  It seems to me that Content-Disposition, like
Content-Type, impacts the way that Blob.url might be interpreted.  It is
useful to enable Blob.url to be able to replicate what you can do with
http:// URLs.  I think this would make it easier for apps to use
http:// URLs while online and Blob.url while offline without changing
the rest of their code.  I'm specifically thinking of use cases like
the download links for attachments in webmail apps.

Regards,
-Darin


On Thu, May 13, 2010 at 8:21 AM, Arun Ranganathan a...@mozilla.com wrote:

 On 5/13/10 7:37 AM, David Levin wrote:

 On Thu, May 13, 2010 at 5:27 AM, Arun Ranganathan a...@mozilla.com wrote:



 Greetings WebApps WG,

 I have updated the editor's draft of the File API to reflect changes that
 have been in discussion.

 http://dev.w3.org/2006/webapi/FileAPI

 Notably:

 1. Blobs now allow further binary data operations by exposing an
 ArrayBuffer property that represents the Blob.



 Does this imply *sync* access to the blob data?
 new DataArray(blob.blobBuffer).getInt8(0);



 A more sensible way is an additional asynchronous read method on
 FileReader, which is what I should have done in the first place.  Here,
 partial data is going to be an interesting question.  While partial strings
 make sense (for readAsBinaryString and readAsText), partial ArrayBuffers
 get us into a different area altogether.  Any thoughts on partial reads
 here?

 For now, I've caveated my (pretty major) mistake with an editor's note.
  I'll update later today with a better way to expose this, but I'm thinking
 something like readAsArrayBuffer on FileReader (with an open question on
 partial reads).
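
One possible way to think about the partial-read question is sketched below. This is not the proposed readAsArrayBuffer; instead it swaps in explicit slicing, so that each chunk arrives as a complete ArrayBuffer of its own. The names readInChunks, chunkSize and onChunk are hypothetical, and the sketch assumes a runtime with Blob.slice() and the promise-based Blob.arrayBuffer().

```javascript
// Sketch: read a Blob as a sequence of ArrayBuffer chunks.
// readInChunks, chunkSize and onChunk are hypothetical names for illustration.
// Assumes a runtime with Blob.slice() and the promise-based Blob.arrayBuffer().
async function readInChunks(blob, chunkSize, onChunk) {
  for (let offset = 0; offset < blob.size; offset += chunkSize) {
    const end = Math.min(offset + chunkSize, blob.size);
    // Each chunk is a complete ArrayBuffer for bytes [offset, end).
    const buffer = await blob.slice(offset, end).arrayBuffer();
    onChunk(buffer, offset);
  }
}

const data = new Blob([new Uint8Array([1, 2, 3, 4, 5])]);
readInChunks(data, 2, (buffer, offset) => {
  console.log(offset, Array.from(new Uint8Array(buffer)));
});
```

With this shape the caller never sees a half-filled buffer, which sidesteps rather than answers the question of what a partial ArrayBuffer should mean.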

  Also, does it imply the ability to modify the blob contents? (If so, what
 does this mean when there is a file backing it?)
 new DataArray(blob.blobBuffer).setInt8(0, 0);



 I'll let Eric speak to what BlobBuilder might want to do, but I'll strongly
 disallow it in my draft :)
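
For illustration, here is a minimal sketch of what disallowing modification can look like from script: writes go to a copy of the bytes, and the Blob itself is unaffected. It does not use the blob.blobBuffer property from the draft; it assumes the promise-based Blob.arrayBuffer() of current runtimes, and the function name demoWriteDoesNotReachBlob is hypothetical.

```javascript
// Sketch: writing through a DataView changes only the copied buffer.
// Assumes the promise-based Blob.arrayBuffer() of current runtimes, which
// hands back a copy of the bytes; the draft's blob.blobBuffer is not used.
async function demoWriteDoesNotReachBlob() {
  const blob = new Blob([new Uint8Array([7, 8, 9])]);

  const copy = await blob.arrayBuffer();
  new DataView(copy).setInt8(0, 0); // mutate the copy only

  // Read the Blob again to check that its first byte is still the original.
  const reread = new Uint8Array(await blob.arrayBuffer());
  return { copyFirst: new DataView(copy).getInt8(0), blobFirst: reread[0] };
}

demoWriteDoesNotReachBlob().then((result) => console.log(result));
```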

 -- A*

 Thanks, dave
