Re: Blob as URN was Re: [FileAPI] Latest Revision of Editor's Draft

2009-11-18 Thread Eric Uhrhane
On Tue, Nov 17, 2009 at 6:25 PM, Arun Ranganathan a...@mozilla.com wrote:
 Eric,


    I recall you saying at TPAC that you wanted to keep the Blob
 interface as small as possible, since it seemed likely to get used in
 a lot of places.  I think that's an excellent goal, but of course,
 having said that, I am immediately going to suggest that you add
 something to it.



 I'm definitely not averse to additions :)  Actually, I'm also following the
 discussion initiated by Maciej about BinaryData [2] with
 public-script-co...@w3.org.  Eventually, I believe ECMAScript will provide a
 Binary primitive (perhaps ByteArray), and I think Blob should expose that
 primitive.  This would be a natural extension of what I envision Blob to be
 used for.  I also envision natural streaming extensions to this API.

   How would you feel about exposing a way to produce a URN from a
 Blob, instead of just getting one from a File?

 I'm not averse to it.  In fact, it was originally in the Blob interface
 (which at that time, was dubbed FileData).  We moved it to the File
 interface since the understanding of use cases at the time was that  all
 URL consumers expect a full file. [3]   You've provided use cases that show
 that this isn't the case, and so we should revisit our earlier
 understanding.  More on your use cases below:

 This seems likely to
 have wide-ranging uses.  Pretty much anywhere you have a blob of data,
 you might want to hand it off to the browser, even if it didn't come
 from, or wasn't, a single user-supplied file.  Here are a few use
 cases, but I'm sure more won't be hard to come up with:

 * Viewing a single chapter of a book in a frame.
 * Slicing one episode out of a DVD and handing it to the video tag, so
 that the player controls start and end at the episode boundaries.
 * Analogous to the game-asset archive I mentioned at [1], one might
 pack a number of small files together to speed download [using only
 HTTP compression], then parse them apart on the client.  Picture a
 Picasa client written in the web browser; it's got to handle maybe
 1+ thumbnails, and putting each in a separate file would be
 terribly inefficient.  Pulling down a tarfile would be a lot quicker.



 I can understand why you'd want *partial* data exposed through a URL, and
 why your API may force a type on the partial data.  Question: would a
 fragment identifier scheme [4] address any of these use cases, or is this
 completely orthogonal to the use cases you envision?  I ask because you
 envision a chapter within a frame but I'm not sure what the frame data
 structure is.

Ah--I think I wasn't clear here.  I just pictured taking a file,
chopping out a chapter by, say, byte offset, and writing it into an
iframe for display.  I don't think a fragment identifier is powerful
enough, even if we spec out how to overload it.  For example, say you
wanted to chop a chapter out of an HTML document *and* scroll your
iframe to a specific page of that chapter?

One can also imagine use cases in which the Blob is constructed
completely from scratch by JavaScript, in which case there's no File
at all.
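
To make the byte-offset idea concrete, here is a rough sketch of pulling
one item out of a packed archive. It uses a plain Uint8Array rather than
the draft's Blob, and the Entry index, sliceEntry, and the two ASCII
helpers are hypothetical names invented for illustration; nothing here is
proposed API.

```typescript
// Hypothetical index entry: where one packed item lives inside the archive.
interface Entry {
  label: string;
  offset: number; // byte offset of the item's first byte
  length: number; // number of bytes in the item
}

// Return a view of one entry's bytes; subarray does not copy the buffer.
function sliceEntry(archive: Uint8Array, entry: Entry): Uint8Array {
  const end = entry.offset + entry.length;
  if (entry.offset < 0 || entry.length < 0 || end > archive.length) {
    throw new RangeError("entry " + entry.label + " is outside the archive");
  }
  return archive.subarray(entry.offset, end);
}

// ASCII-only helpers so the example needs no platform text codecs.
function toAscii(s: string): Uint8Array {
  const out = new Uint8Array(s.length);
  for (let i = 0; i < s.length; i++) out[i] = s.charCodeAt(i);
  return out;
}

function fromAscii(bytes: Uint8Array): string {
  let s = "";
  for (let i = 0; i < bytes.length; i++) s += String.fromCharCode(bytes[i]);
  return s;
}

// Two tiny "chapters" packed back to back, plus an index describing them.
const packed = toAscii("chapter onechapter two");
const chapters: Entry[] = [
  { label: "ch1", offset: 0, length: 11 },
  { label: "ch2", offset: 11, length: 11 },
];

const chapterTwo = fromAscii(sliceEntry(packed, chapters[1]));
```

The slice is only bytes; it carries no media type of its own, which is
why something has to say how the UA should present it.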

 it
 would seem natural to me to do something like this:

    interface Blob {
      ...
      DOMString getURN(in DOMString mediaType,
           [Optional] in DOMString contentDisposition,
           [Optional] in DOMString name);
    };

 Given that a File that one gets from the user will still tell you its
 name and detected mediaType, and can have a constant urn, there seems
 to be no conflict in leaving the File interface as-is and adding
 something like getURN to Blob.  On the off chance that you want to
 override the detected mediaType for a file, force a contentDisposition
 of attachment, or change the name, you might still use getURN there as
 well.


 To be clear: you want the File object's URN capabilities to inherit from
 Blob, and not be separate, correct?  Thus, each Blob has an affiliated URN,
 and when a Blob is a File, it uses the Blob's getURN method?

No, I picture both being there, although I'm open to discussion about
it.  The getURN method forces the user to specify how the data is to
be presented to the UA via the URN created.  In contrast, the urn
property of File is inherently based on what the UA knows about the
file, and the user need not specify anything in order to use it.

Hmm...I guess there's no reason that we couldn't make the mediaType
parameter optional as well, so that in the case of a File getURN could
just return what the UA thought was appropriate.  That makes me
nervous for Blob data, though...perhaps we should spec it to throw in
that case?
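
Roughly, the behavior I have in mind could be sketched like this. These
are stand-in classes, not the draft's interfaces: SketchBlob, SketchFile,
the served table, and the urn:example: identifiers are all made up for
illustration, and throwing a plain Error is just a placeholder for
whatever we would actually spec.

```typescript
// What the UA would be told about how to present the data behind a URN.
interface Presentation {
  mediaType: string;
  contentDisposition: string | undefined;
  name: string | undefined;
}

// Hypothetical UA-side table mapping each minted URN to its presentation.
const served = new Map<string, Presentation>();
let nextId = 0;

class SketchBlob {
  bytes: Uint8Array;
  constructor(bytes: Uint8Array) {
    this.bytes = bytes;
  }

  // A bare blob has nothing the UA could have detected.
  protected detectedType(): string | undefined {
    return undefined;
  }
  protected detectedName(): string | undefined {
    return undefined;
  }

  getURN(mediaType?: string, contentDisposition?: string, name?: string): string {
    const type = mediaType !== undefined ? mediaType : this.detectedType();
    if (type === undefined) {
      // Assumed behavior: refuse to guess a type for raw blob data.
      throw new Error("getURN: mediaType is required when no type was detected");
    }
    const urn = "urn:example:blob-" + nextId++; // placeholder identifier
    served.set(urn, {
      mediaType: type,
      contentDisposition: contentDisposition,
      name: name !== undefined ? name : this.detectedName(),
    });
    return urn;
  }
}

class SketchFile extends SketchBlob {
  fileName: string;
  fileType: string;
  private cachedUrn: string | undefined = undefined;

  constructor(bytes: Uint8Array, fileName: string, fileType: string) {
    super(bytes);
    this.fileName = fileName;
    this.fileType = fileType;
  }

  protected detectedType(): string | undefined {
    return this.fileType;
  }
  protected detectedName(): string | undefined {
    return this.fileName;
  }

  // The constant urn of a File: minted once from what the UA detected.
  get urn(): string {
    if (this.cachedUrn === undefined) this.cachedUrn = this.getURN();
    return this.cachedUrn;
  }
}
```

With this shape, new SketchBlob(bytes).getURN() throws, while the same
call on a SketchFile falls back to the detected type and name, and a
caller can still override either one or force a disposition.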

 Can you explain what a contentDisposition is a bit better?  Can you write
 some pseudo-code showing how contentDisposition is used, perhaps to flesh
 out the above use cases?

Sure.  Let's say that you're writing an offline version of GMail.
You've got a File referring to an image attachment, and you want to
offer links through which the user can either view 

Blob as URN was Re: [FileAPI] Latest Revision of Editor's Draft

2009-11-17 Thread Arun Ranganathan

Eric,



I recall you saying at TPAC that you wanted to keep the Blob
interface as small as possible, since it seemed likely to get used in
a lot of places.  I think that's an excellent goal, but of course,
having said that, I am immediately going to suggest that you add
something to it.
  



I'm definitely not averse to additions :)  Actually, I'm also following 
the discussion initiated by Maciej about BinaryData [2] with 
public-script-co...@w3.org.  Eventually, I believe ECMAScript will 
provide a Binary primitive (perhaps ByteArray), and I think Blob should 
expose that primitive.  This would be a natural extension of what I 
envision Blob to be used for.  I also envision natural streaming 
extensions to this API.



   How would you feel about exposing a way to produce a URN from a
Blob, instead of just getting one from a File?  


I'm not averse to it.  In fact, it was originally in the Blob interface 
(which at that time, was dubbed FileData).  We moved it to the File 
interface since the understanding of use cases at the time was that  
all URL consumers expect a full file. [3]   You've provided use cases 
that show that this isn't the case, and so we should revisit our earlier 
understanding.  More on your use cases below:



This seems likely to
have wide-ranging uses.  Pretty much anywhere you have a blob of data,
you might want to hand it off to the browser, even if it didn't come
from, or wasn't, a single user-supplied file.  Here are a few use
cases, but I'm sure more won't be hard to come up with:

* Viewing a single chapter of a book in a frame.
* Slicing one episode out of a DVD and handing it to the video tag, so
that the player controls start and end at the episode boundaries.
* Analogous to the game-asset archive I mentioned at [1], one might
pack a number of small files together to speed download [using only
HTTP compression], then parse them apart on the client.  Picture a
Picasa client written in the web browser; it's got to handle maybe
1+ thumbnails, and putting each in a separate file would be
terribly inefficient.  Pulling down a tarfile would be a lot quicker.

  


I can understand why you'd want *partial* data exposed through a URL, 
and why your API may force a type on the partial data.  Question: 
would a fragment identifier scheme [4] address any of these use cases, 
or is this completely orthogonal to the use cases you envision?  I ask 
because you envision a chapter within a frame but I'm not sure what 
the frame data structure is.



it
would seem natural to me to do something like this:

interface Blob {
  ...
  DOMString getURN(in DOMString mediaType,
   [Optional] in DOMString contentDisposition,
   [Optional] in DOMString name);
};

Given that a File that one gets from the user will still tell you its
name and detected mediaType, and can have a constant urn, there seems
to be no conflict in leaving the File interface as-is and adding
something like getURN to Blob.  On the off chance that you want to
override the detected mediaType for a file, force a contentDisposition
of attachment, or change the name, you might still use getURN there as
well.
  


To be clear: you want the File object's URN capabilities to inherit from 
Blob, and not be separate, correct?  Thus, each Blob has an affiliated 
URN, and when a Blob is a File, it uses the Blob's getURN method?


Can you explain what a contentDisposition is a bit better?  Can you 
write some pseudo-code showing how contentDisposition is used, perhaps 
to flesh out the above use cases?


-- A*

[1] http://lists.w3.org/Archives/Public/public-webapps/2009OctDec/0424.html
  


[2] 
http://lists.w3.org/Archives/Public/public-script-coord/2009OctDec/0093.html


[3] http://lists.w3.org/Archives/Public/public-webapps/2009JulSep/0609.html

[4] http://lists.w3.org/Archives/Public/public-webapps/2009JulSep/0587.html