Arun wrote:
There is a lot that is attractive about InputStream, and I think that
it can be used in other specifications, especially when discussing
Camera APIs, streaming from web apps (conferencing) etc. I also like
the idea of DataHandler. When we define a byte primitive, it can be
used in conjunction with the stream interface. For additional read
features (fseek) this is also useful. I also appreciate that you
have pointed out in a subsequent email [1] that it is possible to
"sidestep the issue of dealing with bytes directly." Managing bytes
properly, with the right primitives, is one reason why, despite
having looked at the Java I/O APIs[2], I went with something
simpler. I think that we should have streams at some point, and I'm
amenable to looking at them in a subsequent iteration of the File
API. It's worth saying here that the appeal of streams is for
*multiple use cases* for both File API and other APIs, and *not*
because the Java I/O model is one we should emulate. Programmer
taste and choice about coining APIs is subjective.
Nikunj wrote in response:
I respect your point on taste, however, I am more interested in
composability than the maturity of Java I/O.
Firstly, what Jonas proposed as the Alternative File API [1] uses an
event model to address use cases such as progress feedback and
separating reading from file objects. I expressed reservations about
complexity, but saw more posts in favor of it than against it. This
model has advantages that come with an event model (separate
notifications like onprogress, onerror, allowing specific 'isolated'
code, etc) along with a signature similarity to XHR (which developers
are familiar with). My caveats about the model were mainly about
understanding trade-offs. I'm reconciled to having a v1 of the File API
specification based on Jonas' proposal (hopefully in good shape by the
upcoming TPAC), and I believe we can iterate from there.
It would be useful to see how you meet the following requirements:
1. incremental reading of a file's data
The proposal [1] reuses the FileData interface, which will still support
a slice(offset, length) method that returns another FileData object
within stipulated byte ranges. I hope to flesh out what happens under
range mathematics errors a bit more clearly (e.g. whether an exception
is raised). Along with progress events, I think this use case is addressed.
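As a rough sketch of the range mathematics behind incremental reads with slice(offset, length): the helper names below (clampRange, chunkRanges) are hypothetical and appear in no proposal. Clamping out-of-range values is only one possible policy, since whether an exception should be raised instead is still open.

```typescript
// Hypothetical helpers; not part of the File API proposal.
interface ByteRange {
  offset: number;
  length: number;
}

// Assumed policy: clamp an (offset, length) pair to the file's size
// instead of raising an exception.
function clampRange(size: number, offset: number, length: number): ByteRange {
  const start = Math.min(Math.max(0, offset), size);
  const len = Math.min(Math.max(0, length), size - start);
  return { offset: start, length: len };
}

// Split a file of `size` bytes into consecutive ranges of at most
// `chunkSize` bytes, each suitable for one slice(offset, length) call.
function chunkRanges(size: number, chunkSize: number): ByteRange[] {
  if (chunkSize <= 0) {
    throw new RangeError("chunkSize must be positive");
  }
  const ranges: ByteRange[] = [];
  for (let offset = 0; offset < size; offset += chunkSize) {
    ranges.push(clampRange(size, offset, chunkSize));
  }
  return ranges;
}
```

Each range would then be passed to slice() and read in turn, with progress events reporting how far each read has got.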
2. concurrent access to file data
(Note that "FileRequest" and "FileReader" are used interchangeably in
[1]; I personally prefer FileReader as a name). Nothing precludes
multiple FileReader objects from accessing the same file, but not all
implementations need fire notifications (events) concurrently. Do you
have a specific use case in mind?
3. access to all file metadata without needing to read the file
(Note that in FileRequest, which I think should be named FileReader, the
read* methods take File objects as parameters, although the email
proposal [1] says that they take FileData objects. Jonas means File
objects).
The answer to your question depends on what you mean by *all* file
metadata.
File objects (which inherit from FileData objects) expose name and
mediaType properties, along with size (from FileData). But, suppose you
wanted ID3 information from an MP3 file. In this case (assuming ID3v1
usage), you would *have* to read the file, and look for the 128 byte
chunk beginning with TAG. This can be done in two ways:
i. Using slice() and range mathematics based on the file's size to get
to the end of the file and look at the last 128 bytes of it as a separate
FileData object (since ID3v1 puts stuff at the end). Not ideal.
ii. Using read methods and working with the file format. Again, not
dripping with syntactic sugar, but certainly feasible.
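To make the ID3v1 case concrete, here is a minimal sketch that works on bytes already read into memory. The function and type names are hypothetical, and it assumes the standard ID3v1 layout ("TAG", then 30-byte title, artist and album fields and a 4-byte year) with single-byte characters. How the bytes are obtained from a FileData object is left out.

```typescript
const ID3V1_TAG_SIZE = 128;

// Hypothetical result type for this sketch.
interface Id3v1Tag {
  title: string;
  artist: string;
  album: string;
  year: string;
}

// Range to request via slice(offset, length); null if the file is too
// small to hold an ID3v1 tag at all.
function id3v1Range(fileSize: number): { offset: number; length: number } | null {
  if (fileSize < ID3V1_TAG_SIZE) return null;
  return { offset: fileSize - ID3V1_TAG_SIZE, length: ID3V1_TAG_SIZE };
}

// Read a fixed-width field, stopping at the first NUL padding byte.
function readField(bytes: Uint8Array, start: number, length: number): string {
  let text = "";
  for (let i = start; i < start + length; i++) {
    const byte = bytes[i] ?? 0;
    if (byte === 0) break;
    text += String.fromCharCode(byte);
  }
  return text.trim();
}

// Parse the trailing 128 bytes of a file; null if no "TAG" marker is found.
function parseId3v1(tail: Uint8Array): Id3v1Tag | null {
  if (tail.length !== ID3V1_TAG_SIZE) return null;
  if (readField(tail, 0, 3) !== "TAG") return null;
  return {
    title: readField(tail, 3, 30),
    artist: readField(tail, 33, 30),
    album: readField(tail, 63, 30),
    year: readField(tail, 93, 4),
  };
}
```

Either route above ends up doing this work in script; the File API itself would only supply the bytes.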
I agree that metadata extraction could be made better, but I think that
I'm happy with what the existing proposal has. I also don't see how any
other proposal improves on this, even if you read into a stream buffer.
I am happy with the existing metadata extraction for a v1, and believe
that as we work out more audio and video issues on the platform, we can
get to specific metadata issues. Can you clear up what you mean by "all
metadata?"
4. separation of error handling from file reading
In Jonas' proposal, this isn't done cleanly (for some definition of
"clean" as separate from the reader object), but I think what *is* done
is good for the majority of use cases. In Jonas' proposal, the
FileReader object (named "FileRequest" in the email [1]) allows separate
onerror handling (along with onprogress being separate, etc.). It's not
done *within* a read method (unlike the existing proposal, which does
this less well than Jonas' proposal), and the callback that handles the
event can deal with the response.
This is as separate as is done with XHR.
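A toy illustration of that separation, under loud assumptions: SketchReader below is a hypothetical stand-in, not the FileReader/FileRequest interface from [1]. It reads from an in-memory byte array and fires its handlers synchronously, whereas a real reader would be asynchronous.

```typescript
type ProgressHandler = (loaded: number, total: number) => void;

// Hypothetical reader: each kind of notification has its own handler,
// so error handling lives outside the read call itself.
class SketchReader {
  onprogress: ProgressHandler | null = null;
  onload: ((result: Uint8Array) => void) | null = null;
  onerror: ((error: Error) => void) | null = null;

  // `source` stands in for a file; null models an unreadable file.
  read(source: Uint8Array | null, chunkSize: number = 4): void {
    if (source === null) {
      this.onerror?.(new Error("source is not readable"));
      return;
    }
    let loaded = 0;
    while (loaded < source.length) {
      loaded = Math.min(loaded + chunkSize, source.length);
      this.onprogress?.(loaded, source.length);
    }
    this.onload?.(source);
  }
}
```

The caller attaches onprogress, onload and onerror once, then calls read(); nothing about failure is passed into or returned from the read method.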
All things being equal, I would prefer a model that, in order of
priority:
1. involves fewer steps, and
Me too! But *neither* your model nor Jonas' model involves fewer
steps than the original proposal :) Jonas' model adds necessary
complexity for the major use case (onprogress) and for an event model.
2. evolves nicely with file write and binary access, which are both
likely to be next evolution directions in this area.
Agreed, but again, much of what you mean by "evolves nicely" is a
question of programmer taste. For instance, I think that readAsBinary
can be introduced on the FileReader object, in addition to
readAsBinaryString. Furthermore, I maintain that your streams proposal
can evolve later, and doesn't prevent us from proceeding with the
alternative File API proposal as what is in the draft [1].
Can you provide a comparison of your proposed approach with my
proposal for the above so that the WG can develop an informed opinion
about the proposals?
I *think* I've done this in answering the questions above.
For a first version (which should replace
http://www.w3.org/TR/file-upload/ , with a more meaningful name like
"File API"), I think we should address use cases around reads. Ian
Fette has given us plenty of other use cases for consideration later
on[3]. While my editor's draft strove to address the use cases for
file access with different asynchronous data accessors, it was clear
that it couldn't gracefully account for progress events. Moreover,
general feedback favored a model that used events with a separate
reader object that allowed for progress events, and Jonas'
alternative proposal does this and also resembles XHR [4]. While
I'm reluctant to sacrifice simplicity, I think moving in the
direction of the "Alternative File API"[4] reconciles use cases such
as progress events with calls for a reader/event model. FWIW, I
disagree that resemblance to XHR should be seen as "unwanted baggage"
[5]. I think it's desirable to resemble an API that has such
widespread usage!
This is arguable at best, since it seems to be an opinion not shared
by everyone, especially not the editor of XMLHttpRequest [1].
There are two things here that you may be confusing! Anne (the editor
of the XHR2 draft) expressed support for a model based on events [2].
What he is against is "abusing XHR" by using the URL attribute of a File
object as part of a request [3]. I disagree with his stance on this, but
that is a bridge that we'll cross later, after we sort out details of
the FileData URL.
In fact, there is no similarity to XHR in the current editor's draft,
and I wonder why those benefits were considered unimportant when
drafting previously.
Note: the "benefits" I considered important centered on simplicity. But
others have argued in favor of a more robust model that gives us
progress events as more than simply another callback on the existing
proposal [4]. I expressed my support for simplicity [4] but also my
willingness to draft a spec. based on the alternative API. So far, only
you are arguing *against* it, but I don't believe that the alternative
approach blocks consideration of a stream-based approach later on.
While the web is inconsistent, event models are widely used, and
similarity between XHR and File API, which will be used in
conjunction anyway, is probably a good thing.
Can you explain, in light of the objections I raised in [2], why the
"Alternative File API" is the right approach? I haven't seen any
replies to my points.
I'm happy to provide more details on anything I've answered here.
-- A*
[1] http://lists.w3.org/Archives/Public/public-webapps/2009JulSep/0565.html
[2] http://lists.w3.org/Archives/Public/public-webapps/2009JulSep/0485.html
[3] http://lists.w3.org/Archives/Public/public-webapps/2009JulSep/0571.html
[4] http://lists.w3.org/Archives/Public/public-webapps/2009JulSep/0576.html