Re: File API - Progress - Question about Partial Blob data

2013-08-21 Thread Arun Ranganathan
On Aug 20, 2013, at 7:13 PM, Aymeric Vitte wrote:

 The spec says:
 
  It can also return partial Blob data. Partial Blob data is the part of the 
 File or Blob that has been read into memory currently; when processing the 
 read method readAsText, partial Blob data is a DOMString that is incremented 
 as more bytes are loaded (a portion of the total) [ProgressEvents], and 
 when processing readAsArrayBuffer partial Blob data is an ArrayBuffer 
 [TypedArrays] object consisting of the bytes loaded so far (a portion of the 
 total)[ProgressEvents]. The list below is normative for the result attribute 
 and is the conformance criteria for this attribute
 
 What is the rationale for that? For progress events, the result attribute
 should rather contain the latest data read, not all the data read from the
 beginning: the cumulative data is easy to reconstitute from deltas, while
 the contrary requires more work.
 
 Use case: calculate the hash of a file while you are reading it.


The ask for deltas to be sent with progress notifications came up on this 
listserv before -- see the thread starting at 
http://lists.w3.org/Archives/Public/public-webapps/2013JanMar/0069.html

I agree that the deltas model is also useful, but you can see there's some 
implementation history with XHR here as well.  The use case is to receive the 
file bits as they are constituted from the read (just as with an HTTP request, 
where you get the bits so far till the rest are constituted).
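
To make the cumulative model concrete, here is a minimal sketch of how a caller could derive deltas from "bits so far" results. It assumes an implementation that actually exposes a growing ArrayBuffer as the partial result (later messages in this thread report that browsers leave result null during progress); createDeltaExtractor and nextDelta are hypothetical names, not part of any spec.

```javascript
// Hypothetical helper: turn cumulative partial results into deltas.
// Assumes each partial result is an ArrayBuffer holding ALL bytes loaded so far.
function createDeltaExtractor() {
  let consumed = 0; // number of bytes already handed to the caller
  return function nextDelta(partialResult) {
    // Copy only the bytes that arrived since the previous call.
    const delta = new Uint8Array(partialResult.slice(consumed));
    consumed = partialResult.byteLength;
    return delta;
  };
}
```

In a progress handler this would be called once per event with the reader's current result, and each returned delta passed on to whatever consumes the data incrementally.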

A good way to solve the use case of meaningful deltas might be with the Streams 
API, still TBD.


 PS: I did not test it in all browsers, but unless I am using it wrongly, the 
 result attribute is always null for progress events.


But not null in onload? In many cases, a progress event might not fire, 
depending on file size.

-- A*

Re: File API - Progress - Question about Partial Blob data

2013-08-21 Thread Jonas Sicking
On Tue, Aug 20, 2013 at 4:13 PM, Aymeric Vitte vitteayme...@gmail.com wrote:
 The spec says:

  It can also return partial Blob data. Partial Blob data is the part of the
 File or Blob that has been read into memory currently; when processing the
 read method readAsText, partial Blob data is a DOMString that is incremented
 as more bytes are loaded (a portion of the total) [ProgressEvents], and when
 processing readAsArrayBuffer partial Blob data is an ArrayBuffer
 [TypedArrays] object consisting of the bytes loaded so far (a portion of the
 total)[ProgressEvents]. The list below is normative for the result attribute
 and is the conformance criteria for this attribute

 What is the rationale for that? For progress events, the result attribute
 should rather contain the latest data read, not all the data read from the
 beginning: the cumulative data is easy to reconstitute from deltas, while
 the contrary requires more work.

 Use case: calculate the hash of a file while you are reading it.

 Regards

 Aymeric

 PS: I did not test it in all browsers, but unless I am using it wrongly, the
 result attribute is always null for progress events.

I agree that we need a way to read from a File/Blob such that you get
incremental data, i.e. that you only get the data read since the
last data delivery, rather than getting an ever-increasing result
representing all data from the beginning of the File/Blob.

However I don't think FileReader is going to be that API. FileReader
was specifically designed after XMLHttpRequest, which I think in
hindsight was a bad idea. However it was the request that we got from
several authors.

I put an initial sketch of what I think the future of File/Blob
reading should look like here:
http://lists.w3.org/Archives/Public/public-webapps/2013AprJun/0727.html

That proposal supports what you are asking for, though of course there
is plenty of debate needed on various aspects of that proposal.

All that said, I thought that we had tried to avoid the mistake that
XHR did of exposing ever-increasing partial results as data was being
loaded. Exposing partial data can require quadratic memory allocations
since the implementation will have to keep reallocating and copying
data.
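
A rough way to see the quadratic cost is to count bytes copied. The sketch below is a simplified model, not a description of any real implementation: it assumes the reader builds a fresh buffer holding everything read so far after each chunk, and compares that with handing out only the new chunk. bytesCopied is a hypothetical name.

```javascript
// Simplified model (an assumption, not any browser's actual behaviour):
// count bytes copied when the whole partial result is re-materialised after
// every chunk, versus exposing only the newly read chunk.
function bytesCopied(totalBytes, chunkSize) {
  let cumulative = 0; // cost of exposing an ever-growing result
  let delta = 0;      // cost of exposing only the new bytes
  for (let loaded = 0; loaded < totalBytes; ) {
    const n = Math.min(chunkSize, totalBytes - loaded);
    loaded += n;
    cumulative += loaded; // copy everything read so far into a fresh buffer
    delta += n;           // copy just the new chunk
  }
  return { cumulative, delta };
}
```

Under this model, a 1 MiB read in 64 KiB chunks copies 8.5 MiB in the cumulative case against 1 MiB for deltas, and the gap grows with the square of the number of chunks.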

I thought that we had decided not to expose partial results during the
loading. And instead only expose a result at the end of the load. But
I see that the spec now calls for exposing partial results in a few
cases. Did that change?

What do implementations do? Looking at Gecko's implementation I
don't think we ever expose partial results.

/ Jonas



Re: File API - Progress - Question about Partial Blob data

2013-08-21 Thread Aymeric Vitte


On 21/08/2013 19:03, Jonas Sicking wrote:

On Tue, Aug 20, 2013 at 4:13 PM, Aymeric Vitte vitteayme...@gmail.com wrote:

The spec says:

 It can also return partial Blob data. Partial Blob data is the part of the
File or Blob that has been read into memory currently; when processing the
read method readAsText, partial Blob data is a DOMString that is incremented
as more bytes are loaded (a portion of the total) [ProgressEvents], and when
processing readAsArrayBuffer partial Blob data is an ArrayBuffer
[TypedArrays] object consisting of the bytes loaded so far (a portion of the
total)[ProgressEvents]. The list below is normative for the result attribute
and is the conformance criteria for this attribute

What is the rationale for that? For progress events, the result attribute
should rather contain the latest data read, not all the data read from the
beginning: the cumulative data is easy to reconstitute from deltas, while
the contrary requires more work.

Use case: calculate the hash of a file while you are reading it.

Regards

Aymeric

PS: I did not test it in all browsers, but unless I am using it wrongly, the
result attribute is always null for progress events.

I agree that we need a way to read from a File/Blob such that you get
incremental data, i.e. that you only get the data read since the
last data delivery, rather than getting an ever-increasing result
representing all data from the beginning of the File/Blob.

However I don't think FileReader is going to be that API. FileReader
was specifically designed after XMLHttpRequest, which I think in
hindsight was a bad idea. However it was the request that we got from
several authors.

I put an initial sketch of what I think the future of File/Blob
reading should look like here:
http://lists.w3.org/Archives/Public/public-webapps/2013AprJun/0727.html


I have commented on that thread regarding UTF-8 streaming; the use case 
is not the same, but it is somewhat similar to the hash example.



That proposal supports what you are asking for, though of course there
is plenty of debate needed on various aspects of that proposal.

All that said, I thought that we had tried to avoid the mistake that
XHR did of exposing ever-increasing partial results as data was being
loaded. Exposing partial data can require quadratic memory allocations
since the implementation will have to keep reallocating and copying
data.

I thought that we had decided not to expose partial results during the
loading. And instead only expose a result at the end of the load. But
I see that the spec now calls for exposing partial results in a few
cases. Did that change?

What do implementations do? Looking at Gecko's implementation I
don't think we ever expose partial results.


FF/Nightly does not expose any partial results, but progress events are 
fired. Other browsers probably do the same, so progress events can only 
be used to display a progress bar, i.e. nobody cares about the partial 
incremented data defined in the spec.
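
For reference, the progress-bar use amounts to something like the sketch below. progressPercent is a hypothetical helper; the only inputs it relies on are the lengthComputable, loaded and total attributes of a progress event, and the browser wiring in the trailing comment assumes a `file` and a `bar` element that are not defined here.

```javascript
// Convert a ProgressEvent-like object ({ lengthComputable, loaded, total })
// into a whole percentage, or null when the total size is unknown.
function progressPercent(event) {
  if (!event.lengthComputable || event.total === 0) return null;
  return Math.round((event.loaded / event.total) * 100);
}

// Browser-only wiring (assumed names: `file` is a File, `bar` a <progress max="100">):
// const reader = new FileReader();
// reader.onprogress = (e) => {
//   const p = progressPercent(e);
//   if (p !== null) bar.value = p;
// };
// reader.readAsArrayBuffer(file);
```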


Whether it's the File API or the Streams API, I don't see much 
difference. I don't know the implementation details, but it does not 
seem extraordinary to me to expose, for progress events, partial 
non-incremented results that are not correlated with the final result.


The combination of the File API, indexedDB, etc. (and future WebCrypto) is 
really great; I did not expect to be stuck on something that looks 
trivial: calculate the hash of a file while you are reading it.


So, since there is apparently no use for the incremented data, maybe the 
spec should be changed to expose delta data instead for progress events.




/ Jonas


--
jCore
Email :  avi...@jcore.fr
iAnonym : http://www.ianonym.com
node-Tor : https://www.github.com/Ayms/node-Tor
GitHub : https://www.github.com/Ayms
Web :www.jcore.fr
Extract Widget Mobile : www.extractwidget.com
BlimpMe! : www.blimpme.com




Re: File API - Progress - Question about Partial Blob data

2013-08-21 Thread Aymeric Vitte


On 21/08/2013 16:16, Arun Ranganathan wrote:

On Aug 20, 2013, at 7:13 PM, Aymeric Vitte wrote:


The spec says:

 It can also return partial Blob data. Partial Blob data is the part of the
File or Blob that has been read into memory currently; when processing the
read method readAsText, partial Blob data is a DOMString that is incremented
as more bytes are loaded (a portion of the total) [ProgressEvents], and when
processing readAsArrayBuffer partial Blob data is an ArrayBuffer
[TypedArrays] object consisting of the bytes loaded so far (a portion of the
total)[ProgressEvents]. The list below is normative for the result attribute
and is the conformance criteria for this attribute


What is the rationale for that? For progress events, the result attribute 
should rather contain the latest data read, not all the data read from the 
beginning: the cumulative data is easy to reconstitute from deltas, while 
the contrary requires more work.


Use case: calculate the hash of a file while you are reading it.



The ask for deltas to be sent with progress notifications came up on 
this listserv before -- see the thread starting at 
http://lists.w3.org/Archives/Public/public-webapps/2013JanMar/0069.html


I agree that the deltas model is also useful, but you can see there's 
some implementation history with XHR here as well.  The use case is to 
receive the file bits as they are constituted from the read (just as 
with an HTTP request, where you get the bits so far till the rest 
are constituted).


A good way to solve the use case of meaningful deltas might be with 
the Streams API, still TBD.



PS: I did not test it in all browsers, but unless I am using it 
wrongly, the result attribute is always null for progress events.



But not null in onload? In many cases, a progress event might not 
fire, depending on file size.


No, it is null only for progress events, which do fire for files of several MB.

I don't know about the XHR history but what is the use case of 
incremented data (which is not implemented currently)?




-- A*





Re: File API - Progress - Question about Partial Blob data

2013-08-21 Thread Jonas Sicking
On Wed, Aug 21, 2013 at 3:56 PM, Aymeric Vitte vitteayme...@gmail.com wrote:
 The combination of the File API, indexedDB, etc. (and future WebCrypto) is
 really great; I did not expect to be stuck on something that looks trivial:
 calculate the hash of a file while you are reading it.

Until we add an API for incremental data, what you can do is to use
Blob.slice() to read the blob in small parts. Not great, but at least
it'll get you unstuck.
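
A minimal sketch of that workaround, under a few stated assumptions. FNV-1a (32-bit) stands in for a real hash only because it is easy to update chunk by chunk; fnv1aUpdate and hashBlobInChunks are hypothetical names. The sketch uses Blob.prototype.arrayBuffer(), which postdates this thread; with the File API as it stands today you would instead call FileReader.readAsArrayBuffer() on each slice and continue from its onload handler.

```javascript
// FNV-1a (32-bit): a stand-in for a real hash, chosen because its state can be
// carried from one chunk to the next.
const FNV_OFFSET_BASIS = 0x811c9dc5;
const FNV_PRIME = 0x01000193;

function fnv1aUpdate(state, bytes) {
  let h = state;
  for (const b of bytes) {
    h ^= b;
    h = Math.imul(h, FNV_PRIME) >>> 0; // keep the state an unsigned 32-bit value
  }
  return h;
}

// Read `blob` in slices of `chunkSize` bytes and fold each slice into the hash,
// so that only one chunk is held in memory at a time.
async function hashBlobInChunks(blob, chunkSize = 64 * 1024) {
  let state = FNV_OFFSET_BASIS;
  for (let offset = 0; offset < blob.size; offset += chunkSize) {
    const slice = blob.slice(offset, offset + chunkSize);
    // Assumption: arrayBuffer() is available (it is a later addition to Blob).
    const bytes = new Uint8Array(await slice.arrayBuffer());
    state = fnv1aUpdate(state, bytes);
  }
  return state;
}
```

The chunk size is a trade-off: smaller slices keep memory use low but mean more read operations per file.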

/ Jonas