Re: Overlap between StreamReader and FileReader
The idea did not come from mimicking WebRTC:

- pause/unpause: insert a pause marker in the stream and stop processing data when it is reached (but don't close the operation, see below); buffer the data that keeps coming in, and restart from the pause point on unpause. Use case: flow control. When the flow-control window becomes empty, wait for a signal from the receiver to reinitialize the window and restart.

- stop/resume: different from close. stop inserts a specific eof-stop marker in the stream; the API closes the operation when it receives it and buffers subsequent data, and resume restarts the operation in the state it was in before receiving eof-stop. It's trickier; the use case is the one I gave before: a progressive hash, i.e. close a hash and resume it from the state it was in before closing it. The feature has been requested several times for node, for example.

Whether it's implementable, I don't know, but I don't see why it could not be; the use cases are real (mine, but I am not the only one).

Regards,

Aymeric

On 30/10/2013 12:49, Takeshi Yoshino wrote:
...snip...
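For concreteness, a minimal sketch of the pause/unpause idea, assuming a hypothetical Stream exposing pause()/unpause() and an ondata handler, and an abstract receiver object carrying the window-update signal (all names illustrative, not from any proposal in this thread):

  // Hypothetical API sketch: window-based flow control with pause/unpause.
  function windowedConsume(stream, receiver, process, windowSize) {
    var pending = 0;
    stream.ondata = function(chunk) {
      pending += chunk.byteLength;
      process(chunk);
      if (pending >= windowSize) {
        stream.pause();                 // window exhausted: buffer upstream data
        receiver.requestWindowUpdate(); // ask the peer to reopen the window
      }
    };
    receiver.onwindowupdate = function() {
      pending = 0;
      stream.unpause();                 // restart from the pause point
    };
  }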
Re: Overlap between StreamReader and FileReader
On Wed, Oct 23, 2013 at 11:42 PM, Aymeric Vitte vitteayme...@gmail.com wrote:

> Your filter idea seems to be equivalent to the createStream that I suggested some time ago (like node), what about:
>
>   var encryptionPromise = crypto.subtle.encrypt(aesAlgorithmEncrypt, aesKey, sourceStream).createStream();
>
> So you don't need to modify the APIs where you cannot specify the responseType.
> I was thinking of adding stop/resume and pause/unpause:
> - stop: insert eof in the stream

close() does this.

> Example: finalize the hash when eof is received
> - resume: restart from where the stream stopped
> Example: restart the hash from the state the operation was in before receiving eof (related to Issue22 in WebCrypto, which was closed without any solution; might imply cloning the state of the operation)

Should it really be a part of the Streams API? How about just making the filter (not the Stream itself) returned by WebCrypto reusable, and adding some method to recycle it?

> - pause: pause the stream, do not send eof

Sorry, what will be paused? Output?

> - unpause: restart the stream
> And flow control should be back and explicit. I am not sure right now how to define it, but I think it's impossible for a js app to do precise flow control, and for existing APIs like WebSockets it's not easy to control the flow and avoid overloading the UA in some situations.

...snip...
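For the "reusable filter" alternative floated above, a rough sketch; createDigestFilter(), state() and the second constructor argument are purely illustrative, and nothing like this exists in WebCrypto:

  // Illustrative only: a recyclable digest filter that can be snapshotted
  // and resumed, instead of putting stop/resume markers in the stream itself.
  var hashFilter = crypto.subtle.createDigestFilter({ name: "SHA-256" });
  var hashPromise = hashFilter.digest();   // start the operation
  sourceStream.pipe(hashFilter);
  // Snapshot the internal state before finalizing...
  var saved = hashFilter.state();
  hashFilter.close();                      // hashPromise resolves with the digest
  // ...and later resume a new filter from that state with more data.
  var resumed = crypto.subtle.createDigestFilter({ name: "SHA-256" }, saved);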
Re: Overlap between StreamReader and FileReader
On Wed, Oct 30, 2013 at 8:14 PM, Takeshi Yoshino tyosh...@google.com wrote:

> On Wed, Oct 23, 2013 at 11:42 PM, Aymeric Vitte vitteayme...@gmail.com wrote:
>> - pause: pause the stream, do not send eof
> Sorry, what will be paused? Output?

http://lists.w3.org/Archives/Public/public-webrtc/2013Oct/0059.html
http://www.w3.org/2011/04/webrtc/wiki/Transport_Control#Pause.2Fresume

So, you're suggesting that we make Stream a convenient point where we can dam up the data flow, and skip adding pause methods for data production and consumption to the producer/consumer APIs? I.e., we make it possible to prevent data queued in a Stream from being read. This typically means asynchronously suspending an ongoing pipe() or read() call on the Stream made with no argument or a very large argument.

> - unpause: restart the stream
> And flow control should be back and explicit. I am not sure right now how to define it, but I think it's impossible for a js app to do precise flow control, and for existing APIs like WebSockets it's not easy to control the flow and avoid overloading the UA in some situations.
Re: Overlap between StreamReader and FileReader
Your filter idea seems to be equivalent to the createStream that I suggested some time ago (like node), what about:

  var encryptionPromise = crypto.subtle.encrypt(aesAlgorithmEncrypt, aesKey, sourceStream).createStream();

So you don't need to modify the APIs where you cannot specify the responseType.

I was thinking of adding stop/resume and pause/unpause:

- stop: insert eof in the stream
  Example: finalize the hash when eof is received
- resume: restart from where the stream stopped
  Example: restart the hash from the state the operation was in before receiving eof (related to Issue22 in WebCrypto, which was closed without any solution; might imply cloning the state of the operation)
- pause: pause the stream, do not send eof
- unpause: restart the stream

And flow control should be back and explicit. I am not sure right now how to define it, but I think it's impossible for a js app to do precise flow control, and for existing APIs like WebSockets it's not easy to control the flow and avoid overloading the UA in some situations.

Regards,

Aymeric

On 21/10/2013 13:14, Takeshi Yoshino wrote:
...snip...
Re: Overlap between StreamReader and FileReader
Sorry for the blank of ~2 weeks.

On Fri, Oct 4, 2013 at 5:57 PM, Aymeric Vitte vitteayme...@gmail.com wrote:

> I am still not very familiar with promises, but if I take your preceding example:
>
>   var sourceStream = xhr.response;
>   var resultStream = new Stream();
>   var fileWritingPromise = fileWriter.write(resultStream);
>   var encryptionPromise = crypto.subtle.encrypt(aesAlgorithmEncrypt, aesKey, sourceStream, resultStream);
>   Promise.all(fileWritingPromise, encryptionPromise).then( ... );

I made a mistake. The argument of Promise.all should be an Array. So, [fileWritingPromise, encryptionPromise].

> shouldn't it be more something like:
>
>   var sourceStream = xhr.response;
>   var encryptionPromise = crypto.subtle.encrypt(aesAlgorithmEncrypt, aesKey);
>   var resultStream = sourceStream.pipe(encryptionPromise);
>   var fileWritingPromise = fileWriter.write(resultStream);
>   Promise.all(fileWritingPromise, encryptionPromise).then( ... );

Promises just tell the user about completion of each operation, with some value indicating the result of the operation. They are not destinations for data. Do you think it's good to create objects representing each encrypt operation? So, an object called a filter is introduced, and the code would be like:

  var pipeToFilterPromise;
  var encryptionFilter;
  var fileWriter;

  xhr.onreadystatechange = function() {
    ...
    } else if (this.readyState == this.LOADING) {
      if (this.status != 200) {
        ...
      }

      var sourceStream = xhr.response;

      encryptionFilter = crypto.subtle.createEncryptionFilter(aesAlgorithmEncrypt, aesKey);
      // Starts the filter.
      var encryptionPromise = encryptionFilter.encrypt();
      // Also starts pouring data, but separately from promise creation.
      pipeToFilterPromise = sourceStream.pipe(encryptionFilter);

      fileWriter = ...;
      // encryptionFilter works as a data producer for FileWriter.
      var fileWritingPromise = fileWriter.write(encryptionFilter);

      // Set only handlers for rejection now.
      pipeToFilterPromise.catch(function(result) {
        xhr.abort();
        encryptionFilter.abort();
        fileWriter.abort();
      });
      encryptionPromise.catch(function(result) {
        xhr.abort();
        fileWriter.abort();
      });
      fileWritingPromise.catch(function(result) {
        xhr.abort();
        encryptionFilter.abort();
      });

      // encryptionFilter will be (successfully) closed only
      // when XMLHttpRequest and pipe() are both successful.
      // So, it's ok to set the handler for fulfillment now.
      Promise.all([encryptionPromise, fileWritingPromise]).then(function(result) {
        // Done everything successfully!
        // We come here only when encryptionFilter is close()-ed.
        fileWriter.close();
        processFile();
      });
    } else if (this.readyState == this.DONE) {
      if (this.status != 200) {
        encryptionFilter.abort();
        fileWriter.abort();
      } else {
        // Now we know that XHR was successful.
        // Let's close() the filter to finish encryption successfully.
        pipeToFilterPromise.then(function(result) {
          // XMLHttpRequest closes sourceStream, but pipe()
          // resolves pipeToFilterPromise without closing
          // encryptionFilter.
          encryptionFilter.close();
        });
      }
    }
  };
  xhr.send();

encryptionFilter has the same interface as a normal stream but encrypts piped data. Encrypted data is readable from it. It has special methods, encrypt() and abort(). processFile() is a hypothetical function that must be called only when all of loading, encryption and saving to file were successful.

> or
>
>   var sourceStream = xhr.response;
>   var encryptionPromise = crypto.subtle.encrypt(aesAlgorithmEncrypt, aesKey);
>   var hashPromise = crypto.subtle.digest(hash);
>   var resultStream = sourceStream.pipe([encryptionPromise, hashPromise]);
>   var fileWritingPromise = fileWriter.write(resultStream);
>   Promise.all(fileWritingPromise, resultStream).then( ... );

and this should be:

  var sourceStream = xhr.response;

  encryptionFilter = crypto.subtle.createEncryptionFilter(aesAlgorithmEncrypt, aesKey);
  var encryptionPromise = encryptionFilter.encrypt();

  hashFilter = crypto.subtle.createDigestFilter(hash);
  var hashPromise = hashFilter.digest();

  pipeToFiltersPromise = sourceStream.pipe([encryptionFilter, hashFilter]);

  var encryptedDataWritingPromise = fileWriter.write(encryptionFilter);
  var hashWritingPromise = Promise.all([encryptionPromise, encryptedDataWritingPromise]).then(
    function(result) { return fileWriter.write(hashFilter); },
    ...
  );

  Promise.all([hashPromise, hashWritingPromise]).then(
    function(result) {
      fileWriter.close();
      processFile();
    },
    ...
  );

Or, we can also choose to let the writer API create a special object that has the Stream interface for receiving input, and then let encryptionFilter and hashFilter pipe() to it.

  ...
  pipeToFiltersPromise =
Re: Overlap between StreamReader and FileReader
I am still not very familiar with promises, but if I take your preceding example:

  var sourceStream = xhr.response;
  var resultStream = new Stream();
  var fileWritingPromise = fileWriter.write(resultStream);
  var encryptionPromise = crypto.subtle.encrypt(aesAlgorithmEncrypt, aesKey, sourceStream, resultStream);
  Promise.all(fileWritingPromise, encryptionPromise).then( ... );

shouldn't it be more something like:

  var sourceStream = xhr.response;
  var encryptionPromise = crypto.subtle.encrypt(aesAlgorithmEncrypt, aesKey);
  var resultStream = sourceStream.pipe(encryptionPromise);
  var fileWritingPromise = fileWriter.write(resultStream);
  Promise.all(fileWritingPromise, encryptionPromise).then( ... );

or

  var sourceStream = xhr.response;
  var encryptionPromise = crypto.subtle.encrypt(aesAlgorithmEncrypt, aesKey);
  var hashPromise = crypto.subtle.digest(hash);
  var resultStream = sourceStream.pipe([encryptionPromise, hashPromise]);
  var fileWritingPromise = fileWriter.write(resultStream);
  Promise.all(fileWritingPromise, resultStream).then( ... );

Regards

Aymeric

On 03/10/2013 10:27, Takeshi Yoshino wrote:
...snip...
Re: Overlap between StreamReader and FileReader
Formatted and published my latest proposal on github after incorporating Aymeric's multi-dest idea.

http://htmlpreview.github.io/?https://github.com/tyoshino/stream/blob/master/streams.html

On Sat, Sep 28, 2013 at 11:45 AM, Kenneth Russell k...@google.com wrote:

> This looks nice. It looks like it should already handle the flow control issues mentioned earlier in the thread, simply by performing the read on demand, though reporting the result asynchronously.

Thanks, Kenneth, for reviewing.
Re: Overlap between StreamReader and FileReader
Looks good. Comments/questions:

- what's the use of readEncoding?
- StreamReadType: add MediaStream? (and others if existing)
- would it be possible to pipe from one StreamReadType to another StreamReadType?
- would it be possible to pipe from a source to different targets (my example of encrypt/hash at the same time)?
- what is the link between the API and the Stream (responseType='stream')? How do you handle this for APIs where responseType does not really apply (msgpack, crypto...)?

Regards

Aymeric

On 26/09/2013 06:17, Takeshi Yoshino wrote:
...snip...
Re: Overlap between StreamReader and FileReader
On Thu, Sep 26, 2013 at 6:36 PM, Aymeric Vitte vitteayme...@gmail.com wrote:

> Looks good. Comments/questions:
> - what's the use of readEncoding?

Overriding the charset specified in .type for read ops. It's weird, but we could instead ask an app to overwrite .type.

> - StreamReadType: add MediaStream? (and others if existing)

Maybe, if there's a clear rule for converting a binary stream + MIME type into a MediaStream object.

> - would it be possible to pipe from one StreamReadType to another StreamReadType?

pipe() tells the receiver with which value of StreamReadType the pipe() was called. Receiver APIs may be designed to accept either mode or both.

> - would it be possible to pipe from a source to different targets (my example of encrypt/hash at the same time)?

I missed it. Your mirroring method (making pipe() accept multiple Streams) looks good. The problem is what to do when one of the destinations is write-blocked. Maybe we want to read data from the source at the pace of the fastest consumer and save the read data for the slowest one. When should we fulfill the promise? On completion of the read from the source, on completion of the write to all destinations, etc.?

> - what is the link between the API and the Stream (responseType='stream')? How do you handle this for APIs where responseType does not really apply (msgpack, crypto...)?

- make APIs return a Stream for read (write), like XHR.responseType='stream'
- make APIs accept a Stream for read (write)

Either should work, as we have pipe(). E.g.

  var sourceStream = xhr.response;
  var resultStream = new Stream();
  var fileWritingPromise = fileWriter.write(resultStream);
  var encryptionPromise = crypto.subtle.encrypt(aesAlgorithmEncrypt, aesKey, sourceStream, resultStream);
  Promise.all(fileWritingPromise, encryptionPromise).then( ... );

I also found a point that needs clarification: whether pipe() does eof or not. I think we don't want automatic eof.
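A sketch of the buffering policy described above (read at the fastest consumer's pace, queue for the slower ones), written against illustrative ondata/onclose/write()/close() members rather than any proposed API:

  // Sketch: mirror one producer to N destination streams, assuming each
  // destination exposes write(chunk) returning a promise (illustrative).
  function teeTo(source, destinations) {
    var writes = destinations.map(function() { return Promise.resolve(); });
    source.ondata = function(chunk) {
      // Each destination gets its own write chain, so a slow destination
      // queues chunks without blocking the fast ones.
      writes = writes.map(function(prev, i) {
        return prev.then(function() { return destinations[i].write(chunk); });
      });
    };
    source.onclose = function() {
      // Fulfill only when every destination has drained its queue.
      Promise.all(writes).then(function() {
        destinations.forEach(function(d) { d.close(); });
      });
    };
  }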
Re: Overlap between StreamReader and FileReader
On 24/09/2013 21:24, Takeshi Yoshino wrote:

> On Wed, Sep 25, 2013 at 12:41 AM, Aymeric Vitte vitteayme...@gmail.com wrote:
>> Did you see http://lists.w3.org/Archives/Public/public-webapps/2013JulSep/0593.html ?
> Yes. This example seems to show how to connect only producer/consumer APIs which support Stream. Right?

Yes, but if something like createStream is generic then any API could support it; for further APIs or next versions it can be built in.

> In such a case, all the flow control stuff would be basically hidden, and if necessary each producer/consumer/transformer/filter/etc. may expose flow-control-related parameters in its own form and configure the connected input/output streams accordingly. E.g. stream_xhr may choose to have a large write buffer for performance, or have a small one and apply some backpressure to stream_ws1 for memory efficiency.

Yes

> My understanding is that flow control APIs like mine are intended to be used by JS code implementing some converter, consumer, etc., while built-in stuff like WebCrypto would evolve to accept Stream directly and handle flow control in, e.g., the C++ world. BTW, I'm discussing this to provide data points for deciding whether to include a flow control API or not. I'm not pushing it. I'd appreciate it if other participants expressed opinions about this.

I am not sure I get the distinction you make between your API's flow control and built-in flow control... I think the main purpose of the Stream API should be to handle streaming more efficiently without having to copy, split, concat, etc. ArrayBuffers; to abstract away the use of ArrayBuffer, ArrayBufferView, Blob and text so you don't spend your time converting things; and to connect different streams simply.
Re: Overlap between StreamReader and FileReader
On Wed, Sep 25, 2013 at 10:55 PM, Aymeric Vitte vitteayme...@gmail.com wrote:

> I am not sure I get the distinction you make between your API's flow control and built-in flow control...
> ...snip...

The JS flow control API is for JS code to manually control thresholds, buffer sizes, etc. so that JS code can consume data from, and produce data into, a Stream. Built-in flow control is the C++ (or whatever language implements the UA) interface that will be used when streams are connected with pipe(). It would probably have a similar interface to the JS flow control API.
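Concretely, borrowing the strawman's names from earlier in the thread (readableThreshold, onreadable, read(), pipeTo()); consume() is a placeholder:

  // JS-level flow control: the app decides when and how much to pull.
  stream.readableThreshold = 4096;   // don't fire onreadable for tiny chunks
  stream.onreadable = function() {
    consume(stream.read());          // synchronous read of buffered data
  };

  // Built-in flow control: the UA shuttles data between the two streams
  // internally, applying backpressure with no JS in the loop.
  sourceStream.pipeTo(destinationStream);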
Re: Overlap between StreamReader and FileReader
As we don't see any strong demand for flow control and sync read functionality, I've revised the proposal. Though we can separate state/error signaling from Stream and keep it done by each API (e.g. XHR) as Aymeric said, the EoF signal still needs to be conveyed through Stream.

  enum StreamReadType { "", "blob", "arraybuffer", "text" };

  interface StreamConsumeResult {
    readonly attribute boolean eof;
    readonly attribute any data;
    readonly attribute unsigned long long size;
  };

  [Constructor(optional DOMString mime)]
  interface Stream {
    readonly attribute DOMString type;  // MIME type

    // Rejected on error. No more write ops should be made.
    //
    // Fulfilled when the write completes. It doesn't guarantee that the
    // written data has been read out successfully.
    //
    // The contents of the ArrayBufferView must not be modified until the
    // promise is fulfilled.
    //
    // Fulfillment may be delayed when the Stream considers itself to be full.
    //
    // write(), close() must not be called again until the Promise of the
    // last write() is fulfilled.
    Promise<void> write((DOMString or ArrayBufferView or Blob)? data);

    void close();

    attribute StreamReadType readType;
    attribute DOMString readEncoding;

    // read(), skip(), pipe() must not be called again until the Promise of
    // the last read(), skip(), pipe() is fulfilled.
    // Rejected on error. No more read ops should be made.
    //
    // If size is specified,
    // - if EoF: fulfilled with data up to EoF
    // - otherwise: fulfilled with data of size bytes
    //
    // If size is omitted, (all or part of) the data available for read now
    // will be returned.
    //
    // If readType is set to "text", the size of the result may be smaller
    // than the value specified for the size argument.
    Promise<StreamConsumeResult> read(optional [Clamp] long long size);

    // Rejected on error. Fulfilled on completion.
    //
    // .data of the result is not used. .size of the result is the skipped amount.
    Promise<StreamConsumeResult> skip([Clamp] long long size);

    // Rejected on error. Fulfilled on completion.
    //
    // If size is omitted, transfer until EoF is encountered.
    //
    // .data of the result is not used. .size of the result is the size of
    // the data transferred.
    Promise<StreamConsumeResult> pipe(Stream destination, optional [Clamp] long long size);
  };
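A small consumer sketch against this interface; consume() is a placeholder, and the chaining respects the rule that only one read() may be outstanding:

  // Read the whole stream in chunks with the proposed promise-based read().
  function readAll(stream, consume) {
    stream.readType = "arraybuffer";
    function loop() {
      return stream.read().then(function(result) {
        if (result.size > 0) consume(result.data);
        if (result.eof) return;   // producer close()-ed the stream
        return loop();            // chain: only one read() in flight
      });
    }
    return loop();
  }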
Re: Overlap between StreamReader and FileReader
Did you see http://lists.w3.org/Archives/Public/public-webapps/2013JulSep/0593.html ?

An attempt to find a link between the data producer APIs and a Streams API like yours.

Regards

Aymeric

On 20/09/2013 15:16, Takeshi Yoshino wrote:
...snip...
Re: Overlap between StreamReader and FileReader
On Wed, Sep 25, 2013 at 12:41 AM, Aymeric Vitte vitteayme...@gmail.com wrote:

> Did you see http://lists.w3.org/Archives/Public/public-webapps/2013JulSep/0593.html ?

Yes. This example seems to show how to connect only producer/consumer APIs which support Stream. Right?

In such a case, all the flow control stuff would be basically hidden, and if necessary each producer/consumer/transformer/filter/etc. may expose flow-control-related parameters in its own form and configure the connected input/output streams accordingly. E.g. stream_xhr may choose to have a large write buffer for performance, or have a small one and apply some backpressure to stream_ws1 for memory efficiency.

My understanding is that flow control APIs like mine are intended to be used by JS code implementing some converter, consumer, etc., while built-in stuff like WebCrypto would evolve to accept Stream directly and handle flow control in, e.g., the C++ world.

BTW, I'm discussing this to provide data points for deciding whether to include a flow control API or not. I'm not pushing it. I'd appreciate it if other participants expressed opinions about this.
Re: Overlap between StreamReader and FileReader
On Sat, Sep 14, 2013 at 12:03 AM, Aymeric Vitte vitteayme...@gmail.com wrote:

> I take this example to understand whether this could be better with built-in Stream flow control. If so, after you have defined the right parameters (if possible) for the streams' flow control, you could process delta data while reading the file and restream it directly via WebSockets, and this would be great, but again I am not sure that a universal solution can be found.

I think what we can do is just provide helpers to make it easier to build such intelligent, app-specific flow control logic. Maybe one of the points of your example is that we're not always able to calculate a good readableThreshold. I'm also not so sure how many apps in the world can benefit from this kind of API. For consumers that can do flow control well on a receive-window basis, my API should work well (unnecessary events are not dispatched, chunks are consolidated, ArrayBuffer creation is lazier).

WebSocket has the (broken) bufferedAmount attribute for window-based flow control. Are you using it as a hint?
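For reference, bufferedAmount is the standard WebSocket attribute counting bytes queued by send() but not yet transmitted; since no event fires when the buffer drains, window-style throttling has to poll, roughly:

  // Throttle sends so the UA's WebSocket buffer stays below a high-water mark.
  var HIGH_WATER = 1024 * 1024;   // 1 MiB, an arbitrary illustrative threshold

  function sendChunks(ws, chunks, onDrained) {
    (function loop() {
      while (chunks.length && ws.bufferedAmount < HIGH_WATER) {
        ws.send(chunks.shift());
      }
      if (chunks.length) {
        setTimeout(loop, 50);     // poll: no event fires when the buffer drains
      } else if (onDrained) {
        onDrained();
      }
    })();
  }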
Re: Overlap between StreamReader and FileReader
Here for the examples: http://lists.w3.org/Archives/Public/public-webapps/2013JulSep/0453.html

Simple ones, leading to a simple Streams interface; I thought this was the spirit of the original Streams API proposal. Now you want a stream interface so you can code some js like msgpack on top of it. I am still missing a part of the puzzle, or how to use it: as you mention, the stream is coming from somewhere (File, indexedDB, WebSocket, XHR, WebRTC, etc.), so you have a limited choice of APIs to get it, and msgpack will act on top of one of those APIs, no? (then back to the examples above) How can you get the data another way?

Regards,

Aymeric

On 13/09/2013 06:36, Takeshi Yoshino wrote:
...snip...
Re: Overlap between StreamReader and FileReader
On Fri, Sep 13, 2013 at 6:08 PM, Aymeric Vitte vitteayme...@gmail.com wrote:

> Now you want a stream interface so you can code some js like msgpack on top of it. I am still missing a part of the puzzle, or how to use it: as you mention, the stream is coming from somewhere (File, indexedDB, WebSocket, XHR, WebRTC, etc.), so you have a limited choice of APIs to get it, and msgpack will act on top of one of those APIs, no? (then back to the examples above) How can you get the data another way?

Do you mean that those data producer APIs should be changed to provide read-by-delta-data, and that manipulation of data by js code should happen there instead of at the output of a Stream?
Re: Overlap between StreamReader and FileReader
On 13/09/2013 14:23, Takeshi Yoshino wrote:

> Do you mean that those data producer APIs should be changed to provide read-by-delta-data, and that manipulation of data by js code should happen there instead of at the output of a Stream?

Yes, exactly, except if you/someone see another way of getting the data inside the browser and turning the flow into a stream without using these APIs.

Regards,

Aymeric
Re: Overlap between StreamReader and FileReader
Since I joined the discussion recently, I don't know the original idea behind the Stream+XHR integration approach (response returns a Stream object) as in the current Streams API spec. But one advantage of it that I can come up with is that we can keep the changes to those producer APIs small. If we decide to add methods, for example for flow control (though that is still in question), such stuff goes on Stream, not on XHR, etc.
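For illustration, the XHR side of that integration stays minimal (a sketch; responseType 'stream' per the Streams API draft mentioned above, pipe() per the proposals in this thread, someConsumerStream is a placeholder):

  // XHR exposing its response as a Stream, per the draft's integration model.
  var xhr = new XMLHttpRequest();
  xhr.open("GET", "/video.webm");
  xhr.responseType = "stream";          // the only change on the XHR side
  xhr.onreadystatechange = function() {
    if (this.readyState == this.LOADING) {
      var stream = this.response;       // a Stream; flow control lives here,
      stream.pipe(someConsumerStream);  // not on XHR
    }
  };
  xhr.send();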
Re: Overlap between StreamReader and FileReader
On Fri, Sep 13, 2013 at 9:50 PM, Aymeric Vitte vitteayme...@gmail.com wrote:

> On 13/09/2013 14:23, Takeshi Yoshino wrote:
>> Do you mean that those data producer APIs should be changed to provide read-by-delta-data, and that manipulation of data by js code should happen there instead of at the output of a Stream?
> Yes, exactly, except if you/someone see another way of getting the data inside the browser and turning the flow into a stream without using these APIs.

I agree that there are various states and things to handle for each of the producer APIs, and it might be judicious not to convey such API-specific info/signals through Stream. I don't think it's bad to convert xhr.DONE to stream.close() manually, as in your example http://lists.w3.org/Archives/Public/public-webapps/2013JulSep/0453.html.

But regarding flow control, as I said in the other mail just posted, if we start thinking about flow control more seriously, maybe the right approach is to have a unified flow control method, and the point at which to define such fine-grained flow control is Stream, not each API. If we're not, then yes, maybe your proposal (deltaResponse) should be enough.
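Bridging manually, in the spirit of the referenced example, would look roughly like this; it assumes a writable Stream with write()/close() as in the proposals in this thread, and a hypothetical deltaResponse attribute exposing only the newly received bytes:

  // Manually map XHR-specific states onto Stream: deltas in, close() on DONE.
  var stream = new Stream("video/webm");
  var xhr = new XMLHttpRequest();
  xhr.open("GET", "/video.webm");
  xhr.onreadystatechange = function() {
    if (this.readyState == this.LOADING) {
      stream.write(this.deltaResponse);  // hypothetical delta attribute
    } else if (this.readyState == this.DONE) {
      stream.close();                    // XHR completion becomes eof
    }
  };
  xhr.send();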
Re: Overlap between StreamReader and FileReader
On 13/09/2013 15:11, Takeshi Yoshino wrote:

> I agree that there are various states and things to handle for each of the producer APIs, and it might be judicious not to convey such API-specific info/signals through Stream.
> ...snip...
> But regarding flow control, as I said in the other mail just posted, if we start thinking about flow control more seriously, maybe the right approach is to have a unified flow control method, and the point at which to define such fine-grained flow control is Stream, not each API.

Maybe. I was not at the start of this thread either, so I don't know exactly what the original idea was (and I hope I am not screwing it up here). But I am not sure it's possible to define universal flow control.

Example: I am currently experiencing some flow control issues in project [1]. Basically, the sender reads a file AsArrayBuffer from indexedDB, where it's stored as a Blob. Since we cannot get delta data while reading the File for now, the sender waits for the whole ArrayBuffer, then slices it, processes the blocks and sends them via WebSockets. If you implement a basic loop, of course you overload the sender's UA and connection. So the system does some calculation in order to allow only half of the bandwidth to be used, and aggregates the blocks until the size of the aggregation meets the bandwidth requirement for the aggregated blocks to be sent every 50 ms. Then it uses a poor setTimeout to flush the data, which screws up all the preceding calculations since setTimeout fires whenever it likes. Maybe there are smarter ways to do this; I was thinking of using workers so you can get a more precise clock via postMessage, but I did not try. In addition to the bandwidth control there is a window for flow control.

I take this example to understand whether this could be better with built-in Stream flow control. If so, after you have defined the right parameters (if possible) for the streams' flow control, you could process delta data while reading the file and restream it directly via WebSockets, and this would be great, but again I am not sure that a universal solution can be found.

> If we're not, then yes, maybe your proposal (deltaResponse) should be enough.

What is sure is that delta data should be made available instead of incremental ones.

[1] http://www.peersm.com
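A condensed sketch of the pacing loop described above; estimateBandwidth() is app-specific and hypothetical, the 50 ms interval is from the description, and the setTimeout jitter is exactly the weakness being complained about:

  // Pace sends to roughly half the estimated bandwidth, flushing every 50 ms.
  var INTERVAL_MS = 50;
  var budgetPerTick = (estimateBandwidth() / 2) * (INTERVAL_MS / 1000);

  function sendPaced(ws, blocks) {
    (function tick() {
      var size = 0;
      // Aggregate blocks until the per-tick bandwidth budget is met.
      while (blocks.length && size + blocks[0].byteLength <= budgetPerTick) {
        var b = blocks.shift();
        size += b.byteLength;
        ws.send(b);
      }
      if (blocks.length) setTimeout(tick, INTERVAL_MS);  // fires "whenever it likes"
    })();
  }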
Re: Overlap between StreamReader and FileReader
Apparently we are not talking about the same thing: while I am thinking of a high-level interface, your interface takes care of the underlying level. Like node's streams: node had to define them since they did not exist (but is anyone using node's streams as such, or does everybody use the higher levels (net, ssl/tls, http, https)?). I have been working for quite some time on projects streaming things in all possible ways inside browsers or with node, and I never felt any need for such a proposal.

So, to understand where the mismatch comes from, could you please highlight a web use case/code example based on your proposal?

Regards,

Aymeric

On 11/09/2013 18:14, Takeshi Yoshino wrote:
...snip...
Re: Overlap between StreamReader and FileReader
On Thu, Sep 12, 2013 at 10:58 PM, Aymeric Vitte vitteayme...@gmail.com wrote:

> Apparently we are not talking about the same thing: while I am thinking of a high-level interface, your interface takes care of the underlying level.

How much low-level stuff to expose would basically affect the high-level interface design, I think.

> Like node's streams: node had to define them since they did not exist (but is anyone using node's streams as such, or does everybody use ...snip... So, to understand where the mismatch comes from, could you please highlight a web use case/code example based on your proposal?

I'm still thinking about how much we should include in the API, too. This proposal is just a trial to address the requirements Isaac listed. So, each feature should correspond to some of his example code.
Re: Overlap between StreamReader and FileReader
On Fri, Sep 13, 2013 at 5:15 AM, Aymeric Vitte vitteayme...@gmail.com wrote:

> Isaac said too "So, just to be clear, I'm *not* suggesting that browser streams copy Node streams verbatim.".

I know. I wanted to kick off the discussion, which had been stopped for 2 weeks.

> Unless you want to do node inside browsers (which would be great but seems unlikely) I still don't see the relation between this kind of proposal and existing APIs.

What do you mean by existing APIs? I was thinking that we've been discussing what a Stream read/write API for manual consuming/producing by JavaScript code should look like.

> Could you please give an example very different from the ones I gave already?

Sorry, which mail? One thing I was imagining is protocol parsing, such as msgpack or protocol buffers. It's good that ArrayBuffers of the exact size are obtained. OTOH, as someone pointed out, Stream should have some flow control mechanism so as not to pull an unlimited amount of data from async storage, network, etc. readableSize in my proposal is an example of how we make the limit controllable by an app. We could also depend on the size argument of the read() call. But thinking of protocol parsing again, it's common to have small fields such as 4, 8, 16 bytes. If read(size) is configured to pull size bytes from async storage, it's inefficient. Maybe we could have some hard-coded limit, e.g. 1 MiB, and use max(hardCodedLimit, requestedReadSize). I'm fine with the latter.

> You have reverted to EventTarget too instead of promises, why?

There was no intention to object to the use of Promise. Sorry that I wasn't clear. I'm rather interested in receiving a sequence of data as it becomes available (corresponding to Jonas's ChunkedData version of the read methods) with just one read call. Sorry that I didn't mention it explicitly, but the listeners on the proposed API came from the ChunkedData object. I thought we could put them on the Stream itself by giving up the multiple-read scenario. writable/readableThreshold can be safely removed from the API if we agree it's not important. If the threshold stuff is removed, flush() and pull() will also be removed.
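For the length-header case, exact-size reads make the parser straightforward; a sketch assuming the promise-returning read(size) of the revised proposal earlier in this section (a 4-byte big-endian length prefix is assumed purely for illustration):

  // Parse length-prefixed frames: 4-byte big-endian length, then payload.
  function readFrames(stream, onFrame) {
    stream.readType = "arraybuffer";
    function next() {
      return stream.read(4).then(function(header) {
        if (header.size < 4) return;  // EoF: no complete header left
        var len = new DataView(header.data).getUint32(0);
        return stream.read(len).then(function(payload) {
          onFrame(payload.data);
          if (!payload.eof) return next();
        });
      });
    }
    return next();
  }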
Re: Overlap between StreamReader and FileReader
Isaac said too "So, just to be clear, I'm *not* suggesting that browser streams copy Node streams verbatim.".

Unless you want to do node inside browsers (which would be great but seems unlikely) I still don't see the relation between this kind of proposal and existing APIs. Could you please give an example very different from the ones I gave already?

WebCrypto seems to be waiting for a Streams interface to be able to perform simple progressive operations, which were (unexpectedly) removed from the spec, with outstanding features like the stream itself being able to predict its end... I don't think that's required, or even possible; streams inside browsers only need to handle delta data, with the rest handled by the APIs using the streams (including end of stream, flow control and co), cf. my simple proposal.

You have reverted to EventTarget too instead of promises, why?

Regards

Aymeric

On 12/09/2013 20:36, Takeshi Yoshino wrote:
...snip...
Re: Overlap between StreamReader and FileReader
Here's my all-in-one strawman proposal including some new stuff for flow control. Yes, it's too big, but it may be useful for glancing over what features are requested.

  enum StreamReadType { "", "arraybuffer", "text" };

  [Constructor(optional DOMString mime,
               optional [Clamp] long long writeBufferSize,
               optional [Clamp] long long readBufferSize)]
  interface Stream : EventTarget {
    readonly attribute DOMString type;  // MIME type

    // Writing interfaces
    readonly attribute unsigned long long writableSize;  // Bytes that can be written synchronously
    attribute unsigned long long writeBufferSize;
    attribute EventHandler onwritable;
    attribute unsigned long long writableThreshold;
    attribute EventHandler onpulled;
    attribute EventHandler onreadaborted;
    void write((DOMString or ArrayBufferView or Blob)? data);
    void flush();
    void closeWrite();
    void abortWrite();

    // Reading interfaces
    attribute StreamReadType readType;  // Must not be set after the first read()
    attribute DOMString readEncoding;
    readonly attribute unsigned long long readableSize;  // Bytes that can be read synchronously
    attribute unsigned long long readBufferSize;
    attribute EventHandler onreadable;
    attribute unsigned long long readableThreshold;
    attribute EventHandler onflush;
    attribute EventHandler onclose;  // Receives clean flag
    any read(optional [Clamp] long long size);
    any peek(optional [Clamp] long long size, optional [Clamp] long long offset);
    void skip([Clamp] long long size);
    void pull();
    void abortRead();

    // Async interfaces
    attribute EventHandler ondoneasync;  // Receives bytes skipped or Blob or undefined (when done pipeTo)
    void readAsBlob(optional [Clamp] long long size);
    void longSkip([Clamp] long long size);
    void pipeTo(Stream destination, optional [Clamp] long long size);
  };

- Encoding for text mode reading is determined by the type attribute. It can be overridden by setting the readEncoding attribute.
- Invoking read() repeatedly to pull data into the stream is annoying. So, instead, I used the writable/readableThreshold approach.
- Not to bloat the API any more, the error/close signaling interface is limited to EventHandlers only.
- stream.read() means stream.read(stream.readableSize).
- After the onclose invocation, it's guaranteed that all written bytes are available for read.
- read() is non-blocking. It returns only what is synchronously readable. If there aren't enough bytes (investigate the readableSize attribute), an app should wait until the next invocation of onreadable. readBufferSize and readableThreshold may be modified accordingly, and pull() may be called.
- stream.read(size) returns an ArrayBuffer or DOMString of min(size, stream.readableSize) bytes that is synchronously readable now.
- When readType is set to "text", read() throws an EncodingError if an invalid sequence is found. An incomplete sequence will be left unconsumed. If there's an incomplete sequence at the end of the stream, the app can detect that by checking the size attribute after the onclose invocation and a read() call.
- The readableSize attribute returns (number of readable bytes as of the last time the event loop started executing a task) - (bytes consumed by the read() method).
- onflush is separated from onreadable since it's possible that an intermediate Stream in a long chain has no data to flush but the next or later Streams have.
- Invocation order is onreadable -> onflush or onclose.
- Flush handling code must be implemented on both onflush and onclose. On a close() call, only onclose is invoked, to reduce event propagation cost.
- Pass a read/writeBufferSize of -1 to the constructor, or set stream.read/writeBufferSize to -1, for unlimited buffering.
- Instead of having write(buffer, cb), write() accepts data of any size regardless of writeBufferSize. XHR should respect writeBufferSize and write only writableSize bytes of data, set onwritable if necessary, and possibly also set writableThreshold.
- {writable,readable}Threshold are 0 by default, meaning that onwritable and onreadable are invoked immediately when there's space/data available.
- If {writable,readable}Threshold are greater than the capacity, they are considered to be set to the capacity.
- onwritable/onreadable is invoked asynchronously when
  -- new space/data that satisfies writable/readableThreshold becomes available as a result of a read()/write() operation
  onreadable is invoked asynchronously when
  -- flush()-ed or close()-ed
- onwritable/onreadable is invoked synchronously when
  -- onwritable/onreadable is updated and there's space/data available that satisfies writable/readableThreshold
  -- writable/readableThreshold is updated and there's space/data available that satisfies the new writable/readableThreshold
  -- new space/data that satisfies writable/readableThreshold becomes available as a result of updating the capacity
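A consumer sketch against this strawman for a fixed-size-record protocol; processRecord() is a placeholder:

  // Wait until at least one full record is buffered, then read synchronously.
  var RECORD = 1024;
  stream.readType = "arraybuffer";
  stream.readableThreshold = RECORD;      // fire onreadable only with >= 1 record
  stream.onreadable = function() {
    while (stream.readableSize >= RECORD) {
      processRecord(stream.read(RECORD)); // non-blocking, synchronous read
    }
  };
  stream.onclose = function() {
    // All written bytes are readable once onclose fires (see notes above).
    if (stream.readableSize > 0) processRecord(stream.read());
  };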
Re: Overlap between StreamReader and FileReader
I forgot to add an attribute to specify the max size of the backing store. Maybe it should be added to the constructor.

On Wed, Sep 11, 2013 at 11:24 PM, Takeshi Yoshino tyosh...@google.com wrote:

>   any peek(optional [Clamp] long long size, optional [Clamp] long long offset);

peek() with an offset doesn't make sense for text mode reading. Some exception should be thrown in that case.

> - The readableSize attribute returns (number of readable bytes as of the last time the event loop started executing a task) - (bytes consumed by the read() method).

+ (bytes added by write() and transferred to the read buffer synchronously)

The concept of this interface is
- to allow bulk transfer from internal asynchronous storage (e.g. network, disk-based backing store) to the JS world but delay conversion (e.g. into DOMString, ArrayBuffer);
- not to ask an app to do such transfer explicitly.
Re: Overlap between StreamReader and FileReader
On Fri, Aug 23, 2013 at 2:41 AM, Isaac Schlueter i...@izs.me wrote: 1. Drop the read n bytes part of the API entirely. It is hard to do I'm ok with that. But then, instead we need to evolve ArrayBuffer to have powerful concat/slice functionality for performance. Re: slicing, we can just make APIs to accept ArrayBufferView. How should we deal with concat operation? You suggested that we add unshift(), but repeating read and unshift until we get enough data sound not so good. For example, currently TextDecoder (http://encoding.spec.whatwg.org/) accepts one ArrayBufferView and outputs one DOMString. We can use stream mode of TextDecoder to get multiple output DOMStrings and then concatenate them to get the final result. As we still don't have StringBuilder, it's not considered to be a big deal to have ArrayBufferBuilder (Stream.read(size) is kinda ArrayBuffer builder)? Is any of you guys thinking about introducing something like Node.js's Buffer class for decoding and tokenization? TextDecoder+Stream would be a kind of such classes. I also considered making read() operation to accept pre-allocated ArrayBuffer and return the number of bytes written. stream.read(buffer) If written data is insufficient, the user can continue to pass the same buffer to fill the unused space. But, since DOMString is immutable, we can't take the same approach for readText() op. see in Node), and complicates the internal mechanisms. People think they need it, but what they really need is readUntil(delimiterChar). What if implementing length header based protocol, e.g. msgpack? 2. Reading strings vs ArrayBuffers or other types of things MUST be a property of the stream, Fixed property or mutable via readType attribute? If readType, the sequence of UTF8/binary mixed read() problem remains. 3. Sync vs async read(). Let's dig into the issue of `var d = s.read()` vs `s.read(function(d) {})` for getting data out of a stream. ...snip... buffering to occur if you have pipe chains of streams that are processing at different speeds, where one is bursty and the other is consistent. Clarification. You're saying that always posting cb to task queue is wasteful. Right? Anyway, I think it makes sense. If read is designed to invoke cb synchronously, it'll be difficult to avoid stack overflow. So the only options is to always run cb in the next task. stream.poll(function ondata() { What happens if unshift() is called? poll() invokes ondata() only when new data (unshift()-ed data is not included) is available? var d = stream.read(); while (stream.state === 'OK') { processData(d); d = stream.read(); } Is Jonas right about the reason why we need loop here? I.e. to avoid automatic merge/serialization of buffered chunks? switch (stream.state) { case 'EOF': onend(); break; case 'EWOULDBLOCK': stream.poll(ondata); break; default: onerror(new Error('Stream read error: ' + stream.state)); We could distinguish these three states by null, empty ArrayBuffer/DOMString, and non-empty ArrayBuffer/DOMString? ReadableStream.prototype.readAll = function(onerror, ondata, onend) { onpoll(); function onpoll() { If we decide not to allow multiple concurrent read operations on a stream, can we just use event handler approach. stream.onerror = ... stream.ondata = ... 4. Passive data listening. In Node v0.10, it is not possible to passively listen to the data passing through a stream without affecting the state of the stream. 
This is corrected in v0.12, by making the read() method also emit a 'data' event whenever it returns data, so v0.8-style APIs work as they used to. The takeaway here is not to do what Node did, but to learn what Node learned: the passive-data-listening use case is relevant.

What's the use case?

5. Piping. It's important to consider how any proposed readable stream API will allow one to respond to backpressure, and how it relates to a *writable* stream API. Data management from a source to a destination is the fundamental raison d'être for streams, after all.

I'd have onwritable and onreadable handlers, make their thresholds configurable, and let pipe() set them up.
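Point 1 in the exchange above argues that readUntil(delimiterChar) is what people actually need, and that it is trivial on top of unshift(). A rough sketch of that claim, assuming a hypothetical stream whose read() resolves with the next buffered Uint8Array chunk (or null at EOF) and where delimiter is a byte value; none of these names are proposed API:

```
// readUntil layered on read() + unshift(): scan chunks for the delimiter,
// keep everything up to and including it, and put the surplus back.
async function readUntil(stream, delimiter) {
  const parts = [];
  for (;;) {
    const chunk = await stream.read();        // next buffered Uint8Array
    if (chunk === null) break;                // EOF before the delimiter
    const i = chunk.indexOf(delimiter);
    if (i === -1) { parts.push(chunk); continue; }
    parts.push(chunk.subarray(0, i + 1));     // keep the delimiter itself
    stream.unshift(chunk.subarray(i + 1));    // return the excess bytes
    break;
  }
  // Concatenate the collected pieces into a single buffer.
  const total = parts.reduce((n, p) => n + p.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const p of parts) { out.set(p, offset); offset += p.length; }
  return out;
}
```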
Re: Overlap between StreamReader and FileReader
On Fri, Aug 9, 2013 at 12:47 PM, Isaac Schlueter i...@izs.me wrote:

Jonas, What does *progress* mean here? So, you do something like this: var p = stream.read() to get a promise (of some sort). That read() operation is (if we're talking about TCP or FS) a single operation. There's no "50% of the way done reading" moment that you'd care to tap into. Even from a conceptual point of view, the data is either: a) available (and the promise is now fulfilled) b) not yet available (and the promise is not yet fulfilled) c) known to *never* be available because: i) we've reached the end of the stream (and the promise is fulfilled with some sort of EOF sentinel), or ii) because an error happened (and the promise is broken). So.. where's the progress? A single read() operation seems like it ought to be atomic to me, and indeed, the read[2] function either returns some data (a), no data (c-i), raises EWOULDBLOCK (b), or raises some other error (c-ii). But, whichever of those it does, it does right away. We only get woken up again (via epoll/kqueue/CPIO/etc) once we know that the file descriptor (or HANDLE in Windows) is readable again (and thus, it's worthwhile to attempt another read[2] operation).

Hi Isaac, Sorry for taking so long to respond. It took me a while to understand where the disconnect came from. I also needed to mull over how a consumer is actually likely to consume data from a Stream. Having looked over the Node.js API more I think I see where the misunderstanding is coming from. The source of confusion is likely that Node.js and the proposal in [1] are very different. Specifically, in Node.js the read() operation is synchronous and operates on currently buffered data. In [1] the read() operation is asynchronous and isn't restricted to just the currently buffered data.

From my point of view there are two rough categories of ways of reading data from an asynchronous Stream:

A) The Stream hands data to the consumer as soon as the data is available. I.e. the stream doesn't buffer data longer than until the next opportunity to fire a callback to the consumer.

B) The Stream allows the consumer to pull data out of the stream at whatever pace, and in whatever block size, the consumer finds appropriate. If the data isn't yet available, a callback is used to notify the consumer when it is.

A is basically the Stream pushing the data to the consumer. And B is the consumer pulling the data from the Stream.

In Node.js doing A looks something like:

stream.on('readable', function() { var buffer; while((buffer = stream.read())) { processData(buffer); } });

In the proposal in [1] you would do this with the following code:

stream.readBinaryChunked().ondata = function(e) { processData(e.data); }

(side-note: it's unclear to me why the Node.js API forces readable.read() to be called in a loop. Is that to avoid having to flatten internal buffer fragments? Without that the two would essentially be the same with some minor syntactical differences)

Here it definitely doesn't make sense to deliver progress notifications. Rather than delivering a progress notification to the consumer, you simply deliver the data.
The way you would do B in Node.js looks something like:

stream.on('readable', function() { var buffer; if ((buffer = stream.read(10))) { processTenBytes(buffer); } });

The same thing using the proposal in [1] looks like:

stream.readBinary(10).then(function(buffer) { processTenBytes(buffer); });

An important difference here is that in the Node.js API, the 'read 10 bytes' operation either immediately returns a result, or it immediately fails, depending on how much data we currently have buffered. I.e. the read() call is synchronous. The caller is expected to keep calling read(10) until the call succeeds. Though of course there's also a very useful callback which makes the calling again very easy. But between the calls to read() the Stream doesn't really have knowledge that someone is waiting to read 10 bytes of information.

The API in [1] instead makes the read() call asynchronous. That means that we can always let the call succeed (unless there's an error on the stream of course). If we don't have enough data buffered currently, we simply call the success callback later than if we had had all requested data buffered already. This is the place where delivering progress notifications could also be done, though this is by no means an important aspect of the API. But since the read() operation is asynchronous, we can deliver progress notifications as we buffer up enough data to fulfill it. I hope that makes it more clear how progress notifications play in.

So to be clear, progress notifications are by no means the important difference here. The important difference is whether we make read() be synchronous and operating on the current buffered data, or if we make it asynchronous and operating on the full data stream. As far as I can tell
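The synchronous-read-over-buffer versus asynchronous-read-over-stream distinction Jonas draws can be made concrete with a toy pull stream whose read always returns a promise, resolved later if the data has not arrived yet. A sketch assuming a single outstanding read and no error/EOF handling; all names are illustrative:

```
// Toy version of the asynchronous pull model in [1]: readBinary(n) always
// succeeds eventually, because the stream remembers who is waiting.
class PullStream {
  constructor() { this.buffer = []; this.pending = null; }
  readBinary(size) {
    return new Promise((resolve) => {
      this.pending = { size, resolve };  // one outstanding read at a time
      this._tryFulfill();
    });
  }
  _write(chunk) {             // called by the producer (e.g. the network)
    this.buffer.push(...chunk);
    this._tryFulfill();       // maybe this chunk completes a waiting read
  }
  _tryFulfill() {
    const p = this.pending;
    if (p && this.buffer.length >= p.size) {
      this.pending = null;
      p.resolve(new Uint8Array(this.buffer.splice(0, p.size)));
    }
  }
}
```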
Re: Overlap between StreamReader and FileReader
On Fri, Aug 9, 2013 at 7:36 PM, Domenic Denicola dome...@domenicdenicola.com wrote:

Another way of looking at it, is that a streaming API is itself incremental and cancellable. It makes no sense to say that each read from or write to the stream is *also* incremental and cancellable; why introduce another layer of entirely-unnecessary depth before you reach the atomic level of non-incremental, non-cancellable reads/writes? What use case does that serve?

I'm pretty sold on the argument that making individual reads cancellable is a bad idea. But note that the original proposal does not make individual reads incremental. Progress events are not the same thing as incremental reads. I really think talking about progress notifications being there or not is focusing on the wrong question. Nothing would substantially change if we made the original proposal return Promises rather than ProgressPromises. / Jonas
Re: Overlap between StreamReader and FileReader
Le 22/08/2013 09:28, Jonas Sicking a écrit :

Does anyone have examples of code that uses the Node.js API? I'd love to look at how people practically end up consuming data?

I am doing something like this:

```
var parse = function() {
  // process this.stream_
  this.queue_.shift();
  if (this.queue_.length) {
    this.queue_[0]();
  }
};

var process = function(data) {
  return function() {
    this.stream_ = [this.stream_, data].concatBuffers();
    parse.call(this);
  };
};

var on_data = function(data) {
  this.queue_ = this.queue_ || [];
  this.queue_.push(process(data).bind(this));
  if (this.queue_.length === 1) {
    this.queue_[0]();
  }
};

request.on('data', function(data) {
  on_data.call(this, data);
});
```

I don't remember exactly if it's due to my implementation or node (because I am using both node's Buffers and Typed Arrays) but I experienced some problems where data was modified while it was being processed; that's why this.stream_ is freezing the data received (with remaining bytes received earlier, see next sentence) until it is processed. Coming back to my previous TextEncoder/Decoder remark for utf-8 parsing, I don't know how to do this with native node functions.

Regards

Aymeric

-- jCore Email : avi...@jcore.fr iAnonym : http://www.ianonym.com node-Tor : https://www.github.com/Ayms/node-Tor GitHub : https://www.github.com/Ayms Web : www.jcore.fr Extract Widget Mobile : www.extractwidget.com BlimpMe! : www.blimpme.com
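One common cause of the "data modified while it was being processed" problem Aymeric mentions is the producer reusing a chunk's underlying buffer. A defensive sketch, assuming chunks can be treated as Uint8Array views (this is an illustration of the workaround, not a claim about what node did at the time):

```
// Snapshot each incoming chunk before queueing it, so later reuse of the
// producer's buffer cannot change data that is still waiting to be parsed.
request.on('data', function(data) {
  var copy = new Uint8Array(data.length);
  copy.set(data);              // copies the bytes as they are right now
  on_data.call(this, copy);    // queue the stable copy instead of `data`
});
```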
Re: Overlap between StreamReader and FileReader
So, just to be clear, I'm *not* suggesting that browser streams copy Node streams verbatim.

In Node.js doing A looks something like: stream.on('readable', function() { var buffer; while((buffer = stream.read())) { processData(buffer); } });

Not quite. In Node.js, doing A looks like:

stream.on('data', processData);

In my opinion, marrying a browser stream implementation to an EventEmitter abstraction would be a mistake. I also think that marrying it to a Promise implementation would be a mistake. As popular as Promises are, they are an additional layer of abstraction that is not fundamentally related to streaming data, and it is trivial to turn: obj.method(fn) into: obj.method().then(fn) at the user/library level. This allows performance-critical applications to avoid any unnecessary complexity and shorten their code paths as much as possible, but is easily extended for those who prefer promises (or generators, or coroutines, or what have you.) Despite what you may see on twitter or mailing lists, the choice to use this minimal abstraction for Node's asynchrony has allowed all these different things to coexist rather peacefully, and I believe that it is a great success. Even if you feel that promises or generators are the best thing since generational garbage collection (and certainly, both have their merits), I think it is worth exploring where such a constraint would lead us.

So far in this conversation, I've been mostly just trying to figure out what the state of things is, and pointing out what I see as potential hazards. Here are some pro-active suggestions, but this is still not anything I'm particularly in love with, so treat it only as an exploration of the problem space.

1. Drop the read n bytes part of the API entirely. It is hard to do in a way that makes sense for both binary and string streams (as we see in Node), and complicates the internal mechanisms. People think they need it, but what they really need is readUntil(delimiterChar). And, that's trivial to implement on top of unshift(). So let's just add unshift(chunk).

2. Reading strings vs ArrayBuffers or other types of things MUST be a property of the stream, not of the read() call. Having readBinary() and readUtf8(), or read(encoding), is a terrible idea which bloats the API surface and exposes multibyte landmines. The easiest way to do this is to make the API agnostic as to the specific data type returned. If we ditch read(number of bytes), then this becomes much simpler, and also allows for things like streaming JSON parsers that return arbitrary JavaScript objects.

3. Sync vs async read(). Let's dig into the issue of `var d = s.read()` vs `s.read(function(d) {})` for getting data out of a stream. The problem is that this assumes that each call to read() will be done when there is no data buffered, and will result in a call to the underlying system, requiring some async stuff. However, that's not always the case, and a lot of times, you actually want a bit of buffering to occur if you have pipe chains of streams that are processing at different speeds, where one is bursty and the other is consistent. For example, consider a situation where you're interacting with a local database, and also a 3g network connection. The 3g connection will be either very fast or completely not moving, and the local database will be relatively stable.
If you're reading the data from the network connection, and putting it into the db, you don't want to pause the 3g connection unnecessarily and miss a potential burst just because you were waiting for the db. You *also* don't want those bursts to overwhelm your buffer, of course. The solution for this is to have some pre-defined buffer in the stream implementation, so that you only pause the bursty stream if the slow-and-steady stream can't keep up.

If you have a readable stream which is buffering its data in memory, then doing `s.read(cb)` is always going to be strictly more expensive than doing `var d = s.read()`. The only way to make an async read not more expensive than a sync returning read is for the callback to be inline-able, and called immediately. However, this means that it is no longer possible to reason about any particular read() call, and so this releases Zalgo. For example:

```
console.log('before');
stream.read(function(data) { console.log('got data') });
console.log('after');
```

The ordering of logs must be predictable, which means that we must *always* defer the callback's execution until at least the end of the current run-to-completion. This isn't free. This problem could potentially be solved if we used synchronous reads, but mirrored the epoll-like behavior more closely than Node.js does today, without the read(n) mistake.

```
stream.poll(function ondata() {
  var d = stream.read();
  while (stream.state === 'OK') {
    processData(d);
    d = stream.read();
  }
  switch (stream.state) {
    case 'EOF':
      onend();
      break;
    case 'EWOULDBLOCK':
      stream.poll(ondata);
      break;
    default:
      onerror(new Error('Stream read error: ' + stream.state));
  }
});
```
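The "always defer" rule above can be sketched directly: even when data is already buffered, the callback runs in a later task, so the before/after ordering never depends on buffer state. hasBufferedData, takeBufferedData, and once are hypothetical helpers, not proposed API:

```
// Async read that never calls back synchronously, avoiding released Zalgo.
function read(stream, callback) {
  if (stream.hasBufferedData()) {
    // Data is ready, but the callback is still scheduled for a later task.
    setTimeout(function() { callback(stream.takeBufferedData()); }, 0);
  } else {
    stream.once('readable', function() {
      callback(stream.takeBufferedData());
    });
  }
}

console.log('before');
read(stream, function(data) { console.log('got data'); });
console.log('after');   // always logs before 'got data', buffered or not
```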
Re: Overlap between StreamReader and FileReader
On Thu, Aug 8, 2013 at 7:40 PM, Austin William Wright a...@bzfx.net wrote:

I believe the term is congestion control such as the TCP congestion control algorithm.

As I've heard the term used, congestion control is slightly different than flow control or tcp backpressure, but they are related concepts, and yes, your point is dead-on, Austin, this is absolutely 100% essential. Any Stream API that treats backpressure as an issue to handle later is not a Stream API, and is clearly not ready to even bother discussing.

On Thu, Aug 8, 2013 at 7:40 PM, Austin William Wright a...@bzfx.net wrote:

I think there's some confusion as to what the abort() call is going to do exactly.

Yeah, I'm rather confused by that as well. A read[2] operation typically can't be canceled because it's synchronous. Let's back up just a step here, and talk about the fundamental purpose of an API like this. Here's a strawman:

- A Readable Stream is an abstraction representing an ordered set of data which may or may not be finite, some or all of which may arrive at a future time, which can be consumed at any arbitrary rate up to the rate at which data is arriving, without causing excessive memory use. It provides a mechanism to send the data into a Writable Stream, and for being alerted to errors in the underlying implementation.

- A Writable Stream is an abstraction representing a destination where data is written, where any given write operation may be completely flushed to the underlying implementation immediately or at some point in the future. It provides a mechanism for determining when more data can be safely written without causing excessive memory usage, and for being alerted to errors in the underlying implementation.

- A Duplex Stream is an abstraction that implements both the Readable Stream and Writable Stream interfaces. There may or may not be any specific connection between the two sets of functionality. (For example, it may represent a tcp socket file descriptor, or any arbitrary readable/writable API that one can imagine.)

For any stream implementation, I typically try to ask: How would you build a non-blocking TCP implementation using this abstraction? This might just be my bias coming from Node.js, but I think it's a fair test of a Stream API that will be used on the web, where TCP is the standard. Here are some things that need to work 100%, assuming a Readable.pipe(Writable) method:

fastReader.pipe(slowWriter)
slowReader.pipe(fastWriter)
socket.pipe(socket) // echo server
socket.pipe(new gzipDeflate()).pipe(socket)
socket.pipe(new gzipInflate()).pipe(socket)

Node's streams, as of 0.11.5, are pretty good. However, they've evolved rather than having been intelligently designed, so in many areas, the API surface is not as elegant as it could be. In particular, I think that relying on an EventEmitter interface is an unfortunate choice that should not be repeated in this specification. The language has new features, and Promises are somewhat well-understood now (and weren't as much then). But Node streams have definitely got a lot of play-testing that we can lean on when designing something better. Calling read() repeatedly is much less convenient than doing something like `stream.on('data', doSomething)`. Additionally, you often want to spy on a Stream, and get access to its data chunks as they come in, without being the main consumer of the Stream.
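A sketch of the smallest pipe() that could pass Isaac's tests: it has to propagate backpressure in both directions. pause()/resume(), write() returning false, and the 'drain' event follow Node conventions but are used here purely for illustration:

```
// Minimal pipe() with backpressure, in the spirit of the strawman above.
function pipe(readable, writable) {
  readable.on('data', function(chunk) {
    if (!writable.write(chunk)) {      // sink buffer is over its threshold
      readable.pause();                // stop pulling from the source
      writable.once('drain', function() {
        readable.resume();             // sink caught up; open the tap again
      });
    }
  });
  readable.on('end', function() { writable.end(); });
  return writable;                     // enables socket.pipe(x).pipe(socket)
}
```

With this shape, fastReader.pipe(slowWriter) throttles the reader, and slowReader.pipe(fastWriter) simply never hits the pause branch.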
Re: Overlap between StreamReader and FileReader
On Thu, Aug 8, 2013 at 7:40 PM, Austin William Wright a...@bzfx.net wrote: On Thu, Aug 8, 2013 at 2:56 PM, Jonas Sicking jo...@sicking.cc wrote: On Thu, Aug 8, 2013 at 6:42 AM, Domenic Denicola dome...@domenicdenicola.com wrote: From: Takeshi Yoshino [mailto:tyosh...@google.com] On Thu, Aug 1, 2013 at 12:54 AM, Domenic Denicola dome...@domenicdenicola.com wrote:

Hey all, I was directed here by Anne helpfully posting to public-script-coord and es-discuss. I would love a summary of what proposal is currently under discussion: is it [1]? Or maybe some form of [2]? [1]: https://rawgithub.com/tyoshino/stream/master/streams.html [2]: http://lists.w3.org/Archives/Public/public-webapps/2013AprJun/0727.html

I'm drafting [1] based on [2] and summarizing comments on this list in order to build up a concrete algorithm and get consensus on it.

Great! Can you explain why this needs to return an AbortableProgressPromise, instead of simply a Promise? All existing stream APIs (as prototyped in Node.js and in other environments, such as in js-git's multi-platform implementation) do not signal progress or allow aborting at the "during a chunk" level, but instead count on you recording progress by yourself depending on what you've seen come in so far, and aborting on your own between chunks. This allows better pipelining and backpressure down to the network and file descriptor layer, from what I understand.

Can you explain what you mean by "This allows better pipelining and backpressure down to the network and file descriptor layer"?

I believe the term is congestion control such as the TCP congestion control algorithm. That is, don't send data to the application faster than it can parse it or pass it off, or otherwise some mechanism to allow the application to throttle down the incoming flow, essential to any networked application like the Web.

I don't think that congestion control is affected by progress notifications at all. And it is certainly not affected by whether the progress notifications fire from the Promise object or from another object. Progress notifications don't affect when or how data is being read. They only tell you about the reads that other APIs are doing.

I think there's some confusion as to what the abort() call is going to do exactly.

This is a good question. I.e. does calling abort() on a Promise returned from Stream.read() only cancel that read, or does it also cancel the whole Stream? I could definitely see that as an argument for returning ProgressPromise rather than AbortableProgressPromise from Stream.read() and instead sticking an abort() function on Stream. In any case, this seems like an orthogonal issue to progress notifications being there or not. / Jonas
Re: Overlap between StreamReader and FileReader
Jonas, What does *progress* mean here? So, you do something like this: var p = stream.read() to get a promise (of some sort). That read() operation is (if we're talking about TCP or FS) a single operation. There's no "50% of the way done reading" moment that you'd care to tap into. Even from a conceptual point of view, the data is either: a) available (and the promise is now fulfilled) b) not yet available (and the promise is not yet fulfilled) c) known to *never* be available because: i) we've reached the end of the stream (and the promise is fulfilled with some sort of EOF sentinel), or ii) because an error happened (and the promise is broken). So.. where's the progress? A single read() operation seems like it ought to be atomic to me, and indeed, the read[2] function either returns some data (a), no data (c-i), raises EWOULDBLOCK (b), or raises some other error (c-ii). But, whichever of those it does, it does right away. We only get woken up again (via epoll/kqueue/CPIO/etc) once we know that the file descriptor (or HANDLE in Windows) is readable again (and thus, it's worthwhile to attempt another read[2] operation).

Now, it *might* make sense to say that the entire Stream as a whole is a ProgressPromise of sorts. But, since you often don't know the eventual size of the data ahead of time (and indeed, it will often be unbounded), progress is an odd concept in this context. Are you proposing that every step in the TCP dance is somehow exposed on the promise returned by read()? That seems rather inconvenient and unnecessary, not to mention difficult to implement, since the TCP stack is typically in kernel space.

On Fri, Aug 9, 2013 at 11:45 AM, Jonas Sicking jo...@sicking.cc wrote: On Thu, Aug 8, 2013 at 7:40 PM, Austin William Wright a...@bzfx.net wrote: On Thu, Aug 8, 2013 at 2:56 PM, Jonas Sicking jo...@sicking.cc wrote: On Thu, Aug 8, 2013 at 6:42 AM, Domenic Denicola dome...@domenicdenicola.com wrote: From: Takeshi Yoshino [mailto:tyosh...@google.com] On Thu, Aug 1, 2013 at 12:54 AM, Domenic Denicola dome...@domenicdenicola.com wrote:

Hey all, I was directed here by Anne helpfully posting to public-script-coord and es-discuss. I would love a summary of what proposal is currently under discussion: is it [1]? Or maybe some form of [2]? [1]: https://rawgithub.com/tyoshino/stream/master/streams.html [2]: http://lists.w3.org/Archives/Public/public-webapps/2013AprJun/0727.html

I'm drafting [1] based on [2] and summarizing comments on this list in order to build up a concrete algorithm and get consensus on it.

Great! Can you explain why this needs to return an AbortableProgressPromise, instead of simply a Promise? All existing stream APIs (as prototyped in Node.js and in other environments, such as in js-git's multi-platform implementation) do not signal progress or allow aborting at the "during a chunk" level, but instead count on you recording progress by yourself depending on what you've seen come in so far, and aborting on your own between chunks. This allows better pipelining and backpressure down to the network and file descriptor layer, from what I understand.

Can you explain what you mean by "This allows better pipelining and backpressure down to the network and file descriptor layer"?

I believe the term is congestion control such as the TCP congestion control algorithm.
That is, don't send data to the application faster than it can parse it or pass it off, or otherwise some mechanism to allow the application to throttle down the incoming flow, essential to any networked application like the Web.

I don't think that congestion control is affected by progress notifications at all. And it is certainly not affected by whether the progress notifications fire from the Promise object or from another object. Progress notifications don't affect when or how data is being read. They only tell you about the reads that other APIs are doing.

I think there's some confusion as to what the abort() call is going to do exactly.

This is a good question. I.e. does calling abort() on a Promise returned from Stream.read() only cancel that read, or does it also cancel the whole Stream? I could definitely see that as an argument for returning ProgressPromise rather than AbortableProgressPromise from Stream.read() and instead sticking an abort() function on Stream. In any case, this seems like an orthogonal issue to progress notifications being there or not. / Jonas
RE: Overlap between StreamReader and FileReader
Isaac has essentially explained what I was getting at earlier, except much more clearly. When I said "this allows better pipelining and backpressure down to the network and file descriptor layer," I was essentially saying that implementing read or write operations as cancellable and incremental does not fit well with making them atomic operations that can fit into the architecture of streams with flow control. And, as Isaac again eloquently pointed out, streams without flow control are not streams at all. (You're Missing the Point of Streams, anyone? :P)

Another way of looking at it, is that a streaming API is itself incremental and cancellable. It makes no sense to say that each read from or write to the stream is *also* incremental and cancellable; why introduce another layer of entirely-unnecessary depth before you reach the atomic level of non-incremental, non-cancellable reads/writes? What use case does that serve?
RE: Overlap between StreamReader and FileReader
From: Takeshi Yoshino [mailto:tyosh...@google.com] On Thu, Aug 1, 2013 at 12:54 AM, Domenic Denicola dome...@domenicdenicola.com wrote:

Hey all, I was directed here by Anne helpfully posting to public-script-coord and es-discuss. I would love a summary of what proposal is currently under discussion: is it [1]? Or maybe some form of [2]? [1]: https://rawgithub.com/tyoshino/stream/master/streams.html [2]: http://lists.w3.org/Archives/Public/public-webapps/2013AprJun/0727.html

I'm drafting [1] based on [2] and summarizing comments on this list in order to build up a concrete algorithm and get consensus on it.

Great! Can you explain why this needs to return an AbortableProgressPromise, instead of simply a Promise? All existing stream APIs (as prototyped in Node.js and in other environments, such as in js-git's multi-platform implementation) do not signal progress or allow aborting at the "during a chunk" level, but instead count on you recording progress by yourself depending on what you've seen come in so far, and aborting on your own between chunks. This allows better pipelining and backpressure down to the network and file descriptor layer, from what I understand.
RE: Overlap between StreamReader and FileReader
From: Takeshi Yoshino [tyosh...@google.com] Sorry, which one? stream.Readable's readable event and read method? Exactly. I agree flow control is an issue not addressed well yet and needs to be fixed. I would definitely suggest thinking about it as soon as possible, since it will likely have a significant effect on the overall API. For example, all this talk of standardizing ProgressPromise (much less AbortableProgressPromise) will likely fall by the wayside once you consider how it hurts flow control.
Re: Overlap between StreamReader and FileReader
On Thu, Aug 8, 2013 at 6:42 AM, Domenic Denicola dome...@domenicdenicola.com wrote: From: Takeshi Yoshino [mailto:tyosh...@google.com] On Thu, Aug 1, 2013 at 12:54 AM, Domenic Denicola dome...@domenicdenicola.com wrote:

Hey all, I was directed here by Anne helpfully posting to public-script-coord and es-discuss. I would love a summary of what proposal is currently under discussion: is it [1]? Or maybe some form of [2]? [1]: https://rawgithub.com/tyoshino/stream/master/streams.html [2]: http://lists.w3.org/Archives/Public/public-webapps/2013AprJun/0727.html

I'm drafting [1] based on [2] and summarizing comments on this list in order to build up a concrete algorithm and get consensus on it.

Great! Can you explain why this needs to return an AbortableProgressPromise, instead of simply a Promise? All existing stream APIs (as prototyped in Node.js and in other environments, such as in js-git's multi-platform implementation) do not signal progress or allow aborting at the "during a chunk" level, but instead count on you recording progress by yourself depending on what you've seen come in so far, and aborting on your own between chunks. This allows better pipelining and backpressure down to the network and file descriptor layer, from what I understand.

Can you explain what you mean by "This allows better pipelining and backpressure down to the network and file descriptor layer"? I definitely agree that we don't want to cause too large a performance overhead. But it's not obvious to me how performance is affected by putting progress and/or aborting functionality on the returned Promise instance, rather than on a separate object (which you suggested in another thread).

We should absolutely learn from Node.js and other environments. Do you have any pointers to discussions about why they didn't end up with progress in their "read a chunk" API? / Jonas
Re: Overlap between StreamReader and FileReader
On Thu, Aug 8, 2013 at 2:56 PM, Jonas Sicking jo...@sicking.cc wrote: On Thu, Aug 8, 2013 at 6:42 AM, Domenic Denicola dome...@domenicdenicola.com wrote: From: Takeshi Yoshino [mailto:tyosh...@google.com] On Thu, Aug 1, 2013 at 12:54 AM, Domenic Denicola dome...@domenicdenicola.com wrote:

Hey all, I was directed here by Anne helpfully posting to public-script-coord and es-discuss. I would love a summary of what proposal is currently under discussion: is it [1]? Or maybe some form of [2]? [1]: https://rawgithub.com/tyoshino/stream/master/streams.html [2]: http://lists.w3.org/Archives/Public/public-webapps/2013AprJun/0727.html

I'm drafting [1] based on [2] and summarizing comments on this list in order to build up a concrete algorithm and get consensus on it.

Great! Can you explain why this needs to return an AbortableProgressPromise, instead of simply a Promise? All existing stream APIs (as prototyped in Node.js and in other environments, such as in js-git's multi-platform implementation) do not signal progress or allow aborting at the "during a chunk" level, but instead count on you recording progress by yourself depending on what you've seen come in so far, and aborting on your own between chunks. This allows better pipelining and backpressure down to the network and file descriptor layer, from what I understand.

Can you explain what you mean by "This allows better pipelining and backpressure down to the network and file descriptor layer"?

I believe the term is congestion control such as the TCP congestion control algorithm. That is, don't send data to the application faster than it can parse it or pass it off, or otherwise some mechanism to allow the application to throttle down the incoming flow, essential to any networked application like the Web. I think there's some confusion as to what the abort() call is going to do exactly.
Re: Overlap between StreamReader and FileReader
On Tue, Jul 30, 2013 at 10:27 PM, Takeshi Yoshino tyosh...@google.com wrote: On Tue, Jul 30, 2013 at 12:07 PM, Jonas Sicking jo...@sicking.cc wrote:

could contain an encoding. Then when stream.readText is called, if there's an explicit encoding, it would use that encoding when

Do you think it should also have overrideMimeType like XHR?

I think that use case is rare enough that we can solve it by letting the author create a new Stream object, which presumably would allow specifying a type for that stream, and then feed that new stream the contents of the old stream.

OK. One question about readText is what size should mean and how to handle an incomplete chunk.

a) maxSize means the size of the DOMString. readText reads data until it builds a DOMString of maxSize or EoF is reached
b) maxSize means the size of raw bytes
b-i) buffer incomplete bytes for the next read
b-ii) fail if the decoder didn't consume all read data (of maxSize bytes)

b-ii) is simple but inconvenient: users need to know in advance the number of bytes the next text data occupies. Maybe b-i), and in case read() is issued after an incomplete readText, an exception should be thrown. This is the kind of mutual exclusiveness Anne was worrying about.

This is an excellent question. I have the same reaction regarding b-ii and b-i. And I'd also lean towards b-i with the caveat that you are raising. Another issue is that it would be great to support reading null-terminated strings. I.e. rather than reading a particular size (binary or decoded size), being able to read until a null terminator is consumed. That seems like something that is likely to come up.

I do agree that having a single stream type which represents both binary streams and text streams does make things more painful. Specifically it makes the b-i solution above more painful. However having separate types for binary and text streams also creates problems. Specifically it makes it a lot harder to parse a data format which contains a combination of text and binary data. I don't see a good solution to supporting that without bringing in worse issues than the b-i issue above.

So I'm personally still leaning towards sticking both binary and text support on the same Stream interface. And then using b-i above. Reading until null termination can probably wait for now. But I'd be interested to hear counter proposals. / Jonas
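A sketch of option b-i, which both seem to lean towards: maxSize counts raw bytes, and an incomplete trailing character sequence is carried over to the next call. TextDecoder's stream mode (a real, standard API) does the carrying; stream.read(n) resolving with up to n raw bytes is an assumption:

```
// Option b-i sketched: one decoder per stream holds any split multi-byte
// sequence until the following readText call supplies the rest.
const decoder = new TextDecoder('utf-8');
async function readText(stream, maxSize) {
  const bytes = await stream.read(maxSize);   // up to maxSize raw bytes
  // stream: true holds back a trailing incomplete sequence instead of
  // emitting a replacement character.
  return decoder.decode(bytes, { stream: true });
}
```

The caveat Takeshi raises shows up here as decoder state: a plain binary read() issued while the decoder still holds bytes would silently skip them, which is why an exception on that path seems warranted.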
Re: Overlap between StreamReader and FileReader
Hey all, I was directed here by Anne helpfully posting to public-script-coord and es-discuss. I would love a summary of what proposal is currently under discussion: is it [1]? Or maybe some form of [2]? [1]: https://rawgithub.com/tyoshino/stream/master/streams.html [2]: http://lists.w3.org/Archives/Public/public-webapps/2013AprJun/0727.html
RE: Overlap between StreamReader and FileReader
From: Anne van Kesteren [ann...@annevk.nl] Stream.prototype.readType takes an enumerated string value which is arraybuffer (default) or text. Stream.prototype.read returns a promise fulfilled with the type of value requested. I believe this is somewhat similar to how Node streams have settled. Their API is that you call `stream.setEncoding('utf-8')` and then calling `.read(n)` will return a string of at most n characters. By default, there is no encoding set, and calling `.read(n)` will return n bytes in a buffer. In this way, the encoding is a stateful aspect of the stream itself. I don't think there's a way to get around this, without ending up with dangling half-character bytes hanging around.
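In use, the Node pattern Domenic describes looks like this (getReadableStreamSomehow is a placeholder for any Node readable; the at-most-n-characters behavior is as described above):

```
// Node's stateful encoding: setEncoding flips the stream into string mode.
var stream = getReadableStreamSomehow();
stream.setEncoding('utf-8');   // from here on, read() returns strings
var text = stream.read(10);    // at most 10 characters, or null if not ready
// Without the setEncoding call, the same read(10) returns a Buffer of bytes.
```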
Re: Overlap between StreamReader and FileReader
On Wed, Jul 31, 2013 at 5:03 PM, Domenic Denicola dome...@domenicdenicola.com wrote: In this way, the encoding is a stateful aspect of the stream itself. I don't think there's a way to get around this, without ending up with dangling half-character bytes hanging around. It seems though that if you can change the way bytes are consumed while reading a stream you will end up with problematic scenarios. E.g. you consume 2 bytes of a 4-byte utf-8 sequence. Then switch to reading code points... Instantiating a ByteStream or TextStream in advance would address that. -- http://annevankesteren.nl/
RE: Overlap between StreamReader and FileReader
From: Anne van Kesteren [ann...@annevk.nl] It seems though that if you can change the way bytes are consumed while reading a stream you will end up with problematic scenarios. E.g. you consume 2 bytes of a 4-byte utf-8 sequence. Then switch to reading code points... Instantiating a ByteStream or TextStream in advance would address that. Yes, and I think I would actually prefer such an API honestly. But IIRC Jonas earlier wanted to be able to do both binary and text in the same stream (did he have a specific use case?), and presumably that motivated Node's design as well. I guess you can just say that if you're in binary mode, you should know what you're doing, and know precisely when is the correct time to switch to string mode. If you switch in the middle of a four-byte sequence, you presumably meant to do so, and deserve to get back the mangled characters that result. To make this work might require some kind of put the bytes back primitive, to avoid a situation where you read too far in binary mode and want to back up a bit before you engage string mode. I guess this is Node.js's [unshift][1]. It would be cool to avoid all this though and just read either bytes or strings, without allowing switching. (Maybe, feed the byte stream into a string decoder transform, and get back a string stream?) [1]: http://nodejs.org/api/stream.html#stream_readable_unshift_chunk
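A sketch of that put-the-bytes-back maneuver using Node's real unshift() and setEncoding(); the NUL-terminated header framing and handleHeader are invented for illustration:

```
// Read a binary header, unshift the surplus, then switch to text mode.
stream.on('readable', function onHeader() {
  var chunk = stream.read();
  if (chunk === null) return;                 // nothing buffered yet
  var end = chunk.indexOf(0);                 // look for the NUL terminator
  if (end === -1) { stream.unshift(chunk); return; } // keep waiting for it
  handleHeader(chunk.slice(0, end));
  stream.unshift(chunk.slice(end + 1));       // put the body bytes back
  stream.removeListener('readable', onHeader);
  stream.setEncoding('utf-8');                // body is consumed as text
});
```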
Re: Overlap between StreamReader and FileReader
On Wed, Jul 31, 2013 at 10:17 AM, Domenic Denicola dome...@domenicdenicola.com wrote: From: Anne van Kesteren [ann...@annevk.nl]

It seems though that if you can change the way bytes are consumed while reading a stream you will end up with problematic scenarios. E.g. you consume 2 bytes of a 4-byte utf-8 sequence. Then switch to reading code points... Instantiating a ByteStream or TextStream in advance would address that.

Yes, and I think I would actually prefer such an API honestly. But IIRC Jonas earlier wanted to be able to do both binary and text in the same stream (did he have a specific use case?), and presumably that motivated Node's design as well.

I don't have very concrete use-cases in mind. But basically consumption of any format that contains both textual and binary data. If we don't think the world contains enough such formats to worry about, then maybe my use case isn't strong enough. I think both pdf and various Microsoft document formats fall into this category though.

I guess you can just say that if you're in binary mode, you should know what you're doing, and know precisely when is the correct time to switch to string mode. If you switch in the middle of a four-byte sequence, you presumably meant to do so, and deserve to get back the mangled characters that result. To make this work might require some kind of put the bytes back primitive, to avoid a situation where you read too far in binary mode and want to back up a bit before you engage string mode. I guess this is Node.js's [unshift][1].

Note that the "read too far" issue isn't text-specific. When consuming any format which uses a terminator (null or any more complicated pattern) you will have to consume in minimal chunks, often byte-by-byte, to make sure you don't go past that terminator.

It would be cool to avoid all this though and just read either bytes or strings, without allowing switching. (Maybe, feed the byte stream into a string decoder transform, and get back a string stream?)

Being able to convert between text and binary streams does work well when the whole stream is either textual or binary. It's not clear to me how to do it if you are dealing with a stream that contains both. Though I'd be interested to see proposals. / Jonas
Re: Overlap between StreamReader and FileReader
I quickly read the thread, but it seems like this is exactly the issue I had doing [1]. The use case was just decoding utf-8 html chunked buffers and modifying the content on the fly to stream it somewhere else. It had to work inside browsers and with node (which as far as I know does not handle this case correctly, but I did not check the latest evolutions). The solution was [2], TextEncoder/Decoder with a super useful streaming option. [1] https://www.github.com/Ayms/node-Tor [2] http://code.google.com/p/stringencoding/

Regards

Aymeric

Le 31/07/2013 21:20, Jonas Sicking a écrit : On Wed, Jul 31, 2013 at 10:17 AM, Domenic Denicola dome...@domenicdenicola.com wrote: From: Anne van Kesteren [ann...@annevk.nl]

It seems though that if you can change the way bytes are consumed while reading a stream you will end up with problematic scenarios. E.g. you consume 2 bytes of a 4-byte utf-8 sequence. Then switch to reading code points... Instantiating a ByteStream or TextStream in advance would address that.

Yes, and I think I would actually prefer such an API honestly. But IIRC Jonas earlier wanted to be able to do both binary and text in the same stream (did he have a specific use case?), and presumably that motivated Node's design as well.

I don't have very concrete use-cases in mind. But basically consumption of any format that contains both textual and binary data. If we don't think the world contains enough such formats to worry about, then maybe my use case isn't strong enough. I think both pdf and various Microsoft document formats fall into this category though.

I guess you can just say that if you're in binary mode, you should know what you're doing, and know precisely when is the correct time to switch to string mode. If you switch in the middle of a four-byte sequence, you presumably meant to do so, and deserve to get back the mangled characters that result. To make this work might require some kind of put the bytes back primitive, to avoid a situation where you read too far in binary mode and want to back up a bit before you engage string mode. I guess this is Node.js's [unshift][1].

Note that the "read too far" issue isn't text-specific. When consuming any format which uses a terminator (null or any more complicated pattern) you will have to consume in minimal chunks, often byte-by-byte, to make sure you don't go past that terminator.

It would be cool to avoid all this though and just read either bytes or strings, without allowing switching. (Maybe, feed the byte stream into a string decoder transform, and get back a string stream?)

Being able to convert between text and binary streams does work well when the whole stream is either textual or binary. It's not clear to me how to do it if you are dealing with a stream that contains both. Though I'd be interested to see proposals. / Jonas

-- jCore Email : avi...@jcore.fr iAnonym : http://www.ianonym.com node-Tor : https://www.github.com/Ayms/node-Tor GitHub : https://www.github.com/Ayms Web : www.jcore.fr Extract Widget Mobile : www.extractwidget.com BlimpMe! : www.blimpme.com
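For reference, the streaming option Aymeric points to in [2] became the standard TextDecoder API. A sketch of the chunked utf-8 decoding use case he describes (incomingChunks is a placeholder for the chunk source):

```
// Decode utf-8 across arbitrary chunk boundaries without ever splitting
// a multi-byte sequence. This is the now-standard TextDecoder behavior.
const decoder = new TextDecoder('utf-8');
let html = '';
for (const chunk of incomingChunks) {
  html += decoder.decode(chunk, { stream: true });  // buffers partial chars
}
html += decoder.decode();   // final flush: emits anything still pending
```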
Re: Overlap between StreamReader and FileReader
Couldn't we simply let the Stream class have a content type, which could contain an encoding. Then when stream.readText is called, if there's an explicit encoding, it would use that encoding when converting to text. / Jonas

On Mon, Jul 29, 2013 at 6:38 AM, Takeshi Yoshino tyosh...@google.com wrote: On Thu, Jul 18, 2013 at 7:22 AM, Jonas Sicking jo...@sicking.cc wrote: On Wed, Jul 17, 2013 at 11:46 AM, Anne van Kesteren ann...@annevk.nl wrote: On Wed, Jul 17, 2013 at 11:05 AM, Jonas Sicking jo...@sicking.cc wrote:

What do you mean by such features? Are you saying that a Stream zip decompressor should be responsible for both decompressing as well as binary-text conversion? And thus output something other than a Stream?

I meant that for specialized processing you'd likely want more than just decoding. You mentioned HTML parsing which requires a fair amount more.

I don't think you want an HTML parser to do both decoding and parsing. That would result in a lot of code duplication in each component that deals with textual formats.

And if it's just decoding, we could extend TextEncoder/TextDecoder to work with Stream objects.

Sure, we can do that. The question is, what is the output from the TextDecoder if you pass it a Stream? A new TextStream type? Is that really better than adding the text-consuming functions to Stream?

We could introduce interfaces TextStream (readAsText) and BinaryStream (readAsArrayBuffer) just representing what type of data can be consumed from them. But for convenience, I'd like the output of XHR to have both. Stream should carry raw binary, charset and MIME, and present them in convenient form (methods) to the user. We can define convenience classes like this more generally:

- TextStreamWithOptionalTextEncoder
- BinaryStreamWithOptionalTextDecoder

There's either raw binary or text data behind it and it does decoding/encoding when necessary. What we're currently calling Stream and going to use for XHR is BinaryStreamWithOptionalTextDecoder. TextEncoder may be defined to accept TextStream and output a BinaryStream. TextDecoder may be defined to accept BinaryStream and output a TextStream.
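A sketch of what a readText() driven by a content type carried on the stream could look like, assuming the type attribute from Feras's proposal that Takeshi mentions later in the thread; the charset extraction is simplified and the whole API is hypothetical:

```
// Fall back to the stream's own charset when readText gets no encoding.
function charsetOf(stream) {
  // e.g. "text/html;charset=shift_jis" -> "shift_jis"; a real
  // implementation would use a proper MIME type parser.
  var m = /;\s*charset=([^;]+)/i.exec(stream.type || '');
  return m ? m[1].trim() : 'utf-8';
}
function readTextUsingStreamType(stream, size) {
  return stream.read(size).then(function(bytes) {
    return new TextDecoder(charsetOf(stream)).decode(bytes);
  });
}
```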
Re: Overlap between StreamReader and FileReader
On Mon, Jul 29, 2013 at 1:16 PM, Jonas Sicking jo...@sicking.cc wrote: Couldn't we simply let the Stream class have a content type, which could contain an encoding. Then when stream.readText is called, if there's an explicit encoding, it would use that encoding when converting to text. How about we use what XMLHttpRequest and WebSocket have? Stream.prototype.readType takes an enumerated string value which is arraybuffer (default) or text. Stream.prototype.read returns a promise fulfilled with the type of value requested. -- http://annevankesteren.nl/
Re: Overlap between StreamReader and FileReader
On Mon, Jul 29, 2013 at 3:20 PM, Anne van Kesteren ann...@annevk.nl wrote: On Mon, Jul 29, 2013 at 1:16 PM, Jonas Sicking jo...@sicking.cc wrote:

Couldn't we simply let the Stream class have a content type, which could contain an encoding. Then when stream.readText is called, if there's an explicit encoding, it would use that encoding when converting to text.

How about we use what XMLHttpRequest and WebSocket have? Stream.prototype.readType takes an enumerated string value which is arraybuffer (default) or text. Stream.prototype.read returns a promise fulfilled with the type of value requested.

I'm not sure that comparisons with XHR really work since XHR.responseType affects the parsing behavior, not the decoding behavior. And with WebSocket what you control isn't the result of an operation, but rather the contents of future events. So additional arguments or separate signatures aren't really an option there.

I still think that your proposal works. But I don't quite see the advantage of it. Seems like it simply breaks out one of the arguments from the read function and passes it through state. Is the problem you are trying to solve having shorter function names? / Jonas
Re: Overlap between StreamReader and FileReader
On Mon, Jul 29, 2013 at 4:13 PM, Jonas Sicking jo...@sicking.cc wrote: On Mon, Jul 29, 2013 at 3:20 PM, Anne van Kesteren ann...@annevk.nl wrote:

How about we use what XMLHttpRequest and WebSocket have? Stream.prototype.readType takes an enumerated string value which is arraybuffer (default) or text. Stream.prototype.read returns a promise fulfilled with the type of value requested.

I'm not sure that comparisons with XHR really work since XHR.responseType affects the parsing behavior, not the decoding behavior. And with WebSocket what you control isn't the result of an operation, but rather the contents of future events. So additional arguments or separate signatures aren't really an option there. I still think that your proposal works. But I don't quite see the advantage of it. Seems like it simply breaks out one of the arguments from the read function and passes it through state. Is the problem you are trying to solve having shorter function names?

I'm not a big fan of having mutually exclusive accessors for data; passing it as an argument could work too, but given that you want to read multiple times, that does not seem super convenient. -- http://annevankesteren.nl/
Re: Overlap between StreamReader and FileReader
On Mon, Jul 29, 2013 at 5:37 PM, Anne van Kesteren ann...@annevk.nl wrote: On Mon, Jul 29, 2013 at 4:13 PM, Jonas Sicking jo...@sicking.cc wrote: On Mon, Jul 29, 2013 at 3:20 PM, Anne van Kesteren ann...@annevk.nl wrote:

How about we use what XMLHttpRequest and WebSocket have? Stream.prototype.readType takes an enumerated string value which is arraybuffer (default) or text. Stream.prototype.read returns a promise fulfilled with the type of value requested.

I'm not sure that comparisons with XHR really work since XHR.responseType affects the parsing behavior, not the decoding behavior. And with WebSocket what you control isn't the result of an operation, but rather the contents of future events. So additional arguments or separate signatures aren't really an option there. I still think that your proposal works. But I don't quite see the advantage of it. Seems like it simply breaks out one of the arguments from the read function and passes it through state. Is the problem you are trying to solve having shorter function names?

I'm not a big fan of having mutually exclusive accessors for data; passing it as an argument could work too, but given that you want to read multiple times, that does not seem super convenient.

I'm not sure that there's anything mutually exclusive here? Other than that data that .read has consumed can't be consumed by .readText. But that's an effect of the fact that .read/.readText throw away the data after it has been consumed, rather than an effect of having two different ways of consuming data. I.e. .readText is as exclusive to .read as .read is to itself. / Jonas
Re: Overlap between StreamReader and FileReader
On Jul 29, 2013 7:53 PM, Takeshi Yoshino tyosh...@google.com wrote: On Tue, Jul 30, 2013 at 5:16 AM, Jonas Sicking jo...@sicking.cc wrote:

Couldn't we simply let the Stream class have a content type, which

That's what I meant. In Feras's proposal Stream has a type attribute. I copied it to my draft. read(As)Text would use it.

Sounds good to me.

could contain an encoding. Then when stream.readText is called, if there's an explicit encoding, it would use that encoding when

Do you think it should also have overrideMimeType like XHR?

I think that use case is rare enough that we can solve it by letting the author create a new Stream object, which presumably would allow specifying a type for that stream, and then feed that new stream the contents of the old stream. / Jonas
Re: Overlap between StreamReader and FileReader
On Wed, Jul 10, 2013 at 7:02 AM, Anne van Kesteren ann...@annevk.nl wrote: On Tue, Jul 2, 2013 at 12:21 AM, Takeshi Yoshino tyosh...@google.com wrote:

What I have in my mind is like this:

```
if (this.readyState == this.LOADING) {
  stream = xhr.response;
  // XHR has already written some data x0 to stream
  stream.read().progress(progressHandler);
}
...loop...
// XHR writes data x1 to stream
// XHR writes data x2 to stream
// XHR finishes writing to stream
```

progressHandler continues receiving data till EOF. For this read() call without maxSize, all of x0, x1 and x2 will be passed to progressHandler.

I see. I kinda thought that if you omitted size it would just give you everything in the stream's buffer and not everything until end-of-stream. If you just want to be notified about data as it comes in you can use read*Chunked() in my proposal. Polling data from the buffer seems less useful. Do we even need that? It seems just passing ArrayBuffer in and out could be sufficient for now?

As one of read()'s arguments? As for what it would return. Or do we have use cases where decoding to strings and/or Blobs is important?

Reading any format that contains textual data. I.e. things like HTML, OpenDocument, pdf, etc. While many of those are compressed, it seems likely that you could pass a stream through a decompressor which produces a decompressed stream. / Jonas
Re: Overlap between StreamReader and FileReader
On Tue, Jul 16, 2013 at 11:10 PM, Jonas Sicking jo...@sicking.cc wrote: Reading any format that contains textual data. I.e. things like HTML, OpenDocument, pdf, etc. While many of those are compressed, it seems likely that you could pass a stream through a decompressor which produces a decompressed stream. Yeah, extending APIs for such features to support streams seems better than adding support for all of them on Stream. Letting Stream just be a low-level primitive for a stream of bytes seems good enough. -- http://annevankesteren.nl/
Re: Overlap between StreamReader and FileReader
On Wed, Jul 17, 2013 at 10:47 AM, Anne van Kesteren ann...@annevk.nl wrote: On Tue, Jul 16, 2013 at 11:10 PM, Jonas Sicking jo...@sicking.cc wrote: Reading any format that contains textual data. I.e. things like HTML, OpenDocument, pdf, etc. While many of those are compressed, it seems likely that you could pass a stream through a decompressor which produces a decompressed stream. Yeah, extending APIs for such features to support streams seems better than adding support for all of them on Stream. Letting Stream just be a low-level primitive for a stream of bytes seems good enough. What do you mean by such features? Are you saying that a Stream zip decompressor should be responsible for both decompressing as well as binary-text conversion? And thus output something other than a Stream? / Jonas
Re: Overlap between StreamReader and FileReader
On Wed, Jul 17, 2013 at 11:05 AM, Jonas Sicking jo...@sicking.cc wrote: What do you mean by such features? Are you saying that a Stream zip decompressor should be responsible for both decompressing as well as binary-text conversion? And thus output something other than a Stream? I meant that for specialized processing you'd likely want more than just decoding. You mentioned HTML parsing which requires a fair amount more. And if it's just decoding, we could extend TextEncoder/TextDecoder to work with Stream objects. -- http://annevankesteren.nl/
Re: Overlap between StreamReader and FileReader
On Wed, Jul 17, 2013 at 11:46 AM, Anne van Kesteren ann...@annevk.nl wrote: On Wed, Jul 17, 2013 at 11:05 AM, Jonas Sicking jo...@sicking.cc wrote:

What do you mean by such features? Are you saying that a Stream zip decompressor should be responsible for both decompressing as well as binary-text conversion? And thus output something other than a Stream?

I meant that for specialized processing you'd likely want more than just decoding. You mentioned HTML parsing which requires a fair amount more.

I don't think you want an HTML parser to do both decoding and parsing. That would result in a lot of code duplication in each component that deals with textual formats.

And if it's just decoding, we could extend TextEncoder/TextDecoder to work with Stream objects.

Sure, we can do that. The question is, what is the output from the TextDecoder if you pass it a Stream? A new TextStream type? Is that really better than adding the text-consuming functions to Stream? / Jonas
Re: Overlap between StreamReader and FileReader
On Tue, Jul 2, 2013 at 12:21 AM, Takeshi Yoshino tyosh...@google.com wrote:

What I have in my mind is like this:

```
if (this.readyState == this.LOADING) {
  stream = xhr.response;
  // XHR has already written some data x0 to stream
  stream.read().progress(progressHandler);
}
...loop...
// XHR writes data x1 to stream
// XHR writes data x2 to stream
// XHR finishes writing to stream
```

progressHandler continues receiving data till EOF. For this read() call without maxSize, all of x0, x1 and x2 will be passed to progressHandler.

I see. I kinda thought that if you omitted size it would just give you everything in the stream's buffer and not everything until end-of-stream.

Do we even need that? It seems just passing ArrayBuffer in and out could be sufficient for now?

As one of read()'s arguments? As for what it would return. Or do we have use cases where decoding to strings and/or Blobs is important?

What's pending read resolvers?

When any error occurs the stream needs to reject pending promises. So, I prepared that list but I haven't written any text for error handling yet.

Okay. -- http://annevankesteren.nl/
Re: Overlap between StreamReader and FileReader
On Mon, Jul 1, 2013 at 9:03 AM, Takeshi Yoshino tyosh...@google.com wrote:

Moved to github. https://github.com/tyoshino/stream/blob/master/streams.html http://htmlpreview.github.io/?https://github.com/tyoshino/stream/blob/master/streams.html

Why would it be neutered if size is not given?

When size is not given, we need to mark it fully read by using something else. I changed to use read position == -1.

I'm not sure I follow. Isn't the maxSize argument optional so you can read all the data queued up thus far? It seems that should just work and not prevent more data queued in the future from being read from the stream. (Later on in the algorithm it seems this is acknowledged, but at that point the stream is already neutered.)

I think you need to define the stream buffer somewhat more explicitly so that only what you decide to read from the buffer ends up in the ArrayBuffer and newly queued data while that is happening is not.

Do you want the FIFO model to be emphasized?

It doesn't need emphasis; it just needs to be clear.

Probably defining Stream conceptually and defining read() (I don't think we should call it readAsArrayBuffer) in terms of those concepts

You mean that something similar to XHR's responseType is preferred?

Do we even need that? It seems just passing ArrayBuffer in and out could be sufficient for now?

What's pending read resolvers? -- http://annevankesteren.nl/
Re: Overlap between StreamReader and FileReader
On Wed, Jun 26, 2013 at 6:48 AM, Takeshi Yoshino tyosh...@google.com wrote:

I wrote a strawman spec for Stream.readAsArrayBuffer. Comment please.

Calling the stream-associated concepts the same as the variables in the algorithm is somewhat confusing (read_position vs read_position).

4. If called with the optional size, set the read_position of stream to read_position + size.
5. Otherwise, neuter the stream.

Why would it be neutered if size is not given?

7. Read data from stream from read_position up to size bytes, or all data if size is not specified.
8. As data from the stream becomes available, do the following,

I think you need to define the stream buffer somewhat more explicitly so that only what you decide to read from the buffer ends up in the ArrayBuffer and newly queued data while that is happening is not. Probably defining Stream conceptually and defining read() (I don't think we should call it readAsArrayBuffer) in terms of those concepts is better. E.g. similar to how http://url.spec.whatwg.org/ has a model and an API part. -- http://annevankesteren.nl/
Re: Overlap between StreamReader and FileReader
On Sat, May 18, 2013 at 1:38 PM, Jonas Sicking jo...@sicking.cc wrote:
> For File reading I would now instead do something like
>
> partial interface Blob {
>   AbortableProgressFuture<ArrayBuffer> readBinary(BlobReadParams);
>   AbortableProgressFuture<DOMString> readText(BlobReadTextParams);
>   Stream readStream(BlobReadParams);

I'd name it asStream. The readStream operation here isn't intended to do any read, i.e. moving data between buffers (like ArrayBufferView for ArrayBuffer), right? Or is it going to clone the Blob's contents and wrap them with the Stream interface, since we cannot discard the contents of a Blob and that would be inconsistent with the semantics (implication?) we're going to give to the Stream interface?
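A guess at the asStream/readStream semantics being asked about, i.e. wrapping without copying (nothing here is implemented anywhere; readStream, readBinaryChunked and digestUpdate are all assumed names):

var blob = new Blob(['0123456789abcdef']);
// Proposed: wrap the Blob's contents lazily, copying nothing up front.
var s = blob.readStream({ start: 0, length: blob.size });
s.readBinaryChunked().then(null, null, function(chunk) {
  digestUpdate(chunk);  // chunks produced on demand from the Blob's store
});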
Re: Overlap between StreamReader and FileReader
On Sat, May 18, 2013 at 1:56 PM, Jonas Sicking jo...@sicking.cc wrote:
> On Fri, May 17, 2013 at 9:38 PM, Jonas Sicking jo...@sicking.cc wrote:
> > For Stream reading, I think I would do something like the following:
> >
> > interface Stream {
> >   AbortableProgressFuture<ArrayBuffer> readBinary(optional unsigned long long size);
> >   AbortableProgressFuture<String> readText(optional unsigned long long size, optional DOMString encoding);
> >   AbortableProgressFuture<Blob> readBlob(optional unsigned long long size);
> >   ChunkedData readBinaryChunked(optional unsigned long long size);
> >   ChunkedData readTextChunked(optional unsigned long long size);
> > };
> >
> > interface ChunkedData : EventTarget {
> >   attribute EventHandler ondata;
> >   attribute EventHandler onload;
> >   attribute EventHandler onerror;
> > };
>
> Actually, we could even get rid of the ChunkedData interface and do
> something like
>
> interface Stream {
>   AbortableProgressFuture<ArrayBuffer> readBinary(optional unsigned long long size);
>   AbortableProgressFuture<String> readText(optional unsigned long long size, optional DOMString encoding);
>   AbortableProgressFuture<Blob> readBlob(optional unsigned long long size);
>   AbortableProgressFuture<void> readBinaryChunked(optional unsigned long long size);
>   AbortableProgressFuture<void> readTextChunked(optional unsigned long long size);
> };
>
> where the ProgressFutures returned from readBinaryChunked/readTextChunked
> deliver the data in the progress notifications only, and no data is
> delivered when the future is actually resolved. Though this might be
> abusing Futures a bit?

This is also a clearly read-only-once interface, like the onmessage() approach, because there's no attribute that accumulates the result value. The fact that the argument of the accept callback is void makes it clear, at least to me, that the value passed to the progress callback is not an accumulated result but each chunk separately.

As the state transitions of Stream would be simple enough to match Future, I think it's technically OK, and even better, to employ it rather than the readyState + callback approach. But is everyone fine with making it mandatory to get used to programming with Futures in order to use Stream?
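A usage sketch of the progress-only variant, assuming AbortableProgressFuture exposes a then(accept, reject, progress) signature (that shape is a guess; reportError and process are hypothetical):

stream.readBinaryChunked().then(
  function() { /* resolved: done, but no accumulated value to hand over */ },
  function(err) { reportError(err); },   // hypothetical error reporter
  function(chunk) { process(chunk); }    // each chunk is delivered here once
);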
Re: Overlap between StreamReader and FileReader
On Sat, May 18, 2013 at 5:56 AM, Jonas Sicking jo...@sicking.cc wrote:
> where the ProgressFutures returned from readBinaryChunked/readTextChunked
> deliver the data in the progress notifications only, and no data is
> delivered when the future is actually resolved. Though this might be
> abusing Futures a bit?

Yeah, futures represent a value. This is an event stream (that does not keep track of history).

-- 
http://annevankesteren.nl/
Re: Overlap between StreamReader and FileReader
On Sat, May 18, 2013 at 7:36 AM, Anne van Kesteren ann...@annevk.nl wrote:
> On Sat, May 18, 2013 at 5:56 AM, Jonas Sicking jo...@sicking.cc wrote:
> > where the ProgressFutures returned from readBinaryChunked/readTextChunked
> > deliver the data in the progress notifications only, and no data is
> > delivered when the future is actually resolved. Though this might be
> > abusing Futures a bit?
>
> Yeah, futures represent a value. This is an event stream (that does not
> keep track of history).

It's not exactly an event stream, since the exact events aren't what matters here. I.e. you'll get different events in different implementations, and there are no guarantees that the events themselves will be meaningful. But yeah, I agree it's not representing a value, and so it's an abuse of Future's semantics.

/ Jonas
Re: Overlap between StreamReader and FileReader
On Thu, May 16, 2013 at 10:14 PM, Takeshi Yoshino tyosh...@google.com wrote:
> I skimmed the thread before starting this and saw that you were pointing
> out some issues, but didn't think you were opposing so much.

Well, yes. I removed integration from XMLHttpRequest a while back too.

> Let me check requirements.
> d) The I/O API needs to work with synchronous XHR.

I'm not sure this is a requirement. In particular in light of http://infrequently.org/2013/05/the-case-against-synchronous-worker-apis-2/ and synchronous being worker-only, it's not entirely clear to me this needs to be a requirement from the get-go.

> e) Resources for already processed data should be able to be released
> explicitly, as the user instructs.

Can't this happen transparently?

> g) The I/O API should allow for skipping unnecessary data without
> creating a new object for that.

This would be equivalent to reading and discarding?

> Not a requirement:
> h) Some people wanted Stream to behave not like an object that stores the
> data, but like a kind of dam put between the response attribute and XHR's
> internal buffer (and network stack), expecting that XHR doesn't consume
> data from the network until a read operation is invoked on the Stream
> object (i.e. Stream controls data flow in addition to callback invocation
> timing). But it's no longer considered to be a requirement.

I'm not sure what this means. It sounds like something that indeed should be transparent from an API point-of-view, but it's hard to tell.

We also need to decide whether a stream supports multiple readers, or whether you need to explicitly clone a stream somehow. And as far as the API goes, we should study existing libraries.

-- 
http://annevankesteren.nl/
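If (g) does reduce to reading and discarding, a skip helper could be layered on the chunked read (a sketch against the hypothetical API discussed in this thread):

function skipBytes(stream, n) {
  return stream.readBinaryChunked(n).then(
    function() { /* n bytes consumed and dropped */ },
    null,
    function(chunk) { /* intentionally ignore each chunk */ }
  );
}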
Re: Overlap between StreamReader and FileReader
On Thu, May 16, 2013 at 8:26 PM, Feras Moussa feras.mou...@hotmail.com wrote:
> Can you please go into a bit more detail? I've read through the thread,
> and it mostly focuses on the details of how a Stream is received from XHR
> and what behaviors can be expected - it only lightly touches on how you
> can operate on a stream after it is received.

The main problem is that Stream per the Streams API is not what you expect from an IO stream, but is more what Blob should've been (a Blob without synchronous size). What we want, I think, is a real IO stream. Whether we also need a Blob without synchronous size is less clear to me.

> I do agree the API should allow for scenarios where data can be
> discarded, given that is an advantage of a Stream over a Blob.

It does not seem to do that currently though. It's also not clear to me that we want to allow multiple readers by default.

> That said, Anne, what is your suggestion for how Streams can be consumed?

I don't have one yet.

-- 
http://annevankesteren.nl/
Re: Overlap between StreamReader and FileReader
Sorry, I just took over this work and so was misunderstanding some points in the Streams API spec.

On Fri, May 17, 2013 at 6:09 PM, Anne van Kesteren ann...@annevk.nl wrote:
> On Thu, May 16, 2013 at 10:14 PM, Takeshi Yoshino tyosh...@google.com wrote:
> > I skimmed the thread before starting this and saw that you were pointing
> > out some issues, but didn't think you were opposing so much.
>
> Well, yes. I removed integration from XMLHttpRequest a while back too.
>
> > Let me check requirements.
> > d) The I/O API needs to work with synchronous XHR.
>
> I'm not sure this is a requirement. In particular in light of
> http://infrequently.org/2013/05/the-case-against-synchronous-worker-apis-2/
> and synchronous being worker-only, it's not entirely clear to me this
> needs to be a requirement from the get-go.
>
> > e) Resources for already processed data should be able to be released
> > explicitly, as the user instructs.
>
> Can't this happen transparently?

Yes. The "read data is automatically released" model is simple and good. I thought the spec was clear about this, but sorry, it isn't. In the spec we should say that StreamReader invalidates consumed data in the Stream, and the buffer for the invalidated bytes will be released at that point. Right?

> > g) The I/O API should allow for skipping unnecessary data without
> > creating a new object for that.
>
> This would be equivalent to reading and discarding?

I wanted to understand clearly what you meant by "discard" in your posts. I wondered if you were suggesting that we have some method to skip incoming data without creating any object holding the received data. I.e. something like

s.skip(10);
s.readFrom(10);

not like

var useless_data_at_head_remaining = 256;
ondata = function(evt) {
  var bytes_received = evt.data.size();
  if (useless_data_at_head_remaining > bytes_received) {
    useless_data_at_head_remaining -= bytes_received;
    return;
  }
  processUsefulData(evt.data.slice(useless_data_at_head_remaining));
  useless_data_at_head_remaining = 0;
};

If you meant the latter, I'm OK. I'd also call the latter "reading and discarding".

> > Not a requirement:
> > h) Some people wanted Stream to behave not like an object that stores
> > the data, but like a kind of dam put between the response attribute and
> > XHR's internal buffer (and network stack), expecting that XHR doesn't
> > consume data from the network until a read operation is invoked on the
> > Stream object (i.e. Stream controls data flow in addition to callback
> > invocation timing). But it's no longer considered to be a requirement.
>
> I'm not sure what this means. It sounds like something that indeed should
> be transparent from an API point-of-view, but it's hard to tell.

In the thread, Glenn was discussing what's a consumer and what's a producer, IIRC. I supposed that the idea behind Stream is providing a flow control interface for XHR's internal buffer. When the internal buffer is full, XHR stops reading data from the network (e.g. a BSD socket). The buffer will be drained when, and only when, a read operation is made on the Stream object.

Stream has infinite length, but shouldn't have infinite capacity. It'll swell up if the consumer (e.g. a media stream?) is slow. Of course, browsers would set some limit, but that should be well discussed in the spec. Unless the limit is visible to scripts, they cannot know whether they can watch only the load event, or need to handle the progress event and consume arrived data progressively to process all data (see the sketch after this message).

> We also need to decide whether a stream supports multiple readers, or
> whether you need to explicitly clone a stream somehow. And as far as the
> API goes, we should study existing libraries.

What use cases do you have in your mind?
Your example in the thread was passing one to <video> but also accessing it manually using StreamReader. I think it's unknown at what timing and in what amounts <video> consumes data from the Stream, so it's really hard for the script to make such coordination successful. Are you thinking of a use case like mixing chat data and video contents in the same HTTP response body?
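A sketch of the producer-side throttling described above; bufferedAmount, ondrain, pause() and resume() are all invented names for illustration:

var HIGH_WATER_MARK = 1024 * 1024;   // hypothetical per-stream capacity
function maybeThrottle(source, stream) {
  if (stream.bufferedAmount >= HIGH_WATER_MARK) {  // invented attribute
    source.pause();                  // stop pulling from the network
    stream.ondrain = function() {    // invented event: reader caught up
      source.resume();
    };
  }
}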
Re: Overlap between StreamReader and FileReader
On Fri, May 17, 2013 at 12:09 PM, Takeshi Yoshino tyosh...@google.com wrote:
> I thought the spec was clear about this, but sorry, it isn't. In the spec
> we should say that StreamReader invalidates consumed data in the Stream,
> and the buffer for the invalidated bytes will be released at that point.
> Right?

I'm glad we're all getting on the same page now. I think there might be use cases for a Blob without size (i.e. where you do not discard the data after consuming), which is what Stream seems to be today, but I'm not sure we should call that Stream. And I think for XMLHttpRequest at least we want an API where data can be discarded once processed, so you do not have to keep multi-megabyte sound files on disk if all you want is to provide a (potentially post-processed) live stream.

> I wanted to understand clearly what you meant by "discard" in your posts.
> I wondered if you were suggesting that we have some method to skip
> incoming data without creating any object holding the received data. I.e.
> something like
>
> s.skip(10);
> s.readFrom(10);
>
> not like
>
> var useless_data_at_head_remaining = 256;
> ondata = function(evt) {
>   var bytes_received = evt.data.size();
>   if (useless_data_at_head_remaining > bytes_received) {
>     useless_data_at_head_remaining -= bytes_received;
>     return;
>   }
>   processUsefulData(evt.data.slice(useless_data_at_head_remaining));
>   useless_data_at_head_remaining = 0;
> };
>
> If you meant the latter, I'm OK. I'd also call the latter "reading and
> discarding".

Yeah, that seems about right.

> What use cases do you have in your mind? Your example in the thread was
> passing one to <video> but also accessing it manually using StreamReader.
> I think it's unknown at what timing and in what amounts <video> consumes
> data from the Stream, so it's really hard for the script to make such
> coordination successful. Are you thinking of a use case like mixing chat
> data and video contents in the same HTTP response body?

I haven't really thought about what I'd use it for, but I looked at e.g. Dart and it seems to have a concept of broadcasted streams. Maybe analyze the incoming bits in one function, and in another you'd process the incoming data and do something with it. Above all though, it needs to be clear what happens; and for IO streams where you do not want to keep all the data around (e.g. unlike the current Streams API), it's a question that needs answering.

-- 
http://annevankesteren.nl/
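A tee along the lines discussed (and of Dart's broadcast streams) might look like this; tee() and the sink shape are invented for illustration, and only data still unread at the time of the split gets duplicated:

function tee(stream, sinkA, sinkB) {
  var chunked = stream.readBinaryChunked();  // drain the source exactly once
  chunked.ondata = function(evt) {
    sinkA.ondata(evt.data);                  // same chunk to both consumers
    sinkB.ondata(evt.data);
  };
  chunked.onload = function() {
    sinkA.onclose();
    sinkB.onclose();
  };
}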
Re: Overlap between StreamReader and FileReader
On Fri, May 17, 2013 at 6:15 PM, Anne van Kesteren ann...@annevk.nl wrote:
> The main problem is that Stream per the Streams API is not what you expect
> from an IO stream, but is more what Blob should've been (a Blob without
> synchronous size). What we want, I think, is a real IO stream. Whether we
> also need a Blob without synchronous size is less clear to me.

Forgetting the File API completely, for example... how about a simple socket-like interface?

// Downloading big data

var FOO = 'f', BAR = 'b';  // example one-byte type codes
var remaining;
var type = null;
var payload = '';

// payloadSize(type) maps a type code to its payload length (defined elsewhere).
function processData(data) {
  var offset = 0;
  while (offset < data.length) {
    if (!type) {
      type = data.substr(offset, 1);
      offset += 1;
      remaining = payloadSize(type);
    } else if (remaining > 0) {
      var consume = Math.min(remaining, data.length - offset);
      payload += data.substr(offset, consume);
      offset += consume;
      remaining -= consume;
    } else if (remaining == 0) {
      if (type == FOO) {
        foo(payload);
      } else if (type == BAR) {
        bar(payload);
      }
      type = null;
      payload = '';
    }
  }
}

var client = new XMLHttpRequest();
client.onreadystatechange = function() {
  if (this.readyState == this.LOADING) {
    var responseStream = this.response;
    responseStream.setBufferSize(1024);
    responseStream.ondata = function(evt) {
      processData(evt.data);
      // Consumed data will be invalidated, and memory used for the data
      // will be released.
    };
    responseStream.onclose = function() {
      // Reached end of response body ...
    };
    responseStream.start();
    // Now responseStream starts forwarding events happening on XHR to its
    // callbacks.
  }
};
client.open('GET', '/foobar');
client.responseType = 'stream';
client.send();

// Uploading big data

var client = new XMLHttpRequest();
client.open('POST', '/foobar');
var requestStream = new WriteStream(1024);
var producer = new Producer();
producer.ondata = function(evt) {
  requestStream.send(evt.data);
};
producer.onclose = function() {
  requestStream.close();
};
client.send(requestStream);
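The Producer in the upload half is left undefined above; a purely illustrative stand-in (timer-driven dummy chunks, nothing more) could be:

function Producer() {
  var self = this;
  var sent = 0;
  var timer = setInterval(function() {
    self.ondata({ data: new Uint8Array(1024) });  // fixed-size dummy chunk
    if (++sent === 10) {
      clearInterval(timer);
      self.onclose();                             // signal end of input
    }
  }, 100);
}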
Re: Overlap between StreamReader and FileReader
I figured I should chime in with some ideas of my own because, well, why not :-)

First off, I definitely think the semantic model of a Stream shouldn't be "a Blob without a size", but rather "a Blob without a size that you can only read from once". I.e. the implementation should be able to discard data as it passes it to a reader.

That said, many Stream APIs support the concept of a "T". This enables splitting a Stream into two Streams, which enables having multiple consumers of the same data source. However, when a T is created, it only returns the data that has so far been unread from the original Stream. It does not return the data from the beginning of the stream, since that would prevent streams from discarding data as soon as it has been read.

If we are going to have a StreamReader API, then I don't think we should model it after FileReader. FileReader unfortunately followed the model of XMLHttpRequest (based on requests from several developers); however, this is a pretty terrible API, and I believe we can do much better. And obviously we should do something based on Futures :-)

For File reading I would now instead do something like

partial interface Blob {
  AbortableProgressFuture<ArrayBuffer> readBinary(BlobReadParams);
  AbortableProgressFuture<DOMString> readText(BlobReadTextParams);
  Stream readStream(BlobReadParams);
};

dictionary BlobReadParams {
  long long start;
  long long length;
};

dictionary BlobReadTextParams : BlobReadParams {
  DOMString encoding;
};

For Stream reading, I think I would do something like the following:

interface Stream {
  AbortableProgressFuture<ArrayBuffer> readBinary(optional unsigned long long size);
  AbortableProgressFuture<String> readText(optional unsigned long long size, optional DOMString encoding);
  AbortableProgressFuture<Blob> readBlob(optional unsigned long long size);
  ChunkedData readBinaryChunked(optional unsigned long long size);
  ChunkedData readTextChunked(optional unsigned long long size);
};

interface ChunkedData : EventTarget {
  attribute EventHandler ondata;
  attribute EventHandler onload;
  attribute EventHandler onerror;
};

For all of the above functions, if a size is not passed, the rest of the Stream is read. The ChunkedData interface allows incremental reading of a stream. I.e. as soon as there is data available, a "data" event is fired on the ChunkedData object, which contains the data since the last "data" event fired. Once we've reached the end of the stream, or the requested size, the "load" event is fired on the ChunkedData object.

So the read* functions allow a consumer to pull data, whereas the read*Chunked functions allow consumers to have the data pushed at them. There are also other potential functions we could add which allow hybrids, but that seems overly complex for now.

Other functions we could add are peekText and peekBinary, which allow looking into the stream to determine if you're able to consume the data that's there, or if you should pass the Stream to some other consumer.

We might also want to add an "eof" flag to the Stream interface, as well as an event which is fired when the end of the stream is reached (or should that be modeled using a Future?)

/ Jonas

On Fri, May 17, 2013 at 5:02 AM, Takeshi Yoshino tyosh...@google.com wrote:
> On Fri, May 17, 2013 at 6:15 PM, Anne van Kesteren ann...@annevk.nl wrote:
> > The main problem is that Stream per the Streams API is not what you
> > expect from an IO stream, but is more what Blob should've been (a Blob
> > without synchronous size). What we want, I think, is a real IO stream.
> > Whether we also need a Blob without synchronous size is less clear to me.
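To make the pull/push distinction concrete, a sketch against the proposed interface (assuming the futures chain like promises; handleMessage, appendText and documentComplete are hypothetical):

// Pull: read a 4-byte length prefix, then exactly that many body bytes.
stream.readBinary(4).then(function(header) {
  var len = new DataView(header).getUint32(0);
  return stream.readBinary(len);
}).then(function(body) {
  handleMessage(body);
});

// Push: let ChunkedData hand over text as it arrives.
var chunked = stream.readTextChunked();
chunked.ondata = function(evt) { appendText(evt.data); };
chunked.onload = function() { documentComplete(); };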
Re: Overlap between StreamReader and FileReader
On Fri, May 17, 2013 at 9:38 PM, Jonas Sicking jo...@sicking.cc wrote:
> For Stream reading, I think I would do something like the following:
>
> interface Stream {
>   AbortableProgressFuture<ArrayBuffer> readBinary(optional unsigned long long size);
>   AbortableProgressFuture<String> readText(optional unsigned long long size, optional DOMString encoding);
>   AbortableProgressFuture<Blob> readBlob(optional unsigned long long size);
>   ChunkedData readBinaryChunked(optional unsigned long long size);
>   ChunkedData readTextChunked(optional unsigned long long size);
> };
>
> interface ChunkedData : EventTarget {
>   attribute EventHandler ondata;
>   attribute EventHandler onload;
>   attribute EventHandler onerror;
> };

Actually, we could even get rid of the ChunkedData interface and do something like

interface Stream {
  AbortableProgressFuture<ArrayBuffer> readBinary(optional unsigned long long size);
  AbortableProgressFuture<String> readText(optional unsigned long long size, optional DOMString encoding);
  AbortableProgressFuture<Blob> readBlob(optional unsigned long long size);
  AbortableProgressFuture<void> readBinaryChunked(optional unsigned long long size);
  AbortableProgressFuture<void> readTextChunked(optional unsigned long long size);
};

where the ProgressFutures returned from readBinaryChunked/readTextChunked deliver the data in the progress notifications only, and no data is delivered when the future is actually resolved. Though this might be abusing Futures a bit?

/ Jonas
Re: Overlap between StreamReader and FileReader
On Thu, May 16, 2013 at 5:58 PM, Takeshi Yoshino tyosh...@google.com wrote:
> StreamReader proposed in the Streams API spec is almost the same as
> FileReader. By adding the maxSize argument to the readAs methods (new
> methods, or just adding it to the existing methods as an optional
> argument) and adding the readAsBlob method, FileReader can cover all that
> StreamReader provides. Has this already been discussed here? I heard that
> some people who had this concern discussed it briefly and were worrying
> about derailing File API standardization. We're planning to implement it
> on Chromium/Blink shortly.

The Streams API https://dvcs.w3.org/hg/streams-api/raw-file/tip/Overview.htm is no good as far as I can tell. We need something else for IO. (See various threads on this list by me.) Alex will tell you the same, so I doubt it'd get through Blink API review.

-- 
http://annevankesteren.nl/
RE: Overlap between StreamReader and FileReader
From: annevankeste...@gmail.com [mailto:annevankeste...@gmail.com]
> On Thu, May 16, 2013 at 5:58 PM, Takeshi Yoshino tyosh...@google.com wrote:
> > StreamReader proposed in the Streams API spec is almost the same as
> > FileReader. By adding the maxSize argument to the readAs methods (new
> > methods, or just adding it to the existing methods as an optional
> > argument) and adding the readAsBlob method, FileReader can cover all
> > that StreamReader provides. Has this already been discussed here? I
> > heard that some people who had this concern discussed it briefly and
> > were worrying about derailing File API standardization. We're planning
> > to implement it on Chromium/Blink shortly.
>
> The Streams API
> https://dvcs.w3.org/hg/streams-api/raw-file/tip/Overview.htm is no good
> as far as I can tell. We need something else for IO. (See various threads
> on this list by me.) Alex will tell you the same, so I doubt it'd get
> through Blink API review.

Since we have Streams implemented to some degree, I'd love to hear suggestions to improve it relative to IO. Anne, can you summarize the points you've made on the other various threads?
RE: Overlap between StreamReader and FileReader
From: annevankeste...@gmail.com [mailto:annevankeste...@gmail.com]
> On Thu, May 16, 2013 at 6:31 PM, Travis Leithead
> travis.leith...@microsoft.com wrote:
> > Since we have Streams implemented to some degree, I'd love to hear
> > suggestions to improve it relative to IO. Anne, can you summarize the
> > points you've made on the other various threads?
>
> I recommend reading through
> http://lists.w3.org/Archives/Public/public-webapps/2013JanMar/thread.html#msg569
>
> Problems:
>
> * Too much complexity for being a Blob without synchronous size.
> * The API is bad. The API for File is bad too, but we cannot change it;
>   this, however, is new. And I think we really want an IO API that's not
>   about incremental reads, but can actively discard incoming data once
>   it's processed.

Thanks, I'll review the threads and think about this a bit more.
RE: Overlap between StreamReader and FileReader
Can you please go into a bit more detail? I've read through the thread, and it mostly focuses on the details of how a Stream is received from XHR and what behaviors can be expected - it only lightly touches on how you can operate on a stream after it is received.

The StreamReader by design mimics the FileReader, in order to give a consistent experience to developers. If we agree the FileReader has some flaws and we want to take the opportunity to address them with StreamReader, or an alternative, then I think that is reasonable.

I do agree the API should allow for scenarios where data can be discarded, given that is an advantage of a Stream over a Blob.

That said, Anne, what is your suggestion for how Streams can be consumed?

Also, apologies for being a bit late to the conversation - I missed the conversations of the past month. I'm now hoping to solicit more feedback and update the Streams spec accordingly.

Date: Thu, 16 May 2013 18:41:21 +0100
From: ann...@annevk.nl
To: travis.leith...@microsoft.com
CC: tyosh...@google.com; slightly...@google.com; public-webapps@w3.org
Subject: Re: Overlap between StreamReader and FileReader

> On Thu, May 16, 2013 at 6:31 PM, Travis Leithead
> travis.leith...@microsoft.com wrote:
> > Since we have Streams implemented to some degree, I'd love to hear
> > suggestions to improve it relative to IO. Anne, can you summarize the
> > points you've made on the other various threads?
>
> I recommend reading through
> http://lists.w3.org/Archives/Public/public-webapps/2013JanMar/thread.html#msg569
>
> Problems:
>
> * Too much complexity for being a Blob without synchronous size.
> * The API is bad. The API for File is bad too, but we cannot change it;
>   this, however, is new. And I think we really want an IO API that's not
>   about incremental reads, but can actively discard incoming data once
>   it's processed.
>
> -- 
> http://annevankesteren.nl/
Re: Overlap between StreamReader and FileReader
I skimmed the thread before starting this and saw that you were pointing out some issues, but didn't think you were opposing so much.

Let me check requirements.

a) We don't want to introduce a completely new object for streaming HTTP read/write; we'll realize it by adding some extension to XHR.
b) The point connecting the I/O API and XHR should be only the send() method argument and the xhr.response attribute, if possible.
c) The semantics (attribute X is valid when the state is ..., etc.) should be kept the same as in the other modes.
d) The I/O API needs to work with synchronous XHR.
e) Resources for already processed data should be able to be released explicitly, as the user instructs.
f) Reading with a maxSize argument (don't read too much).
g) The I/O API should allow for skipping unnecessary data without creating a new object for that.

Not a requirement:
h) Some people wanted Stream to behave not like an object that stores the data, but like a kind of dam put between the response attribute and XHR's internal buffer (and network stack), expecting that XHR doesn't consume data from the network until a read operation is invoked on the Stream object (i.e. Stream controls data flow in addition to callback invocation timing). But it's no longer considered to be a requirement.

i) Reading with a size argument (invoke the callback only when data of the specified amount is ready; only data of the specified size at the head of the stream is passed to the handler).

On Fri, May 17, 2013 at 2:41 AM, Anne van Kesteren ann...@annevk.nl wrote:
> On Thu, May 16, 2013 at 6:31 PM, Travis Leithead
> travis.leith...@microsoft.com wrote:
> > Since we have Streams implemented to some degree, I'd love to hear
> > suggestions to improve it relative to IO. Anne, can you summarize the
> > points you've made on the other various threads?
>
> I recommend reading through
> http://lists.w3.org/Archives/Public/public-webapps/2013JanMar/thread.html#msg569
>
> Problems:
>
> * Too much complexity for being a Blob without synchronous size.
> * The API is bad. The API for File is bad too, but we cannot change it;
>   this, however, is new. And I think we really want an IO API that's not
>   about incremental reads, but can actively discard incoming data once
>   it's processed.
>
> -- 
> http://annevankesteren.nl/
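Requirements (f) and (i) differ only in when the callback may fire; a sketch with an invented options-object form of read() (sniffContentType and parseFixedSizeRecord are hypothetical):

// (f) maxSize: resolve as soon as anything is buffered, at most 256 bytes.
stream.read({ maxSize: 256 }).then(function(chunk) {
  sniffContentType(chunk);
});

// (i) size: resolve only once a full 256 bytes have been buffered.
stream.read({ size: 256 }).then(function(block) {
  parseFixedSizeRecord(block);
});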