Re: Overlap between StreamReader and FileReader
The idea did not come from mimicking WebRTC:

- pause/unpause: insert a pause marker in the stream and stop processing data when it is reached (but don't close the operation, see below); buffer the data that keeps coming in, and restart from the pause point on unpause. Use case: flow control. When the flow-control window becomes empty, wait for a signal from the receiver to reinitialize the window and restart.

- stop/resume: different from close. stop inserts a specific eof-stop marker in the stream; the API closes the operation when it receives it and buffers subsequent data, and resume restarts the operation in the state it was in before receiving eof-stop. It's trickier; the use case is the one I gave before: a progressive hash, i.e. close a hash and resume it from the state it was in before closing it. The feature has been requested several times for node, for example.

Whether it's implementable, I don't know, but I don't see why it could not be; the use cases are real (mine, but I am not the only one).

Regards,

Aymeric

On 30/10/2013 12:49, Takeshi Yoshino wrote:
...snip...
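For concreteness, a minimal sketch of the pause/unpause idea, assuming a hypothetical Stream exposing pause()/unpause() and an ondata handler, and an abstract receiver object carrying the window-update signal (all names illustrative, not from any proposal in this thread):

  // Hypothetical API sketch: window-based flow control with pause/unpause.
  function windowedConsume(stream, receiver, process, windowSize) {
    var pending = 0;
    stream.ondata = function(chunk) {
      pending += chunk.byteLength;
      process(chunk);
      if (pending >= windowSize) {
        stream.pause();                 // window exhausted: buffer upstream data
        receiver.requestWindowUpdate(); // ask the peer to reopen the window
      }
    };
    receiver.onwindowupdate = function() {
      pending = 0;
      stream.unpause();                 // restart from the pause point
    };
  }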
Re: Overlap between StreamReader and FileReader
On Wed, Oct 23, 2013 at 11:42 PM, Aymeric Vitte vitteayme...@gmail.com wrote:

> Your filter idea seems to be equivalent to the createStream that I suggested some time ago (like node), what about:
>
>   var encryptionPromise = crypto.subtle.encrypt(aesAlgorithmEncrypt, aesKey, sourceStream).createStream();
>
> So you don't need to modify the APIs where you cannot specify the responseType.
> I was thinking of adding stop/resume and pause/unpause:
> - stop: insert eof in the stream

close() does this.

> Example: finalize the hash when eof is received
> - resume: restart from where the stream stopped
> Example: restart the hash from the state the operation was in before receiving eof (related to Issue22 in WebCrypto, which was closed without any solution; might imply cloning the state of the operation)

Should it really be a part of the Streams API? How about just making the filter (not the Stream itself) returned by WebCrypto reusable, and adding some method to recycle it?

> - pause: pause the stream, do not send eof

Sorry, what will be paused? Output?

> - unpause: restart the stream
> And flow control should be back and explicit. I am not sure right now how to define it, but I think it's impossible for a js app to do precise flow control, and for existing APIs like WebSockets it's not easy to control the flow and avoid overloading the UA in some situations.

...snip...
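For the "reusable filter" alternative floated above, a rough sketch; createDigestFilter(), state() and the second constructor argument are purely illustrative, and nothing like this exists in WebCrypto:

  // Illustrative only: a recyclable digest filter that can be snapshotted
  // and resumed, instead of putting stop/resume markers in the stream itself.
  var hashFilter = crypto.subtle.createDigestFilter({ name: "SHA-256" });
  var hashPromise = hashFilter.digest();   // start the operation
  sourceStream.pipe(hashFilter);
  // Snapshot the internal state before finalizing...
  var saved = hashFilter.state();
  hashFilter.close();                      // hashPromise resolves with the digest
  // ...and later resume a new filter from that state with more data.
  var resumed = crypto.subtle.createDigestFilter({ name: "SHA-256" }, saved);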
Re: Overlap between StreamReader and FileReader
On Wed, Oct 30, 2013 at 8:14 PM, Takeshi Yoshino tyosh...@google.com wrote:

> On Wed, Oct 23, 2013 at 11:42 PM, Aymeric Vitte vitteayme...@gmail.com wrote:
>> - pause: pause the stream, do not send eof
> Sorry, what will be paused? Output?

http://lists.w3.org/Archives/Public/public-webrtc/2013Oct/0059.html
http://www.w3.org/2011/04/webrtc/wiki/Transport_Control#Pause.2Fresume

So, you're suggesting that we make Stream a convenient point where we can dam up the data flow, and skip adding pause methods for data production and consumption to the producer/consumer APIs? I.e., we make it possible to prevent data queued in a Stream from being read. This typically means asynchronously suspending an ongoing pipe() or read() call on the Stream made with no argument or a very large argument.

> - unpause: restart the stream
> And flow control should be back and explicit. I am not sure right now how to define it, but I think it's impossible for a js app to do precise flow control, and for existing APIs like WebSockets it's not easy to control the flow and avoid overloading the UA in some situations.
Re: Overlap between StreamReader and FileReader
Your filter idea seems to be equivalent to the createStream that I suggested some time ago (like node), what about:

  var encryptionPromise = crypto.subtle.encrypt(aesAlgorithmEncrypt, aesKey, sourceStream).createStream();

So you don't need to modify the APIs where you cannot specify the responseType.

I was thinking of adding stop/resume and pause/unpause:

- stop: insert eof in the stream
  Example: finalize the hash when eof is received
- resume: restart from where the stream stopped
  Example: restart the hash from the state the operation was in before receiving eof (related to Issue22 in WebCrypto, which was closed without any solution; might imply cloning the state of the operation)
- pause: pause the stream, do not send eof
- unpause: restart the stream

And flow control should be back and explicit. I am not sure right now how to define it, but I think it's impossible for a js app to do precise flow control, and for existing APIs like WebSockets it's not easy to control the flow and avoid overloading the UA in some situations.

Regards,

Aymeric

On 21/10/2013 13:14, Takeshi Yoshino wrote:
...snip...
Re: Overlap between StreamReader and FileReader
Sorry for the blank of ~2 weeks.

On Fri, Oct 4, 2013 at 5:57 PM, Aymeric Vitte vitteayme...@gmail.com wrote:

> I am still not very familiar with promises, but if I take your preceding example:
>
>   var sourceStream = xhr.response;
>   var resultStream = new Stream();
>   var fileWritingPromise = fileWriter.write(resultStream);
>   var encryptionPromise = crypto.subtle.encrypt(aesAlgorithmEncrypt, aesKey, sourceStream, resultStream);
>   Promise.all(fileWritingPromise, encryptionPromise).then( ... );

I made a mistake. The argument of Promise.all should be an Array. So, [fileWritingPromise, encryptionPromise].

> shouldn't it be more something like:
>
>   var sourceStream = xhr.response;
>   var encryptionPromise = crypto.subtle.encrypt(aesAlgorithmEncrypt, aesKey);
>   var resultStream = sourceStream.pipe(encryptionPromise);
>   var fileWritingPromise = fileWriter.write(resultStream);
>   Promise.all(fileWritingPromise, encryptionPromise).then( ... );

Promises just tell the user about completion of each operation, with some value indicating the result of the operation. They are not destinations for data. Do you think it's good to create objects representing each encrypt operation? So, an object called a filter is introduced, and the code would be like:

  var pipeToFilterPromise;
  var encryptionFilter;
  var fileWriter;

  xhr.onreadystatechange = function() {
    ...
    } else if (this.readyState == this.LOADING) {
      if (this.status != 200) {
        ...
      }

      var sourceStream = xhr.response;

      encryptionFilter = crypto.subtle.createEncryptionFilter(aesAlgorithmEncrypt, aesKey);
      // Starts the filter.
      var encryptionPromise = encryptionFilter.encrypt();
      // Also starts pouring data, but separately from promise creation.
      pipeToFilterPromise = sourceStream.pipe(encryptionFilter);

      fileWriter = ...;
      // encryptionFilter works as a data producer for FileWriter.
      var fileWritingPromise = fileWriter.write(encryptionFilter);

      // Set only handlers for rejection now.
      pipeToFilterPromise.catch(function(result) {
        xhr.abort();
        encryptionFilter.abort();
        fileWriter.abort();
      });
      encryptionPromise.catch(function(result) {
        xhr.abort();
        fileWriter.abort();
      });
      fileWritingPromise.catch(function(result) {
        xhr.abort();
        encryptionFilter.abort();
      });

      // encryptionFilter will be (successfully) closed only
      // when XMLHttpRequest and pipe() are both successful.
      // So, it's ok to set the handler for fulfillment now.
      Promise.all([encryptionPromise, fileWritingPromise]).then(function(result) {
        // Done everything successfully!
        // We come here only when encryptionFilter is close()-ed.
        fileWriter.close();
        processFile();
      });
    } else if (this.readyState == this.DONE) {
      if (this.status != 200) {
        encryptionFilter.abort();
        fileWriter.abort();
      } else {
        // Now we know that XHR was successful.
        // Let's close() the filter to finish encryption successfully.
        pipeToFilterPromise.then(function(result) {
          // XMLHttpRequest closes sourceStream, but pipe()
          // resolves pipeToFilterPromise without closing
          // encryptionFilter.
          encryptionFilter.close();
        });
      }
    }
  };
  xhr.send();

encryptionFilter has the same interface as a normal stream but encrypts piped data. Encrypted data is readable from it. It has special methods, encrypt() and abort(). processFile() is a hypothetical function that must be called only when all of loading, encryption and saving to file were successful.

> or
>
>   var sourceStream = xhr.response;
>   var encryptionPromise = crypto.subtle.encrypt(aesAlgorithmEncrypt, aesKey);
>   var hashPromise = crypto.subtle.digest(hash);
>   var resultStream = sourceStream.pipe([encryptionPromise, hashPromise]);
>   var fileWritingPromise = fileWriter.write(resultStream);
>   Promise.all(fileWritingPromise, resultStream).then( ... );

and this should be:

  var sourceStream = xhr.response;

  encryptionFilter = crypto.subtle.createEncryptionFilter(aesAlgorithmEncrypt, aesKey);
  var encryptionPromise = encryptionFilter.encrypt();

  hashFilter = crypto.subtle.createDigestFilter(hash);
  var hashPromise = hashFilter.digest();

  pipeToFiltersPromise = sourceStream.pipe([encryptionFilter, hashFilter]);

  var encryptedDataWritingPromise = fileWriter.write(encryptionFilter);
  var hashWritingPromise = Promise.all([encryptionPromise, encryptedDataWritingPromise]).then(
    function(result) { return fileWriter.write(hashFilter); },
    ...
  );

  Promise.all([hashPromise, hashWritingPromise]).then(
    function(result) {
      fileWriter.close();
      processFile();
    },
    ...
  );

Or, we can also choose to let the writer API create a special object that has the Stream interface for receiving input, and then let encryptionFilter and hashFilter pipe() to it.

  ...
  pipeToFiltersPromise =
Re: Overlap between StreamReader and FileReader
I am still not very familiar with promises, but if I take your preceding example:

  var sourceStream = xhr.response;
  var resultStream = new Stream();
  var fileWritingPromise = fileWriter.write(resultStream);
  var encryptionPromise = crypto.subtle.encrypt(aesAlgorithmEncrypt, aesKey, sourceStream, resultStream);
  Promise.all(fileWritingPromise, encryptionPromise).then( ... );

shouldn't it be more something like:

  var sourceStream = xhr.response;
  var encryptionPromise = crypto.subtle.encrypt(aesAlgorithmEncrypt, aesKey);
  var resultStream = sourceStream.pipe(encryptionPromise);
  var fileWritingPromise = fileWriter.write(resultStream);
  Promise.all(fileWritingPromise, encryptionPromise).then( ... );

or

  var sourceStream = xhr.response;
  var encryptionPromise = crypto.subtle.encrypt(aesAlgorithmEncrypt, aesKey);
  var hashPromise = crypto.subtle.digest(hash);
  var resultStream = sourceStream.pipe([encryptionPromise, hashPromise]);
  var fileWritingPromise = fileWriter.write(resultStream);
  Promise.all(fileWritingPromise, resultStream).then( ... );

Regards

Aymeric

On 03/10/2013 10:27, Takeshi Yoshino wrote:
...snip...
Re: Overlap between StreamReader and FileReader
Formatted and published my latest proposal on github after incorporating Aymeric's multi-dest idea.

http://htmlpreview.github.io/?https://github.com/tyoshino/stream/blob/master/streams.html

On Sat, Sep 28, 2013 at 11:45 AM, Kenneth Russell k...@google.com wrote:

> This looks nice. It looks like it should already handle the flow control issues mentioned earlier in the thread, simply by performing the read on demand, though reporting the result asynchronously.

Thanks, Kenneth, for reviewing.
Re: Overlap between StreamReader and FileReader
Looks good. Comments/questions:

- what's the use of readEncoding?
- StreamReadType: add MediaStream? (and others if existing)
- would it be possible to pipe from one StreamReadType to another StreamReadType?
- would it be possible to pipe from a source to different targets (my example of encrypt/hash at the same time)?
- what is the link between the API and the Stream (responseType='stream')? How do you handle this for APIs where responseType does not really apply (msgpack, crypto...)?

Regards

Aymeric

On 26/09/2013 06:17, Takeshi Yoshino wrote:
...snip...
Re: Overlap between StreamReader and FileReader
On Thu, Sep 26, 2013 at 6:36 PM, Aymeric Vitte vitteayme...@gmail.com wrote:

> Looks good. Comments/questions:
> - what's the use of readEncoding?

Overriding the charset specified in .type for read ops. It's weird, but we could instead ask an app to overwrite .type.

> - StreamReadType: add MediaStream? (and others if existing)

Maybe, if there's a clear rule for converting a binary stream + MIME type into a MediaStream object.

> - would it be possible to pipe from one StreamReadType to another StreamReadType?

pipe() tells the receiver with which value of StreamReadType the pipe() was called. Receiver APIs may be designed to accept either mode or both.

> - would it be possible to pipe from a source to different targets (my example of encrypt/hash at the same time)?

I missed it. Your mirroring method (making pipe() accept multiple Streams) looks good. The problem is what to do when one of the destinations is write-blocked. Maybe we want to read data from the source at the pace of the fastest consumer and save the read data for the slowest one. When should we fulfill the promise? On completion of the read from the source, on completion of the write to all destinations, etc.?

> - what is the link between the API and the Stream (responseType='stream')? How do you handle this for APIs where responseType does not really apply (msgpack, crypto...)?

- make APIs return a Stream for read (write), like XHR.responseType='stream'
- make APIs accept a Stream for read (write)

Either should work, as we have pipe(). E.g.

  var sourceStream = xhr.response;
  var resultStream = new Stream();
  var fileWritingPromise = fileWriter.write(resultStream);
  var encryptionPromise = crypto.subtle.encrypt(aesAlgorithmEncrypt, aesKey, sourceStream, resultStream);
  Promise.all(fileWritingPromise, encryptionPromise).then( ... );

I also found a point that needs clarification: whether pipe() does eof or not. I think we don't want automatic eof.
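A sketch of the buffering policy described above (read at the fastest consumer's pace, queue for the slower ones), written against illustrative ondata/onclose/write()/close() members rather than any proposed API:

  // Sketch: mirror one producer to N destination streams, assuming each
  // destination exposes write(chunk) returning a promise (illustrative).
  function teeTo(source, destinations) {
    var writes = destinations.map(function() { return Promise.resolve(); });
    source.ondata = function(chunk) {
      // Each destination gets its own write chain, so a slow destination
      // queues chunks without blocking the fast ones.
      writes = writes.map(function(prev, i) {
        return prev.then(function() { return destinations[i].write(chunk); });
      });
    };
    source.onclose = function() {
      // Fulfill only when every destination has drained its queue.
      Promise.all(writes).then(function() {
        destinations.forEach(function(d) { d.close(); });
      });
    };
  }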
Re: Overlap between StreamReader and FileReader
On 24/09/2013 21:24, Takeshi Yoshino wrote:

> On Wed, Sep 25, 2013 at 12:41 AM, Aymeric Vitte vitteayme...@gmail.com wrote:
>> Did you see http://lists.w3.org/Archives/Public/public-webapps/2013JulSep/0593.html ?
> Yes. This example seems to show how to connect only producer/consumer APIs which support Stream. Right?

Yes, but if something like createStream is generic then any API could support it; for further APIs or next versions it can be built in.

> In such a case, all the flow control stuff would be basically hidden, and if necessary each producer/consumer/transformer/filter/etc. may expose flow-control-related parameters in its own form and configure the connected input/output streams accordingly. E.g. stream_xhr may choose to have a large write buffer for performance, or have a small one and apply some backpressure to stream_ws1 for memory efficiency.

Yes

> My understanding is that flow control APIs like mine are intended to be used by JS code implementing some converter, consumer, etc., while built-in stuff like WebCrypto would evolve to accept Stream directly and handle flow control in, e.g., the C++ world. BTW, I'm discussing this to provide data points for deciding whether to include a flow control API or not. I'm not pushing it. I'd appreciate it if other participants expressed opinions about this.

I am not sure I get the distinction you make between your API's flow control and built-in flow control... I think the main purpose of the Stream API should be to handle streaming more efficiently without having to copy, split, concat, etc. ArrayBuffers; to abstract away the use of ArrayBuffer, ArrayBufferView, Blob and text so you don't spend your time converting things; and to connect different streams simply.
Re: Overlap between StreamReader and FileReader
On Wed, Sep 25, 2013 at 10:55 PM, Aymeric Vitte vitteayme...@gmail.com wrote:

> I am not sure I get the distinction you make between your API's flow control and built-in flow control...
> ...snip...

The JS flow control API is for JS code to manually control thresholds, buffer sizes, etc. so that JS code can consume data from, and produce data into, a Stream. Built-in flow control is the C++ (or whatever language implements the UA) interface that will be used when streams are connected with pipe(). It would probably have a similar interface to the JS flow control API.
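Concretely, borrowing the strawman's names from earlier in the thread (readableThreshold, onreadable, read(), pipeTo()); consume() is a placeholder:

  // JS-level flow control: the app decides when and how much to pull.
  stream.readableThreshold = 4096;   // don't fire onreadable for tiny chunks
  stream.onreadable = function() {
    consume(stream.read());          // synchronous read of buffered data
  };

  // Built-in flow control: the UA shuttles data between the two streams
  // internally, applying backpressure with no JS in the loop.
  sourceStream.pipeTo(destinationStream);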
Re: Overlap between StreamReader and FileReader
As we don't see any strong demand for flow control and sync read functionality, I've revised the proposal. Though we can separate state/error signaling from Stream and keep it done by each API (e.g. XHR) as Aymeric said, the EoF signal still needs to be conveyed through Stream.

  enum StreamReadType { "", "blob", "arraybuffer", "text" };

  interface StreamConsumeResult {
    readonly attribute boolean eof;
    readonly attribute any data;
    readonly attribute unsigned long long size;
  };

  [Constructor(optional DOMString mime)]
  interface Stream {
    readonly attribute DOMString type;  // MIME type

    // Rejected on error. No more write ops should be made.
    //
    // Fulfilled when the write completes. It doesn't guarantee that the
    // written data has been read out successfully.
    //
    // The contents of the ArrayBufferView must not be modified until the
    // promise is fulfilled.
    //
    // Fulfillment may be delayed when the Stream considers itself to be full.
    //
    // write(), close() must not be called again until the Promise of the
    // last write() is fulfilled.
    Promise<void> write((DOMString or ArrayBufferView or Blob)? data);

    void close();

    attribute StreamReadType readType;
    attribute DOMString readEncoding;

    // read(), skip(), pipe() must not be called again until the Promise of
    // the last read(), skip(), pipe() is fulfilled.
    // Rejected on error. No more read ops should be made.
    //
    // If size is specified,
    // - if EoF: fulfilled with data up to EoF
    // - otherwise: fulfilled with data of size bytes
    //
    // If size is omitted, (all or part of) the data available for read now
    // will be returned.
    //
    // If readType is set to "text", the size of the result may be smaller
    // than the value specified for the size argument.
    Promise<StreamConsumeResult> read(optional [Clamp] long long size);

    // Rejected on error. Fulfilled on completion.
    //
    // .data of the result is not used. .size of the result is the skipped amount.
    Promise<StreamConsumeResult> skip([Clamp] long long size);

    // Rejected on error. Fulfilled on completion.
    //
    // If size is omitted, transfer until EoF is encountered.
    //
    // .data of the result is not used. .size of the result is the size of
    // the data transferred.
    Promise<StreamConsumeResult> pipe(Stream destination, optional [Clamp] long long size);
  };
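A small consumer sketch against this interface; consume() is a placeholder, and the chaining respects the rule that only one read() may be outstanding:

  // Read the whole stream in chunks with the proposed promise-based read().
  function readAll(stream, consume) {
    stream.readType = "arraybuffer";
    function loop() {
      return stream.read().then(function(result) {
        if (result.size > 0) consume(result.data);
        if (result.eof) return;   // producer close()-ed the stream
        return loop();            // chain: only one read() in flight
      });
    }
    return loop();
  }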
Re: Overlap between StreamReader and FileReader
Did you see http://lists.w3.org/Archives/Public/public-webapps/2013JulSep/0593.html ?

An attempt to find a link between the data producer APIs and a Streams API like yours.

Regards

Aymeric

On 20/09/2013 15:16, Takeshi Yoshino wrote:
...snip...
Re: Overlap between StreamReader and FileReader
On Wed, Sep 25, 2013 at 12:41 AM, Aymeric Vitte vitteayme...@gmail.com wrote:

> Did you see http://lists.w3.org/Archives/Public/public-webapps/2013JulSep/0593.html ?

Yes. This example seems to show how to connect only producer/consumer APIs which support Stream. Right?

In such a case, all the flow control stuff would be basically hidden, and if necessary each producer/consumer/transformer/filter/etc. may expose flow-control-related parameters in its own form and configure the connected input/output streams accordingly. E.g. stream_xhr may choose to have a large write buffer for performance, or have a small one and apply some backpressure to stream_ws1 for memory efficiency.

My understanding is that flow control APIs like mine are intended to be used by JS code implementing some converter, consumer, etc., while built-in stuff like WebCrypto would evolve to accept Stream directly and handle flow control in, e.g., the C++ world.

BTW, I'm discussing this to provide data points for deciding whether to include a flow control API or not. I'm not pushing it. I'd appreciate it if other participants expressed opinions about this.
Re: Overlap between StreamReader and FileReader
On Sat, Sep 14, 2013 at 12:03 AM, Aymeric Vitte vitteayme...@gmail.com wrote:

> I take this example to understand whether this could be better with built-in Stream flow control. If so, after you have defined the right parameters (if possible) for the streams' flow control, you could process delta data while reading the file and restream it directly via WebSockets, and this would be great, but again I am not sure that a universal solution can be found.

I think what we can do is just provide helpers to make it easier to build such intelligent, app-specific flow control logic. Maybe one of the points of your example is that we're not always able to calculate a good readableThreshold. I'm also not so sure how many apps in the world can benefit from this kind of API. For consumers that can do flow control well on a receive-window basis, my API should work well (unnecessary events are not dispatched, chunks are consolidated, ArrayBuffer creation is lazier).

WebSocket has the (broken) bufferedAmount attribute for window-based flow control. Are you using it as a hint?
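For reference, bufferedAmount is the standard WebSocket attribute counting bytes queued by send() but not yet transmitted; since no event fires when the buffer drains, window-style throttling has to poll, roughly:

  // Throttle sends so the UA's WebSocket buffer stays below a high-water mark.
  var HIGH_WATER = 1024 * 1024;   // 1 MiB, an arbitrary illustrative threshold

  function sendChunks(ws, chunks, onDrained) {
    (function loop() {
      while (chunks.length && ws.bufferedAmount < HIGH_WATER) {
        ws.send(chunks.shift());
      }
      if (chunks.length) {
        setTimeout(loop, 50);     // poll: no event fires when the buffer drains
      } else if (onDrained) {
        onDrained();
      }
    })();
  }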
Re: Overlap between StreamReader and FileReader
Here for the examples: http://lists.w3.org/Archives/Public/public-webapps/2013JulSep/0453.html

Simple ones, leading to a simple Streams interface; I thought this was the spirit of the original Streams API proposal. Now you want a stream interface so you can code some js like msgpack on top of it. I am still missing a part of the puzzle, or how to use it: as you mention, the stream is coming from somewhere (File, indexedDB, WebSocket, XHR, WebRTC, etc.), so you have a limited choice of APIs to get it, and msgpack will act on top of one of those APIs, no? (then back to the examples above) How can you get the data another way?

Regards,

Aymeric

On 13/09/2013 06:36, Takeshi Yoshino wrote:
...snip...
Re: Overlap between StreamReader and FileReader
On Fri, Sep 13, 2013 at 6:08 PM, Aymeric Vitte vitteayme...@gmail.com wrote:

> Now you want a stream interface so you can code some js like msgpack on top of it. I am still missing a part of the puzzle, or how to use it: as you mention, the stream is coming from somewhere (File, indexedDB, WebSocket, XHR, WebRTC, etc.), so you have a limited choice of APIs to get it, and msgpack will act on top of one of those APIs, no? (then back to the examples above) How can you get the data another way?

Do you mean that those data producer APIs should be changed to provide read-by-delta-data, and that manipulation of data by js code should happen there instead of at the output of a Stream?
Re: Overlap between StreamReader and FileReader
On 13/09/2013 14:23, Takeshi Yoshino wrote:

> Do you mean that those data producer APIs should be changed to provide read-by-delta-data, and that manipulation of data by js code should happen there instead of at the output of a Stream?

Yes, exactly, except if you/someone see another way of getting the data inside the browser and turning the flow into a stream without using these APIs.

Regards,

Aymeric
Re: Overlap between StreamReader and FileReader
Since I joined the discussion recently, I don't know the original idea behind the Stream+XHR integration approach (response returns a Stream object) as in the current Streams API spec. But one advantage of it that I can come up with is that we can keep the changes to those producer APIs small. If we decide to add methods, for example for flow control (though that is still in question), such stuff goes on Stream, not on XHR, etc.
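For illustration, the XHR side of that integration stays minimal (a sketch; responseType 'stream' per the Streams API draft mentioned above, pipe() per the proposals in this thread, someConsumerStream is a placeholder):

  // XHR exposing its response as a Stream, per the draft's integration model.
  var xhr = new XMLHttpRequest();
  xhr.open("GET", "/video.webm");
  xhr.responseType = "stream";          // the only change on the XHR side
  xhr.onreadystatechange = function() {
    if (this.readyState == this.LOADING) {
      var stream = this.response;       // a Stream; flow control lives here,
      stream.pipe(someConsumerStream);  // not on XHR
    }
  };
  xhr.send();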
Re: Overlap between StreamReader and FileReader
On Fri, Sep 13, 2013 at 9:50 PM, Aymeric Vitte vitteayme...@gmail.com wrote:

> On 13/09/2013 14:23, Takeshi Yoshino wrote:
>> Do you mean that those data producer APIs should be changed to provide read-by-delta-data, and that manipulation of data by js code should happen there instead of at the output of a Stream?
> Yes, exactly, except if you/someone see another way of getting the data inside the browser and turning the flow into a stream without using these APIs.

I agree that there are various states and things to handle for each of the producer APIs, and it might be judicious not to convey such API-specific info/signals through Stream. I don't think it's bad to convert xhr.DONE to stream.close() manually, as in your example http://lists.w3.org/Archives/Public/public-webapps/2013JulSep/0453.html.

But regarding flow control, as I said in the other mail just posted, if we start thinking about flow control more seriously, maybe the right approach is to have a unified flow control method, and the point at which to define such fine-grained flow control is Stream, not each API. If we're not, then yes, maybe your proposal (deltaResponse) should be enough.
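Bridging manually, in the spirit of the referenced example, would look roughly like this; it assumes a writable Stream with write()/close() as in the proposals in this thread, and a hypothetical deltaResponse attribute exposing only the newly received bytes:

  // Manually map XHR-specific states onto Stream: deltas in, close() on DONE.
  var stream = new Stream("video/webm");
  var xhr = new XMLHttpRequest();
  xhr.open("GET", "/video.webm");
  xhr.onreadystatechange = function() {
    if (this.readyState == this.LOADING) {
      stream.write(this.deltaResponse);  // hypothetical delta attribute
    } else if (this.readyState == this.DONE) {
      stream.close();                    // XHR completion becomes eof
    }
  };
  xhr.send();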
Re: Overlap between StreamReader and FileReader
On 13/09/2013 15:11, Takeshi Yoshino wrote:

> I agree that there are various states and things to handle for each of the producer APIs, and it might be judicious not to convey such API-specific info/signals through Stream.
> ...snip...
> But regarding flow control, as I said in the other mail just posted, if we start thinking about flow control more seriously, maybe the right approach is to have a unified flow control method, and the point at which to define such fine-grained flow control is Stream, not each API.

Maybe. I was not at the start of this thread either, so I don't know exactly what the original idea was (and I hope I am not screwing it up here). But I am not sure it's possible to define universal flow control.

Example: I am currently experiencing some flow control issues in project [1]. Basically, the sender reads a file AsArrayBuffer from indexedDB, where it's stored as a Blob. Since we cannot get delta data while reading the File for now, the sender waits for the whole ArrayBuffer, then slices it, processes the blocks and sends them via WebSockets. If you implement a basic loop, of course you overload the sender's UA and connection. So the system does some calculation in order to allow only half of the bandwidth to be used, and aggregates the blocks until the size of the aggregation meets the bandwidth requirement for the aggregated blocks to be sent every 50 ms. Then it uses a poor setTimeout to flush the data, which screws up all the preceding calculations since setTimeout fires whenever it likes. Maybe there are smarter ways to do this; I was thinking of using workers so you can get a more precise clock via postMessage, but I did not try. In addition to the bandwidth control there is a window for flow control.

I take this example to understand whether this could be better with built-in Stream flow control. If so, after you have defined the right parameters (if possible) for the streams' flow control, you could process delta data while reading the file and restream it directly via WebSockets, and this would be great, but again I am not sure that a universal solution can be found.

> If we're not, then yes, maybe your proposal (deltaResponse) should be enough.

What is sure is that delta data should be made available instead of incremental ones.

[1] http://www.peersm.com
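A condensed sketch of the pacing loop described above; estimateBandwidth() is app-specific and hypothetical, the 50 ms interval is from the description, and the setTimeout jitter is exactly the weakness being complained about:

  // Pace sends to roughly half the estimated bandwidth, flushing every 50 ms.
  var INTERVAL_MS = 50;
  var budgetPerTick = (estimateBandwidth() / 2) * (INTERVAL_MS / 1000);

  function sendPaced(ws, blocks) {
    (function tick() {
      var size = 0;
      // Aggregate blocks until the per-tick bandwidth budget is met.
      while (blocks.length && size + blocks[0].byteLength <= budgetPerTick) {
        var b = blocks.shift();
        size += b.byteLength;
        ws.send(b);
      }
      if (blocks.length) setTimeout(tick, INTERVAL_MS);  // fires "whenever it likes"
    })();
  }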
Re: Overlap between StreamReader and FileReader
Apparently we are not talking about the same thing: while I am thinking of a high-level interface, your interface takes care of the underlying level. Like node's streams: node had to define them since they did not exist (but is anyone using node's streams as such, or does everybody use the higher levels (net, ssl/tls, http, https)?). I have been working for quite some time on projects streaming things in all possible ways inside browsers or with node, and I never felt any need for such a proposal.

So, to understand where the mismatch comes from, could you please highlight a web use case/code example based on your proposal?

Regards,

Aymeric

On 11/09/2013 18:14, Takeshi Yoshino wrote:
...snip...
Re: Overlap between StreamReader and FileReader
On Thu, Sep 12, 2013 at 10:58 PM, Aymeric Vitte vitteayme...@gmail.com wrote:

> Apparently we are not talking about the same thing: while I am thinking of a high-level interface, your interface takes care of the underlying level.

How much low-level stuff to expose would basically affect the high-level interface design, I think.

> Like node's streams: node had to define them since they did not exist (but is anyone using node's streams as such, or does everybody use ...snip... So, to understand where the mismatch comes from, could you please highlight a web use case/code example based on your proposal?

I'm still thinking about how much we should include in the API, too. This proposal is just a trial to address the requirements Isaac listed. So, each feature should correspond to some of his example code.
Re: Overlap between StreamReader and FileReader
On Fri, Sep 13, 2013 at 5:15 AM, Aymeric Vitte vitteayme...@gmail.com wrote:

> Isaac said too "So, just to be clear, I'm *not* suggesting that browser streams copy Node streams verbatim.".

I know. I wanted to kick off the discussion, which had been stopped for 2 weeks.

> Unless you want to do node inside browsers (which would be great but seems unlikely) I still don't see the relation between this kind of proposal and existing APIs.

What do you mean by existing APIs? I was thinking that we've been discussing what a Stream read/write API for manual consuming/producing by JavaScript code should look like.

> Could you please give an example very different from the ones I gave already?

Sorry, which mail? One thing I was imagining is protocol parsing, such as msgpack or protocol buffers. It's good that ArrayBuffers of the exact size are obtained. OTOH, as someone pointed out, Stream should have some flow control mechanism so as not to pull an unlimited amount of data from async storage, network, etc. readableSize in my proposal is an example of how we make the limit controllable by an app. We could also depend on the size argument of the read() call. But thinking of protocol parsing again, it's common to have small fields such as 4, 8, 16 bytes. If read(size) is configured to pull size bytes from async storage, it's inefficient. Maybe we could have some hard-coded limit, e.g. 1 MiB, and use max(hardCodedLimit, requestedReadSize). I'm fine with the latter.

> You have reverted to EventTarget too instead of promises, why?

There was no intention to object to the use of Promise. Sorry that I wasn't clear. I'm rather interested in receiving a sequence of data as it becomes available (corresponding to Jonas's ChunkedData version of the read methods) with just one read call. Sorry that I didn't mention it explicitly, but the listeners on the proposed API came from the ChunkedData object. I thought we could put them on the Stream itself by giving up the multiple-read scenario. writable/readableThreshold can be safely removed from the API if we agree it's not important. If the threshold stuff is removed, flush() and pull() will also be removed.
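For the length-header case, exact-size reads make the parser straightforward; a sketch assuming the promise-returning read(size) of the revised proposal earlier in this section (a 4-byte big-endian length prefix is assumed purely for illustration):

  // Parse length-prefixed frames: 4-byte big-endian length, then payload.
  function readFrames(stream, onFrame) {
    stream.readType = "arraybuffer";
    function next() {
      return stream.read(4).then(function(header) {
        if (header.size < 4) return;  // EoF: no complete header left
        var len = new DataView(header.data).getUint32(0);
        return stream.read(len).then(function(payload) {
          onFrame(payload.data);
          if (!payload.eof) return next();
        });
      });
    }
    return next();
  }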
Re: Overlap between StreamReader and FileReader
Isaac said too "So, just to be clear, I'm *not* suggesting that browser streams copy Node streams verbatim.".

Unless you want to do node inside browsers (which would be great but seems unlikely) I still don't see the relation between this kind of proposal and existing APIs. Could you please give an example very different from the ones I gave already?

WebCrypto seems to be waiting for a Streams interface to be able to perform simple progressive operations, which were (unexpectedly) removed from the spec, with outstanding features like the stream itself being able to predict its end... I don't think that's required, or even possible; streams inside browsers only need to handle delta data, with the rest handled by the APIs using the streams (including end of stream, flow control and co), cf. my simple proposal.

You have reverted to EventTarget too instead of promises, why?

Regards

Aymeric

On 12/09/2013 20:36, Takeshi Yoshino wrote:
...snip...
Re: Overlap between StreamReader and FileReader
Here's my all-in-one strawman proposal including some new stuff for flow control. Yes, it's too big, but it may be useful for glancing over what features are requested.

  enum StreamReadType { "", "arraybuffer", "text" };

  [Constructor(optional DOMString mime,
               optional [Clamp] long long writeBufferSize,
               optional [Clamp] long long readBufferSize)]
  interface Stream : EventTarget {
    readonly attribute DOMString type;  // MIME type

    // Writing interfaces
    readonly attribute unsigned long long writableSize;  // Bytes that can be written synchronously
    attribute unsigned long long writeBufferSize;
    attribute EventHandler onwritable;
    attribute unsigned long long writableThreshold;
    attribute EventHandler onpulled;
    attribute EventHandler onreadaborted;
    void write((DOMString or ArrayBufferView or Blob)? data);
    void flush();
    void closeWrite();
    void abortWrite();

    // Reading interfaces
    attribute StreamReadType readType;  // Must not be set after the first read()
    attribute DOMString readEncoding;
    readonly attribute unsigned long long readableSize;  // Bytes that can be read synchronously
    attribute unsigned long long readBufferSize;
    attribute EventHandler onreadable;
    attribute unsigned long long readableThreshold;
    attribute EventHandler onflush;
    attribute EventHandler onclose;  // Receives clean flag
    any read(optional [Clamp] long long size);
    any peek(optional [Clamp] long long size, optional [Clamp] long long offset);
    void skip([Clamp] long long size);
    void pull();
    void abortRead();

    // Async interfaces
    attribute EventHandler ondoneasync;  // Receives bytes skipped or Blob or undefined (when done pipeTo)
    void readAsBlob(optional [Clamp] long long size);
    void longSkip([Clamp] long long size);
    void pipeTo(Stream destination, optional [Clamp] long long size);
  };

- Encoding for text mode reading is determined by the type attribute. It can be overridden by setting the readEncoding attribute.
- Invoking read() repeatedly to pull data into the stream is annoying. So, instead, I used the writable/readableThreshold approach.
- Not to bloat the API any more, the error/close signaling interface is limited to EventHandlers only.
- stream.read() means stream.read(stream.readableSize).
- After the onclose invocation, it's guaranteed that all written bytes are available for read.
- read() is non-blocking. It returns only what is synchronously readable. If there aren't enough bytes (investigate the readableSize attribute), an app should wait until the next invocation of onreadable. readBufferSize and readableThreshold may be modified accordingly, and pull() may be called.
- stream.read(size) returns an ArrayBuffer or DOMString of min(size, stream.readableSize) bytes that is synchronously readable now.
- When readType is set to "text", read() throws an EncodingError if an invalid sequence is found. An incomplete sequence will be left unconsumed. If there's an incomplete sequence at the end of the stream, the app can detect that by checking the size attribute after the onclose invocation and a read() call.
- The readableSize attribute returns (number of readable bytes as of the last time the event loop started executing a task) - (bytes consumed by the read() method).
- onflush is separated from onreadable since it's possible that an intermediate Stream in a long chain has no data to flush but the next or later Streams have.
- Invocation order is onreadable -> onflush or onclose.
- Flush handling code must be implemented on both onflush and onclose. On a close() call, only onclose is invoked, to reduce event propagation cost.
- Pass a read/writeBufferSize of -1 to the constructor, or set stream.read/writeBufferSize to -1, for unlimited buffering.
- Instead of having write(buffer, cb), write() accepts data of any size regardless of writeBufferSize. XHR should respect writeBufferSize and write only writableSize bytes of data, set onwritable if necessary, and possibly also set writableThreshold.
- {writable,readable}Threshold are 0 by default, meaning that onwritable and onreadable are invoked immediately when there's space/data available.
- If {writable,readable}Threshold are greater than the capacity, they are considered to be set to the capacity.
- onwritable/onreadable is invoked asynchronously when
  -- new space/data that satisfies writable/readableThreshold becomes available as a result of a read()/write() operation
  onreadable is invoked asynchronously when
  -- flush()-ed or close()-ed
- onwritable/onreadable is invoked synchronously when
  -- onwritable/onreadable is updated and there's space/data available that satisfies writable/readableThreshold
  -- writable/readableThreshold is updated and there's space/data available that satisfies the new writable/readableThreshold
  -- new space/data that satisfies writable/readableThreshold becomes available as a result of updating the capacity
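A consumer sketch against this strawman for a fixed-size-record protocol; processRecord() is a placeholder:

  // Wait until at least one full record is buffered, then read synchronously.
  var RECORD = 1024;
  stream.readType = "arraybuffer";
  stream.readableThreshold = RECORD;      // fire onreadable only with >= 1 record
  stream.onreadable = function() {
    while (stream.readableSize >= RECORD) {
      processRecord(stream.read(RECORD)); // non-blocking, synchronous read
    }
  };
  stream.onclose = function() {
    // All written bytes are readable once onclose fires (see notes above).
    if (stream.readableSize > 0) processRecord(stream.read());
  };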
Re: Overlap between StreamReader and FileReader
I forgot to add an attribute to specify the max size of the backing store. Maybe it should be added to the constructor.

On Wed, Sep 11, 2013 at 11:24 PM, Takeshi Yoshino tyosh...@google.com wrote:

>   any peek(optional [Clamp] long long size, optional [Clamp] long long offset);

peek() with an offset doesn't make sense for text mode reading. Some exception should be thrown in that case.

> - The readableSize attribute returns (number of readable bytes as of the last time the event loop started executing a task) - (bytes consumed by the read() method).

+ (bytes added by write() and transferred to the read buffer synchronously)

The concept of this interface is
- to allow bulk transfer from internal asynchronous storage (e.g. network, disk-based backing store) to the JS world but delay conversion (e.g. into DOMString, ArrayBuffer);
- not to ask an app to do such transfer explicitly.
Re: Overlap between StreamReader and FileReader
On Fri, Aug 23, 2013 at 2:41 AM, Isaac Schlueter i...@izs.me wrote: 1. Drop the read n bytes part of the API entirely. It is hard to do I'm ok with that. But then, instead we need to evolve ArrayBuffer to have powerful concat/slice functionality for performance. Re: slicing, we can just make APIs to accept ArrayBufferView. How should we deal with concat operation? You suggested that we add unshift(), but repeating read and unshift until we get enough data sound not so good. For example, currently TextDecoder (http://encoding.spec.whatwg.org/) accepts one ArrayBufferView and outputs one DOMString. We can use stream mode of TextDecoder to get multiple output DOMStrings and then concatenate them to get the final result. As we still don't have StringBuilder, it's not considered to be a big deal to have ArrayBufferBuilder (Stream.read(size) is kinda ArrayBuffer builder)? Is any of you guys thinking about introducing something like Node.js's Buffer class for decoding and tokenization? TextDecoder+Stream would be a kind of such classes. I also considered making read() operation to accept pre-allocated ArrayBuffer and return the number of bytes written. stream.read(buffer) If written data is insufficient, the user can continue to pass the same buffer to fill the unused space. But, since DOMString is immutable, we can't take the same approach for readText() op. see in Node), and complicates the internal mechanisms. People think they need it, but what they really need is readUntil(delimiterChar). What if implementing length header based protocol, e.g. msgpack? 2. Reading strings vs ArrayBuffers or other types of things MUST be a property of the stream, Fixed property or mutable via readType attribute? If readType, the sequence of UTF8/binary mixed read() problem remains. 3. Sync vs async read(). Let's dig into the issue of `var d = s.read()` vs `s.read(function(d) {})` for getting data out of a stream. ...snip... buffering to occur if you have pipe chains of streams that are processing at different speeds, where one is bursty and the other is consistent. Clarification. You're saying that always posting cb to task queue is wasteful. Right? Anyway, I think it makes sense. If read is designed to invoke cb synchronously, it'll be difficult to avoid stack overflow. So the only options is to always run cb in the next task. stream.poll(function ondata() { What happens if unshift() is called? poll() invokes ondata() only when new data (unshift()-ed data is not included) is available? var d = stream.read(); while (stream.state === 'OK') { processData(d); d = stream.read(); } Is Jonas right about the reason why we need loop here? I.e. to avoid automatic merge/serialization of buffered chunks? switch (stream.state) { case 'EOF': onend(); break; case 'EWOULDBLOCK': stream.poll(ondata); break; default: onerror(new Error('Stream read error: ' + stream.state)); We could distinguish these three states by null, empty ArrayBuffer/DOMString, and non-empty ArrayBuffer/DOMString? ReadableStream.prototype.readAll = function(onerror, ondata, onend) { onpoll(); function onpoll() { If we decide not to allow multiple concurrent read operations on a stream, can we just use event handler approach. stream.onerror = ... stream.ondata = ... 4. Passive data listening. In Node v0.10, it is not possible to passively listen to the data passing through a stream without affecting the state of the stream. 
This is corrected in v0.12, by making the read() method also emit a 'data' event whenever it returns data, so v0.8-style APIs work as they used to. The takeaway here is not to do what Node did, but to learn what Node learned: the passive-data-listening use case is relevant.

What's the use case?

5. Piping. It's important to consider how any proposed readable stream API will allow one to respond to backpressure, and how it relates to a *writable* stream API. Data management from a source to a destination is the fundamental raison d'être for streams, after all.

I'd have onwritable and onreadable handlers, make their thresholds configurable, and let pipe() set them up.
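Point 1 in the exchange above argues that readUntil(delimiterChar) is what people actually need, and that it is trivial on top of unshift(). A rough sketch of that claim, assuming a hypothetical stream whose read() resolves with the next buffered Uint8Array chunk (or null at EOF) and where delimiter is a byte value; none of these names are proposed API:

```
// readUntil layered on read() + unshift(): scan chunks for the delimiter,
// keep everything up to and including it, and put the surplus back.
async function readUntil(stream, delimiter) {
  const parts = [];
  for (;;) {
    const chunk = await stream.read();        // next buffered Uint8Array
    if (chunk === null) break;                // EOF before the delimiter
    const i = chunk.indexOf(delimiter);
    if (i === -1) { parts.push(chunk); continue; }
    parts.push(chunk.subarray(0, i + 1));     // keep the delimiter itself
    stream.unshift(chunk.subarray(i + 1));    // return the excess bytes
    break;
  }
  // Concatenate the collected pieces into a single buffer.
  const total = parts.reduce((n, p) => n + p.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const p of parts) { out.set(p, offset); offset += p.length; }
  return out;
}
```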
Re: Overlap between StreamReader and FileReader
On Fri, Aug 9, 2013 at 12:47 PM, Isaac Schlueter i...@izs.me wrote:

Jonas, What does *progress* mean here? So, you do something like this: var p = stream.read() to get a promise (of some sort). That read() operation is (if we're talking about TCP or FS) a single operation. There's no "50% of the way done reading" moment that you'd care to tap into. Even from a conceptual point of view, the data is either: a) available (and the promise is now fulfilled) b) not yet available (and the promise is not yet fulfilled) c) known to *never* be available because: i) we've reached the end of the stream (and the promise is fulfilled with some sort of EOF sentinel), or ii) because an error happened (and the promise is broken). So.. where's the progress? A single read() operation seems like it ought to be atomic to me, and indeed, the read[2] function either returns some data (a), no data (c-i), raises EWOULDBLOCK (b), or raises some other error (c-ii). But, whichever of those it does, it does right away. We only get woken up again (via epoll/kqueue/CPIO/etc) once we know that the file descriptor (or HANDLE in Windows) is readable again (and thus, it's worthwhile to attempt another read[2] operation).

Hi Isaac, Sorry for taking so long to respond. It took me a while to understand where the disconnect came from. I also needed to mull over how a consumer is actually likely to consume data from a Stream. Having looked over the Node.js API more I think I see where the misunderstanding is coming from. The source of confusion is likely that Node.js and the proposal in [1] are very different. Specifically, in Node.js the read() operation is synchronous and operates on currently buffered data. In [1] the read() operation is asynchronous and isn't restricted to just the currently buffered data.

From my point of view there are two rough categories of ways of reading data from an asynchronous Stream:

A) The Stream hands data to the consumer as soon as the data is available. I.e. the stream doesn't buffer data longer than until the next opportunity to fire a callback to the consumer.

B) The Stream allows the consumer to pull data out of the stream at whatever pace, and in whatever block size, the consumer finds appropriate. If the data isn't yet available, a callback is used to notify the consumer when it is.

A is basically the Stream pushing the data to the consumer. And B is the consumer pulling the data from the Stream.

In Node.js doing A looks something like:

stream.on('readable', function() { var buffer; while((buffer = stream.read())) { processData(buffer); } });

In the proposal in [1] you would do this with the following code:

stream.readBinaryChunked().ondata = function(e) { processData(e.data); }

(side-note: it's unclear to me why the Node.js API forces readable.read() to be called in a loop. Is that to avoid having to flatten internal buffer fragments? Without that the two would essentially be the same with some minor syntactical differences)

Here it definitely doesn't make sense to deliver progress notifications. Rather than delivering a progress notification to the consumer, you simply deliver the data.
The way you would do B in Node.js looks something like:

stream.on('readable', function() { var buffer; if ((buffer = stream.read(10))) { processTenBytes(buffer); } });

The same thing using the proposal in [1] looks like:

stream.readBinary(10).then(function(buffer) { processTenBytes(buffer); });

An important difference here is that in the Node.js API, the 'read 10 bytes' operation either immediately returns a result, or it immediately fails, depending on how much data we currently have buffered. I.e. the read() call is synchronous. The caller is expected to keep calling read(10) until the call succeeds. Though of course there's also a very useful callback which makes the calling again very easy. But between the calls to read() the Stream doesn't really have knowledge that someone is waiting to read 10 bytes of information.

The API in [1] instead makes the read() call asynchronous. That means that we can always let the call succeed (unless there's an error on the stream of course). If we don't have enough data buffered currently, we simply call the success callback later than if we had had all requested data buffered already. This is the place where delivering progress notifications could also be done, though this is by no means an important aspect of the API. But since the read() operation is asynchronous, we can deliver progress notifications as we buffer up enough data to fulfill it. I hope that makes it more clear how progress notifications play in.

So to be clear, progress notifications are by no means the important difference here. The important difference is whether we make read() be synchronous and operating on the current buffered data, or if we make it asynchronous and operating on the full data stream. As far as I can tell
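The synchronous-read-over-buffer versus asynchronous-read-over-stream distinction Jonas draws can be made concrete with a toy pull stream whose read always returns a promise, resolved later if the data has not arrived yet. A sketch assuming a single outstanding read and no error/EOF handling; all names are illustrative:

```
// Toy version of the asynchronous pull model in [1]: readBinary(n) always
// succeeds eventually, because the stream remembers who is waiting.
class PullStream {
  constructor() { this.buffer = []; this.pending = null; }
  readBinary(size) {
    return new Promise((resolve) => {
      this.pending = { size, resolve };  // one outstanding read at a time
      this._tryFulfill();
    });
  }
  _write(chunk) {             // called by the producer (e.g. the network)
    this.buffer.push(...chunk);
    this._tryFulfill();       // maybe this chunk completes a waiting read
  }
  _tryFulfill() {
    const p = this.pending;
    if (p && this.buffer.length >= p.size) {
      this.pending = null;
      p.resolve(new Uint8Array(this.buffer.splice(0, p.size)));
    }
  }
}
```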
Re: Overlap between StreamReader and FileReader
On Fri, Aug 9, 2013 at 7:36 PM, Domenic Denicola dome...@domenicdenicola.com wrote:

Another way of looking at it, is that a streaming API is itself incremental and cancellable. It makes no sense to say that each read from or write to the stream is *also* incremental and cancellable; why introduce another layer of entirely-unnecessary depth before you reach the atomic level of non-incremental, non-cancellable reads/writes? What use case does that serve?

I'm pretty sold on the argument that making individual reads cancellable is a bad idea. But note that the original proposal does not make individual reads incremental. Progress events are not the same thing as incremental reads. I really think talking about progress notifications being there or not is focusing on the wrong question. Nothing would substantially change if we made the original proposal return Promises rather than ProgressPromises. / Jonas
Re: Overlap between StreamReader and FileReader
Le 22/08/2013 09:28, Jonas Sicking a écrit :

Does anyone have examples of code that uses the Node.js API? I'd love to look at how people practically end up consuming data?

I am doing something like this:

```
var parse = function() {
  // process this.stream_
  this.queue_.shift();
  if (this.queue_.length) {
    this.queue_[0]();
  }
};

var process = function(data) {
  return function() {
    this.stream_ = [this.stream_, data].concatBuffers();
    parse.call(this);
  };
};

var on_data = function(data) {
  this.queue_ = this.queue_ || [];
  this.queue_.push(process(data).bind(this));
  if (this.queue_.length === 1) {
    this.queue_[0]();
  }
};

request.on('data', function(data) {
  on_data.call(this, data);
});
```

I don't remember exactly if it's due to my implementation or node (because I am using both node's Buffers and Typed Arrays) but I experienced some problems where data was modified while it was being processed; that's why this.stream_ is freezing the data received (with remaining bytes received earlier, see next sentence) until it is processed. Coming back to my previous TextEncoder/Decoder remark for utf-8 parsing, I don't know how to do this with native node functions.

Regards

Aymeric

-- jCore Email : avi...@jcore.fr iAnonym : http://www.ianonym.com node-Tor : https://www.github.com/Ayms/node-Tor GitHub : https://www.github.com/Ayms Web : www.jcore.fr Extract Widget Mobile : www.extractwidget.com BlimpMe! : www.blimpme.com
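One common cause of the "data modified while it was being processed" problem Aymeric mentions is the producer reusing a chunk's underlying buffer. A defensive sketch, assuming chunks can be treated as Uint8Array views (this is an illustration of the workaround, not a claim about what node did at the time):

```
// Snapshot each incoming chunk before queueing it, so later reuse of the
// producer's buffer cannot change data that is still waiting to be parsed.
request.on('data', function(data) {
  var copy = new Uint8Array(data.length);
  copy.set(data);              // copies the bytes as they are right now
  on_data.call(this, copy);    // queue the stable copy instead of `data`
});
```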
Re: Overlap between StreamReader and FileReader
So, just to be clear, I'm *not* suggesting that browser streams copy Node streams verbatim.

In Node.js doing A looks something like: stream.on('readable', function() { var buffer; while((buffer = stream.read())) { processData(buffer); } });

Not quite. In Node.js, doing A looks like:

stream.on('data', processData);

In my opinion, marrying a browser stream implementation to an EventEmitter abstraction would be a mistake. I also think that marrying it to a Promise implementation would be a mistake. As popular as Promises are, they are an additional layer of abstraction that is not fundamentally related to streaming data, and it is trivial to turn: obj.method(fn) into: obj.method().then(fn) at the user/library level. This allows performance-critical applications to avoid any unnecessary complexity and shorten their code paths as much as possible, but is easily extended for those who prefer promises (or generators, or coroutines, or what have you.) Despite what you may see on twitter or mailing lists, the choice to use this minimal abstraction for Node's asynchrony has allowed all these different things to coexist rather peacefully, and I believe that it is a great success. Even if you feel that promises or generators are the best thing since generational garbage collection (and certainly, both have their merits), I think it is worth exploring where such a constraint would lead us.

So far in this conversation, I've been mostly just trying to figure out what the state of things is, and pointing out what I see as potential hazards. Here are some pro-active suggestions, but this is still not anything I'm particularly in love with, so treat it only as an exploration of the problem space.

1. Drop the read n bytes part of the API entirely. It is hard to do in a way that makes sense for both binary and string streams (as we see in Node), and complicates the internal mechanisms. People think they need it, but what they really need is readUntil(delimiterChar). And, that's trivial to implement on top of unshift(). So let's just add unshift(chunk).

2. Reading strings vs ArrayBuffers or other types of things MUST be a property of the stream, not of the read() call. Having readBinary() and readUtf8(), or read(encoding), is a terrible idea which bloats the API surface and exposes multibyte landmines. The easiest way to do this is to make the API agnostic as to the specific data type returned. If we ditch read(number of bytes), then this becomes much simpler, and also allows for things like streaming JSON parsers that return arbitrary JavaScript objects.

3. Sync vs async read(). Let's dig into the issue of `var d = s.read()` vs `s.read(function(d) {})` for getting data out of a stream. The problem is that this assumes that each call to read() will be done when there is no data buffered, and will result in a call to the underlying system, requiring some async stuff. However, that's not always the case, and a lot of times, you actually want a bit of buffering to occur if you have pipe chains of streams that are processing at different speeds, where one is bursty and the other is consistent. For example, consider a situation where you're interacting with a local database, and also a 3g network connection. The 3g connection will be either very fast or completely not moving, and the local database will be relatively stable.
If you're reading the data from the network connection, and putting it into the db, you don't want to pause the 3g connection unnecessarily and miss a potential burst just because you were waiting for the db. You *also* don't want those bursts to overwhelm your buffer, of course. The solution for this is to have some pre-defined buffer in the stream implementation, so that you only pause the bursty stream if the slow-and-steady stream can't keep up.

If you have a readable stream which is buffering its data in memory, then doing `s.read(cb)` is always going to be strictly more expensive than doing `var d = s.read()`. The only way to make an async read not more expensive than a sync returning read is for the callback to be inline-able, and called immediately. However, this means that it is no longer possible to reason about any particular read() call, and so this releases Zalgo. For example:

```
console.log('before');
stream.read(function(data) { console.log('got data') });
console.log('after');
```

The ordering of logs must be predictable, which means that we must *always* defer the callback's execution until at least the end of the current run-to-completion. This isn't free. This problem could potentially be solved if we used synchronous reads, but mirrored the epoll-like behavior more closely than Node.js does today, without the read(n) mistake.

```
stream.poll(function ondata() {
  var d = stream.read();
  while (stream.state === 'OK') {
    processData(d);
    d = stream.read();
  }
  switch (stream.state) {
    case 'EOF':
      onend();
      break;
    case 'EWOULDBLOCK':
      stream.poll(ondata);
      break;
    default:
      onerror(new Error('Stream read error: ' + stream.state));
  }
});
```
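The "always defer" rule above can be sketched directly: even when data is already buffered, the callback runs in a later task, so the before/after ordering never depends on buffer state. hasBufferedData, takeBufferedData, and once are hypothetical helpers, not proposed API:

```
// Async read that never calls back synchronously, avoiding released Zalgo.
function read(stream, callback) {
  if (stream.hasBufferedData()) {
    // Data is ready, but the callback is still scheduled for a later task.
    setTimeout(function() { callback(stream.takeBufferedData()); }, 0);
  } else {
    stream.once('readable', function() {
      callback(stream.takeBufferedData());
    });
  }
}

console.log('before');
read(stream, function(data) { console.log('got data'); });
console.log('after');   // always logs before 'got data', buffered or not
```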
Re: Overlap between StreamReader and FileReader
On Thu, Aug 8, 2013 at 7:40 PM, Austin William Wright a...@bzfx.net wrote:

I believe the term is congestion control such as the TCP congestion control algorithm.

As I've heard the term used, congestion control is slightly different than flow control or tcp backpressure, but they are related concepts, and yes, your point is dead-on, Austin, this is absolutely 100% essential. Any Stream API that treats backpressure as an issue to handle later is not a Stream API, and is clearly not ready to even bother discussing.

On Thu, Aug 8, 2013 at 7:40 PM, Austin William Wright a...@bzfx.net wrote:

I think there's some confusion as to what the abort() call is going to do exactly.

Yeah, I'm rather confused by that as well. A read[2] operation typically can't be canceled because it's synchronous. Let's back up just a step here, and talk about the fundamental purpose of an API like this. Here's a strawman:

- A Readable Stream is an abstraction representing an ordered set of data which may or may not be finite, some or all of which may arrive at a future time, which can be consumed at any arbitrary rate up to the rate at which data is arriving, without causing excessive memory use. It provides a mechanism to send the data into a Writable Stream, and for being alerted to errors in the underlying implementation.

- A Writable Stream is an abstraction representing a destination where data is written, where any given write operation may be completely flushed to the underlying implementation immediately or at some point in the future. It provides a mechanism for determining when more data can be safely written without causing excessive memory usage, and for being alerted to errors in the underlying implementation.

- A Duplex Stream is an abstraction that implements both the Readable Stream and Writable Stream interfaces. There may or may not be any specific connection between the two sets of functionality. (For example, it may represent a tcp socket file descriptor, or any arbitrary readable/writable API that one can imagine.)

For any stream implementation, I typically try to ask: How would you build a non-blocking TCP implementation using this abstraction? This might just be my bias coming from Node.js, but I think it's a fair test of a Stream API that will be used on the web, where TCP is the standard. Here are some things that need to work 100%, assuming a Readable.pipe(Writable) method:

fastReader.pipe(slowWriter)
slowReader.pipe(fastWriter)
socket.pipe(socket) // echo server
socket.pipe(new gzipDeflate()).pipe(socket)
socket.pipe(new gzipInflate()).pipe(socket)

Node's streams, as of 0.11.5, are pretty good. However, they've evolved rather than having been intelligently designed, so in many areas, the API surface is not as elegant as it could be. In particular, I think that relying on an EventEmitter interface is an unfortunate choice that should not be repeated in this specification. The language has new features, and Promises are somewhat well-understood now (and weren't as much then). But Node streams have definitely got a lot of play-testing that we can lean on when designing something better. Calling read() repeatedly is much less convenient than doing something like `stream.on('data', doSomething)`. Additionally, you often want to spy on a Stream, and get access to its data chunks as they come in, without being the main consumer of the Stream.
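A sketch of the smallest pipe() that could pass Isaac's tests: it has to propagate backpressure in both directions. pause()/resume(), write() returning false, and the 'drain' event follow Node conventions but are used here purely for illustration:

```
// Minimal pipe() with backpressure, in the spirit of the strawman above.
function pipe(readable, writable) {
  readable.on('data', function(chunk) {
    if (!writable.write(chunk)) {      // sink buffer is over its threshold
      readable.pause();                // stop pulling from the source
      writable.once('drain', function() {
        readable.resume();             // sink caught up; open the tap again
      });
    }
  });
  readable.on('end', function() { writable.end(); });
  return writable;                     // enables socket.pipe(x).pipe(socket)
}
```

With this shape, fastReader.pipe(slowWriter) throttles the reader, and slowReader.pipe(fastWriter) simply never hits the pause branch.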
Re: Overlap between StreamReader and FileReader
On Thu, Aug 8, 2013 at 7:40 PM, Austin William Wright a...@bzfx.net wrote: On Thu, Aug 8, 2013 at 2:56 PM, Jonas Sicking jo...@sicking.cc wrote: On Thu, Aug 8, 2013 at 6:42 AM, Domenic Denicola dome...@domenicdenicola.com wrote: From: Takeshi Yoshino [mailto:tyosh...@google.com] On Thu, Aug 1, 2013 at 12:54 AM, Domenic Denicola dome...@domenicdenicola.com wrote:

Hey all, I was directed here by Anne helpfully posting to public-script-coord and es-discuss. I would love a summary of what proposal is currently under discussion: is it [1]? Or maybe some form of [2]? [1]: https://rawgithub.com/tyoshino/stream/master/streams.html [2]: http://lists.w3.org/Archives/Public/public-webapps/2013AprJun/0727.html

I'm drafting [1] based on [2] and summarizing comments on this list in order to build up a concrete algorithm and get consensus on it.

Great! Can you explain why this needs to return an AbortableProgressPromise, instead of simply a Promise? All existing stream APIs (as prototyped in Node.js and in other environments, such as in js-git's multi-platform implementation) do not signal progress or allow aborting at the "during a chunk" level, but instead count on you recording progress by yourself depending on what you've seen come in so far, and aborting on your own between chunks. This allows better pipelining and backpressure down to the network and file descriptor layer, from what I understand.

Can you explain what you mean by "This allows better pipelining and backpressure down to the network and file descriptor layer"?

I believe the term is congestion control such as the TCP congestion control algorithm. That is, don't send data to the application faster than it can parse it or pass it off, or otherwise some mechanism to allow the application to throttle down the incoming flow, essential to any networked application like the Web.

I don't think that congestion control is affected by progress notifications at all. And it is certainly not affected by whether the progress notifications fire from the Promise object or from another object. Progress notifications don't affect when or how data is being read. They only tell you about the reads that other APIs are doing.

I think there's some confusion as to what the abort() call is going to do exactly.

This is a good question. I.e. does calling abort() on a Promise returned from Stream.read() only cancel that read, or does it also cancel the whole Stream? I could definitely see that as an argument for returning ProgressPromise rather than AbortableProgressPromise from Stream.read() and instead sticking an abort() function on Stream. In any case, this seems like an orthogonal issue to progress notifications being there or not. / Jonas
Re: Overlap between StreamReader and FileReader
Jonas, What does *progress* mean here? So, you do something like this: var p = stream.read() to get a promise (of some sort). That read() operation is (if we're talking about TCP or FS) a single operation. There's no "50% of the way done reading" moment that you'd care to tap into. Even from a conceptual point of view, the data is either: a) available (and the promise is now fulfilled) b) not yet available (and the promise is not yet fulfilled) c) known to *never* be available because: i) we've reached the end of the stream (and the promise is fulfilled with some sort of EOF sentinel), or ii) because an error happened (and the promise is broken). So.. where's the progress? A single read() operation seems like it ought to be atomic to me, and indeed, the read[2] function either returns some data (a), no data (c-i), raises EWOULDBLOCK (b), or raises some other error (c-ii). But, whichever of those it does, it does right away. We only get woken up again (via epoll/kqueue/CPIO/etc) once we know that the file descriptor (or HANDLE in Windows) is readable again (and thus, it's worthwhile to attempt another read[2] operation).

Now, it *might* make sense to say that the entire Stream as a whole is a ProgressPromise of sorts. But, since you often don't know the eventual size of the data ahead of time (and indeed, it will often be unbounded), progress is an odd concept in this context. Are you proposing that every step in the TCP dance is somehow exposed on the promise returned by read()? That seems rather inconvenient and unnecessary, not to mention difficult to implement, since the TCP stack is typically in kernel space.

On Fri, Aug 9, 2013 at 11:45 AM, Jonas Sicking jo...@sicking.cc wrote: On Thu, Aug 8, 2013 at 7:40 PM, Austin William Wright a...@bzfx.net wrote: On Thu, Aug 8, 2013 at 2:56 PM, Jonas Sicking jo...@sicking.cc wrote: On Thu, Aug 8, 2013 at 6:42 AM, Domenic Denicola dome...@domenicdenicola.com wrote: From: Takeshi Yoshino [mailto:tyosh...@google.com] On Thu, Aug 1, 2013 at 12:54 AM, Domenic Denicola dome...@domenicdenicola.com wrote:

Hey all, I was directed here by Anne helpfully posting to public-script-coord and es-discuss. I would love a summary of what proposal is currently under discussion: is it [1]? Or maybe some form of [2]? [1]: https://rawgithub.com/tyoshino/stream/master/streams.html [2]: http://lists.w3.org/Archives/Public/public-webapps/2013AprJun/0727.html

I'm drafting [1] based on [2] and summarizing comments on this list in order to build up a concrete algorithm and get consensus on it.

Great! Can you explain why this needs to return an AbortableProgressPromise, instead of simply a Promise? All existing stream APIs (as prototyped in Node.js and in other environments, such as in js-git's multi-platform implementation) do not signal progress or allow aborting at the "during a chunk" level, but instead count on you recording progress by yourself depending on what you've seen come in so far, and aborting on your own between chunks. This allows better pipelining and backpressure down to the network and file descriptor layer, from what I understand.

Can you explain what you mean by "This allows better pipelining and backpressure down to the network and file descriptor layer"?

I believe the term is congestion control such as the TCP congestion control algorithm.
That is, don't send data to the application faster than it can parse it or pass it off, or otherwise some mechanism to allow the application to throttle down the incoming flow, essential to any networked application like the Web.

I don't think that congestion control is affected by progress notifications at all. And it is certainly not affected by whether the progress notifications fire from the Promise object or from another object. Progress notifications don't affect when or how data is being read. They only tell you about the reads that other APIs are doing.

I think there's some confusion as to what the abort() call is going to do exactly.

This is a good question. I.e. does calling abort() on a Promise returned from Stream.read() only cancel that read, or does it also cancel the whole Stream? I could definitely see that as an argument for returning ProgressPromise rather than AbortableProgressPromise from Stream.read() and instead sticking an abort() function on Stream. In any case, this seems like an orthogonal issue to progress notifications being there or not. / Jonas
RE: Overlap between StreamReader and FileReader
Isaac has essentially explained what I was getting at earlier, except much more clearly. When I said "this allows better pipelining and backpressure down to the network and file descriptor layer," I was essentially saying that implementing read or write operations as cancellable and incremental does not fit well with making them atomic operations that can fit into the architecture of streams with flow control. And, as Isaac again eloquently pointed out, streams without flow control are not streams at all. (You're Missing the Point of Streams, anyone? :P)

Another way of looking at it, is that a streaming API is itself incremental and cancellable. It makes no sense to say that each read from or write to the stream is *also* incremental and cancellable; why introduce another layer of entirely-unnecessary depth before you reach the atomic level of non-incremental, non-cancellable reads/writes? What use case does that serve?
RE: Overlap between StreamReader and FileReader
From: Takeshi Yoshino [mailto:tyosh...@google.com] On Thu, Aug 1, 2013 at 12:54 AM, Domenic Denicola dome...@domenicdenicola.com wrote:

Hey all, I was directed here by Anne helpfully posting to public-script-coord and es-discuss. I would love a summary of what proposal is currently under discussion: is it [1]? Or maybe some form of [2]? [1]: https://rawgithub.com/tyoshino/stream/master/streams.html [2]: http://lists.w3.org/Archives/Public/public-webapps/2013AprJun/0727.html

I'm drafting [1] based on [2] and summarizing comments on this list in order to build up a concrete algorithm and get consensus on it.

Great! Can you explain why this needs to return an AbortableProgressPromise, instead of simply a Promise? All existing stream APIs (as prototyped in Node.js and in other environments, such as in js-git's multi-platform implementation) do not signal progress or allow aborting at the "during a chunk" level, but instead count on you recording progress by yourself depending on what you've seen come in so far, and aborting on your own between chunks. This allows better pipelining and backpressure down to the network and file descriptor layer, from what I understand.
RE: Overlap between StreamReader and FileReader
From: Takeshi Yoshino [tyosh...@google.com] Sorry, which one? stream.Readable's readable event and read method? Exactly. I agree flow control is an issue not addressed well yet and needs to be fixed. I would definitely suggest thinking about it as soon as possible, since it will likely have a significant effect on the overall API. For example, all this talk of standardizing ProgressPromise (much less AbortableProgressPromise) will likely fall by the wayside once you consider how it hurts flow control.
Re: Overlap between StreamReader and FileReader
On Thu, Aug 8, 2013 at 6:42 AM, Domenic Denicola dome...@domenicdenicola.com wrote: From: Takeshi Yoshino [mailto:tyosh...@google.com] On Thu, Aug 1, 2013 at 12:54 AM, Domenic Denicola dome...@domenicdenicola.com wrote:

Hey all, I was directed here by Anne helpfully posting to public-script-coord and es-discuss. I would love a summary of what proposal is currently under discussion: is it [1]? Or maybe some form of [2]? [1]: https://rawgithub.com/tyoshino/stream/master/streams.html [2]: http://lists.w3.org/Archives/Public/public-webapps/2013AprJun/0727.html

I'm drafting [1] based on [2] and summarizing comments on this list in order to build up a concrete algorithm and get consensus on it.

Great! Can you explain why this needs to return an AbortableProgressPromise, instead of simply a Promise? All existing stream APIs (as prototyped in Node.js and in other environments, such as in js-git's multi-platform implementation) do not signal progress or allow aborting at the "during a chunk" level, but instead count on you recording progress by yourself depending on what you've seen come in so far, and aborting on your own between chunks. This allows better pipelining and backpressure down to the network and file descriptor layer, from what I understand.

Can you explain what you mean by "This allows better pipelining and backpressure down to the network and file descriptor layer"? I definitely agree that we don't want to cause too large a performance overhead. But it's not obvious to me how performance is affected by putting progress and/or aborting functionality on the returned Promise instance, rather than on a separate object (which you suggested in another thread).

We should absolutely learn from Node.js and other environments. Do you have any pointers to discussions about why they didn't end up with progress in their "read a chunk" API? / Jonas
Re: Overlap between StreamReader and FileReader
On Thu, Aug 8, 2013 at 2:56 PM, Jonas Sicking jo...@sicking.cc wrote: On Thu, Aug 8, 2013 at 6:42 AM, Domenic Denicola dome...@domenicdenicola.com wrote: From: Takeshi Yoshino [mailto:tyosh...@google.com] On Thu, Aug 1, 2013 at 12:54 AM, Domenic Denicola dome...@domenicdenicola.com wrote:

Hey all, I was directed here by Anne helpfully posting to public-script-coord and es-discuss. I would love a summary of what proposal is currently under discussion: is it [1]? Or maybe some form of [2]? [1]: https://rawgithub.com/tyoshino/stream/master/streams.html [2]: http://lists.w3.org/Archives/Public/public-webapps/2013AprJun/0727.html

I'm drafting [1] based on [2] and summarizing comments on this list in order to build up a concrete algorithm and get consensus on it.

Great! Can you explain why this needs to return an AbortableProgressPromise, instead of simply a Promise? All existing stream APIs (as prototyped in Node.js and in other environments, such as in js-git's multi-platform implementation) do not signal progress or allow aborting at the "during a chunk" level, but instead count on you recording progress by yourself depending on what you've seen come in so far, and aborting on your own between chunks. This allows better pipelining and backpressure down to the network and file descriptor layer, from what I understand.

Can you explain what you mean by "This allows better pipelining and backpressure down to the network and file descriptor layer"?

I believe the term is congestion control such as the TCP congestion control algorithm. That is, don't send data to the application faster than it can parse it or pass it off, or otherwise some mechanism to allow the application to throttle down the incoming flow, essential to any networked application like the Web. I think there's some confusion as to what the abort() call is going to do exactly.
Re: Overlap between StreamReader and FileReader
On Tue, Jul 30, 2013 at 10:27 PM, Takeshi Yoshino tyosh...@google.com wrote: On Tue, Jul 30, 2013 at 12:07 PM, Jonas Sicking jo...@sicking.cc wrote:

could contain an encoding. Then when stream.readText is called, if there's an explicit encoding, it would use that encoding when

Do you think it should also have overrideMimeType like XHR?

I think that use case is rare enough that we can solve it by letting the author create a new Stream object, which presumably would allow specifying a type for that stream, and then feed that new stream the contents of the old stream.

OK. One question about readText is what size should mean and how to handle an incomplete chunk.

a) maxSize means the size of the DOMString. readText reads data until it builds a DOMString of maxSize or EoF is reached
b) maxSize means the size of raw bytes
b-i) buffer incomplete bytes for the next read
b-ii) fail if the decoder didn't consume all read data (of maxSize bytes)

b-ii) is simple but inconvenient: users need to know in advance the number of bytes the next text data occupies. Maybe b-i), and in case read() is issued after an incomplete readText, an exception should be thrown. This is the kind of mutual exclusiveness Anne was worrying about.

This is an excellent question. I have the same reaction regarding b-ii and b-i. And I'd also lean towards b-i with the caveat that you are raising. Another issue is that it would be great to support reading null-terminated strings. I.e. rather than reading a particular size (binary or decoded size), being able to read until a null terminator is consumed. That seems like something that is likely to come up.

I do agree that having a single stream type which represents both binary streams and text streams does make things more painful. Specifically it makes the b-i solution above more painful. However having separate types for binary and text streams also creates problems. Specifically it makes it a lot harder to parse a data format which contains a combination of text and binary data. I don't see a good solution to supporting that without bringing in worse issues than the b-i issue above.

So I'm personally still leaning towards sticking both binary and text support on the same Stream interface. And then using b-i above. Reading until null termination can probably wait for now. But I'd be interested to hear counter proposals. / Jonas
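A sketch of option b-i, which both seem to lean towards: maxSize counts raw bytes, and an incomplete trailing character sequence is carried over to the next call. TextDecoder's stream mode (a real, standard API) does the carrying; stream.read(n) resolving with up to n raw bytes is an assumption:

```
// Option b-i sketched: one decoder per stream holds any split multi-byte
// sequence until the following readText call supplies the rest.
const decoder = new TextDecoder('utf-8');
async function readText(stream, maxSize) {
  const bytes = await stream.read(maxSize);   // up to maxSize raw bytes
  // stream: true holds back a trailing incomplete sequence instead of
  // emitting a replacement character.
  return decoder.decode(bytes, { stream: true });
}
```

The caveat Takeshi raises shows up here as decoder state: a plain binary read() issued while the decoder still holds bytes would silently skip them, which is why an exception on that path seems warranted.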
Re: Overlap between StreamReader and FileReader
Hey all, I was directed here by Anne helpfully posting to public-script-coord and es-discuss. I would love a summary of what proposal is currently under discussion: is it [1]? Or maybe some form of [2]? [1]: https://rawgithub.com/tyoshino/stream/master/streams.html [2]: http://lists.w3.org/Archives/Public/public-webapps/2013AprJun/0727.html
RE: Overlap between StreamReader and FileReader
From: Anne van Kesteren [ann...@annevk.nl] Stream.prototype.readType takes an enumerated string value which is arraybuffer (default) or text. Stream.prototype.read returns a promise fulfilled with the type of value requested. I believe this is somewhat similar to how Node streams have settled. Their API is that you call `stream.setEncoding('utf-8')` and then calling `.read(n)` will return a string of at most n characters. By default, there is no encoding set, and calling `.read(n)` will return n bytes in a buffer. In this way, the encoding is a stateful aspect of the stream itself. I don't think there's a way to get around this, without ending up with dangling half-character bytes hanging around.
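In use, the Node pattern Domenic describes looks like this (getReadableStreamSomehow is a placeholder for any Node readable; the at-most-n-characters behavior is as described above):

```
// Node's stateful encoding: setEncoding flips the stream into string mode.
var stream = getReadableStreamSomehow();
stream.setEncoding('utf-8');   // from here on, read() returns strings
var text = stream.read(10);    // at most 10 characters, or null if not ready
// Without the setEncoding call, the same read(10) returns a Buffer of bytes.
```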
Re: Overlap between StreamReader and FileReader
On Wed, Jul 31, 2013 at 5:03 PM, Domenic Denicola dome...@domenicdenicola.com wrote: In this way, the encoding is a stateful aspect of the stream itself. I don't think there's a way to get around this, without ending up with dangling half-character bytes hanging around. It seems though that if you can change the way bytes are consumed while reading a stream you will end up with problematic scenarios. E.g. you consume 2 bytes of a 4-byte utf-8 sequence. Then switch to reading code points... Instantiating a ByteStream or TextStream in advance would address that. -- http://annevankesteren.nl/
RE: Overlap between StreamReader and FileReader
From: Anne van Kesteren [ann...@annevk.nl] It seems though that if you can change the way bytes are consumed while reading a stream you will end up with problematic scenarios. E.g. you consume 2 bytes of a 4-byte utf-8 sequence. Then switch to reading code points... Instantiating a ByteStream or TextStream in advance would address that. Yes, and I think I would actually prefer such an API honestly. But IIRC Jonas earlier wanted to be able to do both binary and text in the same stream (did he have a specific use case?), and presumably that motivated Node's design as well. I guess you can just say that if you're in binary mode, you should know what you're doing, and know precisely when is the correct time to switch to string mode. If you switch in the middle of a four-byte sequence, you presumably meant to do so, and deserve to get back the mangled characters that result. To make this work might require some kind of put the bytes back primitive, to avoid a situation where you read too far in binary mode and want to back up a bit before you engage string mode. I guess this is Node.js's [unshift][1]. It would be cool to avoid all this though and just read either bytes or strings, without allowing switching. (Maybe, feed the byte stream into a string decoder transform, and get back a string stream?) [1]: http://nodejs.org/api/stream.html#stream_readable_unshift_chunk
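A sketch of that put-the-bytes-back maneuver using Node's real unshift() and setEncoding(); the NUL-terminated header framing and handleHeader are invented for illustration:

```
// Read a binary header, unshift the surplus, then switch to text mode.
stream.on('readable', function onHeader() {
  var chunk = stream.read();
  if (chunk === null) return;                 // nothing buffered yet
  var end = chunk.indexOf(0);                 // look for the NUL terminator
  if (end === -1) { stream.unshift(chunk); return; } // keep waiting for it
  handleHeader(chunk.slice(0, end));
  stream.unshift(chunk.slice(end + 1));       // put the body bytes back
  stream.removeListener('readable', onHeader);
  stream.setEncoding('utf-8');                // body is consumed as text
});
```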
Re: Overlap between StreamReader and FileReader
On Wed, Jul 31, 2013 at 10:17 AM, Domenic Denicola dome...@domenicdenicola.com wrote: From: Anne van Kesteren [ann...@annevk.nl]

It seems though that if you can change the way bytes are consumed while reading a stream you will end up with problematic scenarios. E.g. you consume 2 bytes of a 4-byte utf-8 sequence. Then switch to reading code points... Instantiating a ByteStream or TextStream in advance would address that.

Yes, and I think I would actually prefer such an API honestly. But IIRC Jonas earlier wanted to be able to do both binary and text in the same stream (did he have a specific use case?), and presumably that motivated Node's design as well.

I don't have very concrete use-cases in mind. But basically consumption of any format that contains both textual and binary data. If we don't think the world contains enough such formats to worry about, then maybe my use case isn't strong enough. I think both pdf and various Microsoft document formats fall into this category though.

I guess you can just say that if you're in binary mode, you should know what you're doing, and know precisely when is the correct time to switch to string mode. If you switch in the middle of a four-byte sequence, you presumably meant to do so, and deserve to get back the mangled characters that result. To make this work might require some kind of put the bytes back primitive, to avoid a situation where you read too far in binary mode and want to back up a bit before you engage string mode. I guess this is Node.js's [unshift][1].

Note that the "read too far" issue isn't text-specific. When consuming any format which uses a terminator (null or any more complicated pattern) you will have to consume in minimal chunks, often byte-by-byte, to make sure you don't go past that terminator.

It would be cool to avoid all this though and just read either bytes or strings, without allowing switching. (Maybe, feed the byte stream into a string decoder transform, and get back a string stream?)

Being able to convert between text and binary streams does work well when the whole stream is either textual or binary. It's not clear to me how to do it if you are dealing with a stream that contains both. Though I'd be interested to see proposals. / Jonas
Re: Overlap between StreamReader and FileReader
I quickly read the thread, but it seems like this is exactly the issue I had doing [1]. The use case was just decoding utf-8 html chunked buffers and modifying the content on the fly to stream it somewhere else. It had to work inside browsers and with node (which as far as I know does not handle this case correctly, but I did not check the latest evolutions). The solution was [2], TextEncoder/Decoder with a super useful streaming option. [1] https://www.github.com/Ayms/node-Tor [2] http://code.google.com/p/stringencoding/

Regards

Aymeric

Le 31/07/2013 21:20, Jonas Sicking a écrit : On Wed, Jul 31, 2013 at 10:17 AM, Domenic Denicola dome...@domenicdenicola.com wrote: From: Anne van Kesteren [ann...@annevk.nl]

It seems though that if you can change the way bytes are consumed while reading a stream you will end up with problematic scenarios. E.g. you consume 2 bytes of a 4-byte utf-8 sequence. Then switch to reading code points... Instantiating a ByteStream or TextStream in advance would address that.

Yes, and I think I would actually prefer such an API honestly. But IIRC Jonas earlier wanted to be able to do both binary and text in the same stream (did he have a specific use case?), and presumably that motivated Node's design as well.

I don't have very concrete use-cases in mind. But basically consumption of any format that contains both textual and binary data. If we don't think the world contains enough such formats to worry about, then maybe my use case isn't strong enough. I think both pdf and various Microsoft document formats fall into this category though.

I guess you can just say that if you're in binary mode, you should know what you're doing, and know precisely when is the correct time to switch to string mode. If you switch in the middle of a four-byte sequence, you presumably meant to do so, and deserve to get back the mangled characters that result. To make this work might require some kind of put the bytes back primitive, to avoid a situation where you read too far in binary mode and want to back up a bit before you engage string mode. I guess this is Node.js's [unshift][1].

Note that the "read too far" issue isn't text-specific. When consuming any format which uses a terminator (null or any more complicated pattern) you will have to consume in minimal chunks, often byte-by-byte, to make sure you don't go past that terminator.

It would be cool to avoid all this though and just read either bytes or strings, without allowing switching. (Maybe, feed the byte stream into a string decoder transform, and get back a string stream?)

Being able to convert between text and binary streams does work well when the whole stream is either textual or binary. It's not clear to me how to do it if you are dealing with a stream that contains both. Though I'd be interested to see proposals. / Jonas

-- jCore Email : avi...@jcore.fr iAnonym : http://www.ianonym.com node-Tor : https://www.github.com/Ayms/node-Tor GitHub : https://www.github.com/Ayms Web : www.jcore.fr Extract Widget Mobile : www.extractwidget.com BlimpMe! : www.blimpme.com
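For reference, the streaming option Aymeric points to in [2] became the standard TextDecoder API. A sketch of the chunked utf-8 decoding use case he describes (incomingChunks is a placeholder for the chunk source):

```
// Decode utf-8 across arbitrary chunk boundaries without ever splitting
// a multi-byte sequence. This is the now-standard TextDecoder behavior.
const decoder = new TextDecoder('utf-8');
let html = '';
for (const chunk of incomingChunks) {
  html += decoder.decode(chunk, { stream: true });  // buffers partial chars
}
html += decoder.decode();   // final flush: emits anything still pending
```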
Re: Overlap between StreamReader and FileReader
Couldn't we simply let the Stream class have a content type, which could contain an encoding. Then when stream.readText is called, if there's an explicit encoding, it would use that encoding when converting to text. / Jonas

On Mon, Jul 29, 2013 at 6:38 AM, Takeshi Yoshino tyosh...@google.com wrote: On Thu, Jul 18, 2013 at 7:22 AM, Jonas Sicking jo...@sicking.cc wrote: On Wed, Jul 17, 2013 at 11:46 AM, Anne van Kesteren ann...@annevk.nl wrote: On Wed, Jul 17, 2013 at 11:05 AM, Jonas Sicking jo...@sicking.cc wrote:

What do you mean by such features? Are you saying that a Stream zip decompressor should be responsible for both decompressing as well as binary-text conversion? And thus output something other than a Stream?

I meant that for specialized processing you'd likely want more than just decoding. You mentioned HTML parsing which requires a fair amount more.

I don't think you want an HTML parser to do both decoding and parsing. That would result in a lot of code duplication in each component that deals with textual formats.

And if it's just decoding, we could extend TextEncoder/TextDecoder to work with Stream objects.

Sure, we can do that. The question is, what is the output from the TextDecoder if you pass it a Stream? A new TextStream type? Is that really better than adding the text-consuming functions to Stream?

We could introduce interfaces TextStream (readAsText) and BinaryStream (readAsArrayBuffer) just representing what type of data can be consumed from them. But for convenience, I'd like the output of XHR to have both. Stream should carry raw binary, charset and MIME, and present them in convenient form (methods) to the user. We can define convenience classes like this more generally:

- TextStreamWithOptionalTextEncoder
- BinaryStreamWithOptionalTextDecoder

There's either raw binary or text data behind it and it does decoding/encoding when necessary. What we're currently calling Stream and going to use for XHR is BinaryStreamWithOptionalTextDecoder. TextEncoder may be defined to accept TextStream and output a BinaryStream. TextDecoder may be defined to accept BinaryStream and output a TextStream.
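A sketch of what a readText() driven by a content type carried on the stream could look like, assuming the type attribute from Feras's proposal that Takeshi mentions later in the thread; the charset extraction is simplified and the whole API is hypothetical:

```
// Fall back to the stream's own charset when readText gets no encoding.
function charsetOf(stream) {
  // e.g. "text/html;charset=shift_jis" -> "shift_jis"; a real
  // implementation would use a proper MIME type parser.
  var m = /;\s*charset=([^;]+)/i.exec(stream.type || '');
  return m ? m[1].trim() : 'utf-8';
}
function readTextUsingStreamType(stream, size) {
  return stream.read(size).then(function(bytes) {
    return new TextDecoder(charsetOf(stream)).decode(bytes);
  });
}
```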
Re: Overlap between StreamReader and FileReader
On Mon, Jul 29, 2013 at 1:16 PM, Jonas Sicking jo...@sicking.cc wrote: Couldn't we simply let the Stream class have a content type, which could contain an encoding. Then when stream.readText is called, if there's an explicit encoding, it would use that encoding when converting to text. How about we use what XMLHttpRequest and WebSocket have? Stream.prototype.readType takes an enumerated string value which is arraybuffer (default) or text. Stream.prototype.read returns a promise fulfilled with the type of value requested. -- http://annevankesteren.nl/
Re: Overlap between StreamReader and FileReader
On Mon, Jul 29, 2013 at 3:20 PM, Anne van Kesteren ann...@annevk.nl wrote: On Mon, Jul 29, 2013 at 1:16 PM, Jonas Sicking jo...@sicking.cc wrote:

Couldn't we simply let the Stream class have a content type, which could contain an encoding. Then when stream.readText is called, if there's an explicit encoding, it would use that encoding when converting to text.

How about we use what XMLHttpRequest and WebSocket have? Stream.prototype.readType takes an enumerated string value which is arraybuffer (default) or text. Stream.prototype.read returns a promise fulfilled with the type of value requested.

I'm not sure that comparisons with XHR really work since XHR.responseType affects the parsing behavior, not the decoding behavior. And with WebSocket what you control isn't the result of an operation, but rather the contents of future events. So additional arguments or separate signatures aren't really an option there.

I still think that your proposal works. But I don't quite see the advantage of it. Seems like it simply breaks out one of the arguments from the read function and passes it through state. Is the problem you are trying to solve having shorter function names? / Jonas
Re: Overlap between StreamReader and FileReader
On Mon, Jul 29, 2013 at 4:13 PM, Jonas Sicking jo...@sicking.cc wrote: On Mon, Jul 29, 2013 at 3:20 PM, Anne van Kesteren ann...@annevk.nl wrote:

How about we use what XMLHttpRequest and WebSocket have? Stream.prototype.readType takes an enumerated string value which is arraybuffer (default) or text. Stream.prototype.read returns a promise fulfilled with the type of value requested.

I'm not sure that comparisons with XHR really work since XHR.responseType affects the parsing behavior, not the decoding behavior. And with WebSocket what you control isn't the result of an operation, but rather the contents of future events. So additional arguments or separate signatures aren't really an option there. I still think that your proposal works. But I don't quite see the advantage of it. Seems like it simply breaks out one of the arguments from the read function and passes it through state. Is the problem you are trying to solve having shorter function names?

I'm not a big fan of having mutually exclusive accessors for data; passing it as an argument could work too, but given that you want to read multiple times, that does not seem super convenient. -- http://annevankesteren.nl/
Re: Overlap between StreamReader and FileReader
On Mon, Jul 29, 2013 at 5:37 PM, Anne van Kesteren ann...@annevk.nl wrote: On Mon, Jul 29, 2013 at 4:13 PM, Jonas Sicking jo...@sicking.cc wrote: On Mon, Jul 29, 2013 at 3:20 PM, Anne van Kesteren ann...@annevk.nl wrote:

How about we use what XMLHttpRequest and WebSocket have? Stream.prototype.readType takes an enumerated string value which is arraybuffer (default) or text. Stream.prototype.read returns a promise fulfilled with the type of value requested.

I'm not sure that comparisons with XHR really work since XHR.responseType affects the parsing behavior, not the decoding behavior. And with WebSocket what you control isn't the result of an operation, but rather the contents of future events. So additional arguments or separate signatures aren't really an option there. I still think that your proposal works. But I don't quite see the advantage of it. Seems like it simply breaks out one of the arguments from the read function and passes it through state. Is the problem you are trying to solve having shorter function names?

I'm not a big fan of having mutually exclusive accessors for data; passing it as an argument could work too, but given that you want to read multiple times, that does not seem super convenient.

I'm not sure that there's anything mutually exclusive here? Other than that data that .read has consumed can't be consumed by .readText. But that's an effect of the fact that .read/.readText throw away the data after it has been consumed, rather than an effect of having two different ways of consuming data. I.e. .readText is as exclusive to .read as .read is to itself. / Jonas
Re: Overlap between StreamReader and FileReader
On Jul 29, 2013 7:53 PM, Takeshi Yoshino tyosh...@google.com wrote: On Tue, Jul 30, 2013 at 5:16 AM, Jonas Sicking jo...@sicking.cc wrote:

Couldn't we simply let the Stream class have a content type, which

That's what I meant. In Feras's proposal Stream has a type attribute. I copied it to my draft. read(As)Text would use it.

Sounds good to me.

could contain an encoding. Then when stream.readText is called, if there's an explicit encoding, it would use that encoding when

Do you think it should also have overrideMimeType like XHR?

I think that use case is rare enough that we can solve it by letting the author create a new Stream object, which presumably would allow specifying a type for that stream, and then feed that new stream the contents of the old stream. / Jonas
Re: Overlap between StreamReader and FileReader
On Wed, Jul 10, 2013 at 7:02 AM, Anne van Kesteren ann...@annevk.nl wrote: On Tue, Jul 2, 2013 at 12:21 AM, Takeshi Yoshino tyosh...@google.com wrote:

What I have in my mind is like this:

```
if (this.readyState == this.LOADING) {
  stream = xhr.response;
  // XHR has already written some data x0 to stream
  stream.read().progress(progressHandler);
}
...loop...
// XHR writes data x1 to stream
// XHR writes data x2 to stream
// XHR finishes writing to stream
```

progressHandler continues receiving data till EOF. For this read() call without maxSize, all of x0, x1 and x2 will be passed to progressHandler.

I see. I kinda thought that if you omitted size it would just give you everything in the stream's buffer and not everything until end-of-stream. If you just want to be notified about data as it comes in you can use read*Chunked() in my proposal. Polling data from the buffer seems less useful. Do we even need that? It seems just passing ArrayBuffer in and out could be sufficient for now?

As one of read()'s arguments? As for what it would return. Or do we have use cases where decoding to strings and/or Blobs is important?

Reading any format that contains textual data. I.e. things like HTML, OpenDocument, pdf, etc. While many of those are compressed, it seems likely that you could pass a stream through a decompressor which produces a decompressed stream. / Jonas
Re: Overlap between StreamReader and FileReader
On Tue, Jul 16, 2013 at 11:10 PM, Jonas Sicking jo...@sicking.cc wrote: Reading any format that contains textual data. I.e. things like HTML, OpenDocument, pdf, etc. While many of those are compressed, it seems likely that you could pass a stream through a decompressor which produces a decompressed stream. Yeah, extending APIs for such features to support streams seems better than adding support for all of them on Stream. Letting Stream just be a low-level primitive for a stream of bytes seems good enough. -- http://annevankesteren.nl/
Re: Overlap between StreamReader and FileReader
On Wed, Jul 17, 2013 at 10:47 AM, Anne van Kesteren ann...@annevk.nl wrote: On Tue, Jul 16, 2013 at 11:10 PM, Jonas Sicking jo...@sicking.cc wrote: Reading any format that contains textual data. I.e. things like HTML, OpenDocument, pdf, etc. While many of those are compressed, it seems likely that you could pass a stream through a decompressor which produces a decompressed stream. Yeah, extending APIs for such features to support streams seems better than adding support for all of them on Stream. Letting Stream just be a low-level primitive for a stream of bytes seems good enough. What do you mean by such features? Are you saying that a Stream zip decompressor should be responsible for both decompressing as well as binary-text conversion? And thus output something other than a Stream? / Jonas
Re: Overlap between StreamReader and FileReader
On Wed, Jul 17, 2013 at 11:05 AM, Jonas Sicking jo...@sicking.cc wrote: What do you mean by such features? Are you saying that a Stream zip decompressor should be responsible for both decompressing as well as binary-text conversion? And thus output something other than a Stream? I meant that for specialized processing you'd likely want more than just decoding. You mentioned HTML parsing which requires a fair amount more. And if it's just decoding, we could extend TextEncoder/TextDecoder to work with Stream objects. -- http://annevankesteren.nl/
Re: Overlap between StreamReader and FileReader
On Wed, Jul 17, 2013 at 11:46 AM, Anne van Kesteren ann...@annevk.nl wrote: On Wed, Jul 17, 2013 at 11:05 AM, Jonas Sicking jo...@sicking.cc wrote:

What do you mean by such features? Are you saying that a Stream zip decompressor should be responsible for both decompressing as well as binary-text conversion? And thus output something other than a Stream?

I meant that for specialized processing you'd likely want more than just decoding. You mentioned HTML parsing which requires a fair amount more.

I don't think you want an HTML parser to do both decoding and parsing. That would result in a lot of code duplication in each component that deals with textual formats.

And if it's just decoding, we could extend TextEncoder/TextDecoder to work with Stream objects.

Sure, we can do that. The question is, what is the output from the TextDecoder if you pass it a Stream? A new TextStream type? Is that really better than adding the text-consuming functions to Stream? / Jonas
Re: Overlap between StreamReader and FileReader
On Tue, Jul 2, 2013 at 12:21 AM, Takeshi Yoshino tyosh...@google.com wrote:

What I have in my mind is like this:

```
if (this.readyState == this.LOADING) {
  stream = xhr.response;
  // XHR has already written some data x0 to stream
  stream.read().progress(progressHandler);
}
...loop...
// XHR writes data x1 to stream
// XHR writes data x2 to stream
// XHR finishes writing to stream
```

progressHandler continues receiving data till EOF. For this read() call without maxSize, all of x0, x1 and x2 will be passed to progressHandler.

I see. I kinda thought that if you omitted size it would just give you everything in the stream's buffer and not everything until end-of-stream.

Do we even need that? It seems just passing ArrayBuffer in and out could be sufficient for now?

As one of read()'s arguments? As for what it would return. Or do we have use cases where decoding to strings and/or Blobs is important?

What's pending read resolvers?

When any error occurs the stream needs to reject pending promises. So, I prepared that list but I haven't written any text for error handling yet.

Okay. -- http://annevankesteren.nl/
Re: Overlap between StreamReader and FileReader
On Mon, Jul 1, 2013 at 9:03 AM, Takeshi Yoshino tyosh...@google.com wrote:

Moved to github. https://github.com/tyoshino/stream/blob/master/streams.html http://htmlpreview.github.io/?https://github.com/tyoshino/stream/blob/master/streams.html

Why would it be neutered if size is not given?

When size is not given, we need to mark it fully read by using something else. I changed to use read position == -1.

I'm not sure I follow. Isn't the maxSize argument optional so you can read all the data queued up thus far? It seems that should just work and not prevent more data queued in the future from being read from the stream. (Later on in the algorithm it seems this is acknowledged, but at that point the stream is already neutered.)

I think you need to define the stream buffer somewhat more explicitly so that only what you decide to read from the buffer ends up in the ArrayBuffer and newly queued data while that is happening is not.

Do you want the FIFO model to be emphasized?

It doesn't need emphasis; it just needs to be clear.

Probably defining Stream conceptually and defining read() (I don't think we should call it readAsArrayBuffer) in terms of those concepts

You mean that something similar to XHR's responseType is preferred?

Do we even need that? It seems just passing ArrayBuffer in and out could be sufficient for now?

What's pending read resolvers? -- http://annevankesteren.nl/
Re: Overlap between StreamReader and FileReader
On Wed, Jun 26, 2013 at 6:48 AM, Takeshi Yoshino tyosh...@google.com wrote:

I wrote a strawman spec for Stream.readAsArrayBuffer. Comment please.

Calling the stream-associated concepts the same as the variables in the algorithm is somewhat confusing (read_position vs read_position).

4. If called with the optional size, set the read_position of stream to read_position + size.
5. Otherwise, neuter the stream.

Why would it be neutered if size is not given?

7. Read data from stream from read_position up to size bytes, or all data if size is not specified.
8. As data from the stream becomes available, do the following,

I think you need to define the stream buffer somewhat more explicitly so that only what you decide to read from the buffer ends up in the ArrayBuffer and newly queued data while that is happening is not. Probably defining Stream conceptually and defining read() (I don't think we should call it readAsArrayBuffer) in terms of those concepts is better. E.g. similar to how http://url.spec.whatwg.org/ has a model and an API part. -- http://annevankesteren.nl/
Re: Overlap between StreamReader and FileReader
On Sat, May 18, 2013 at 1:38 PM, Jonas Sicking jo...@sicking.cc wrote:
> For File reading I would now instead do something like
>
> partial interface Blob {
>   AbortableProgressFuture<ArrayBuffer> readBinary(BlobReadParams);
>   AbortableProgressFuture<DOMString> readText(BlobReadTextParams);
>   Stream readStream(BlobReadParams);

I'd name it asStream. The readStream operation here isn't intended to do any read, i.e. moving data between buffers (like ArrayBufferView for ArrayBuffer), right? Or is it going to clone the Blob's contents and wrap them with the Stream interface, since we cannot discard the contents of a Blob and that would be inconsistent with the semantics (implication?) we're going to give to the Stream interface?
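A guess at the asStream/readStream semantics being asked about, i.e. wrapping without copying (nothing here is implemented anywhere; readStream, readBinaryChunked and digestUpdate are all assumed names):

var blob = new Blob(['0123456789abcdef']);
// Proposed: wrap the Blob's contents lazily, copying nothing up front.
var s = blob.readStream({ start: 0, length: blob.size });
s.readBinaryChunked().then(null, null, function(chunk) {
  digestUpdate(chunk);  // chunks produced on demand from the Blob's store
});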
Re: Overlap between StreamReader and FileReader
On Sat, May 18, 2013 at 1:56 PM, Jonas Sicking jo...@sicking.cc wrote:
> On Fri, May 17, 2013 at 9:38 PM, Jonas Sicking jo...@sicking.cc wrote:
> > For Stream reading, I think I would do something like the following:
> >
> > interface Stream {
> >   AbortableProgressFuture<ArrayBuffer> readBinary(optional unsigned long long size);
> >   AbortableProgressFuture<String> readText(optional unsigned long long size, optional DOMString encoding);
> >   AbortableProgressFuture<Blob> readBlob(optional unsigned long long size);
> >   ChunkedData readBinaryChunked(optional unsigned long long size);
> >   ChunkedData readTextChunked(optional unsigned long long size);
> > };
> >
> > interface ChunkedData : EventTarget {
> >   attribute EventHandler ondata;
> >   attribute EventHandler onload;
> >   attribute EventHandler onerror;
> > };
>
> Actually, we could even get rid of the ChunkedData interface and do
> something like
>
> interface Stream {
>   AbortableProgressFuture<ArrayBuffer> readBinary(optional unsigned long long size);
>   AbortableProgressFuture<String> readText(optional unsigned long long size, optional DOMString encoding);
>   AbortableProgressFuture<Blob> readBlob(optional unsigned long long size);
>   AbortableProgressFuture<void> readBinaryChunked(optional unsigned long long size);
>   AbortableProgressFuture<void> readTextChunked(optional unsigned long long size);
> };
>
> where the ProgressFutures returned from readBinaryChunked/readTextChunked
> deliver the data in the progress notifications only, and no data is
> delivered when the future is actually resolved. Though this might be
> abusing Futures a bit?

This is also a clearly read-only-once interface, like the onmessage() approach, because there's no attribute that accumulates the result value. The fact that the argument of the accept callback is void makes it clear, at least to me, that the value passed to the progress callback is not an accumulated result but each chunk separately.

As the state transitions of Stream would be simple enough to match Future, I think it's technically OK, and even better, to employ it rather than the readyState + callback approach. But is everyone fine with making it mandatory to get used to programming with Futures in order to use Stream?
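A usage sketch of the progress-only variant, assuming AbortableProgressFuture exposes a then(accept, reject, progress) signature (that shape is a guess; reportError and process are hypothetical):

stream.readBinaryChunked().then(
  function() { /* resolved: done, but no accumulated value to hand over */ },
  function(err) { reportError(err); },   // hypothetical error reporter
  function(chunk) { process(chunk); }    // each chunk is delivered here once
);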
Re: Overlap between StreamReader and FileReader
On Sat, May 18, 2013 at 5:56 AM, Jonas Sicking jo...@sicking.cc wrote:
> where the ProgressFutures returned from readBinaryChunked/readTextChunked
> deliver the data in the progress notifications only, and no data is
> delivered when the future is actually resolved. Though this might be
> abusing Futures a bit?

Yeah, futures represent a value. This is an event stream (that does not keep track of history).

-- 
http://annevankesteren.nl/
Re: Overlap between StreamReader and FileReader
On Sat, May 18, 2013 at 7:36 AM, Anne van Kesteren ann...@annevk.nl wrote:
> On Sat, May 18, 2013 at 5:56 AM, Jonas Sicking jo...@sicking.cc wrote:
> > where the ProgressFutures returned from readBinaryChunked/readTextChunked
> > deliver the data in the progress notifications only, and no data is
> > delivered when the future is actually resolved. Though this might be
> > abusing Futures a bit?
>
> Yeah, futures represent a value. This is an event stream (that does not
> keep track of history).

It's not exactly an event stream, since the exact events aren't what matters here. I.e. you'll get different events in different implementations, and there are no guarantees that the events themselves will be meaningful. But yeah, I agree it's not representing a value, and so it's an abuse of Future's semantics.

/ Jonas
Re: Overlap between StreamReader and FileReader
On Thu, May 16, 2013 at 10:14 PM, Takeshi Yoshino tyosh...@google.com wrote:
> I skimmed the thread before starting this and saw that you were pointing
> out some issues, but didn't think you were opposing so much.

Well, yes. I removed integration from XMLHttpRequest a while back too.

> Let me check requirements.
> d) The I/O API needs to work with synchronous XHR.

I'm not sure this is a requirement. In particular in light of http://infrequently.org/2013/05/the-case-against-synchronous-worker-apis-2/ and synchronous being worker-only, it's not entirely clear to me this needs to be a requirement from the get-go.

> e) Resources for already processed data should be able to be released
> explicitly, as the user instructs.

Can't this happen transparently?

> g) The I/O API should allow for skipping unnecessary data without
> creating a new object for that.

This would be equivalent to reading and discarding?

> Not a requirement:
> h) Some people wanted Stream to behave not like an object that stores the
> data, but like a kind of dam put between the response attribute and XHR's
> internal buffer (and network stack), expecting that XHR doesn't consume
> data from the network until a read operation is invoked on the Stream
> object (i.e. Stream controls data flow in addition to callback invocation
> timing). But it's no longer considered to be a requirement.

I'm not sure what this means. It sounds like something that indeed should be transparent from an API point-of-view, but it's hard to tell.

We also need to decide whether a stream supports multiple readers, or whether you need to explicitly clone a stream somehow. And as far as the API goes, we should study existing libraries.

-- 
http://annevankesteren.nl/
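If (g) does reduce to reading and discarding, a skip helper could be layered on the chunked read (a sketch against the hypothetical API discussed in this thread):

function skipBytes(stream, n) {
  return stream.readBinaryChunked(n).then(
    function() { /* n bytes consumed and dropped */ },
    null,
    function(chunk) { /* intentionally ignore each chunk */ }
  );
}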
Re: Overlap between StreamReader and FileReader
On Thu, May 16, 2013 at 8:26 PM, Feras Moussa feras.mou...@hotmail.com wrote:
> Can you please go into a bit more detail? I've read through the thread,
> and it mostly focuses on the details of how a Stream is received from XHR
> and what behaviors can be expected - it only lightly touches on how you
> can operate on a stream after it is received.

The main problem is that Stream per the Streams API is not what you expect from an IO stream, but is more what Blob should've been (a Blob without synchronous size). What we want, I think, is a real IO stream. Whether we also need a Blob without synchronous size is less clear to me.

> I do agree the API should allow for scenarios where data can be
> discarded, given that is an advantage of a Stream over a Blob.

It does not seem to do that currently though. It's also not clear to me that we want to allow multiple readers by default.

> That said, Anne, what is your suggestion for how Streams can be consumed?

I don't have one yet.

-- 
http://annevankesteren.nl/
Re: Overlap between StreamReader and FileReader
Sorry, I just took over this work and so was misunderstanding some points in the Streams API spec.

On Fri, May 17, 2013 at 6:09 PM, Anne van Kesteren ann...@annevk.nl wrote:
> On Thu, May 16, 2013 at 10:14 PM, Takeshi Yoshino tyosh...@google.com wrote:
> > I skimmed the thread before starting this and saw that you were pointing
> > out some issues, but didn't think you were opposing so much.
>
> Well, yes. I removed integration from XMLHttpRequest a while back too.
>
> > Let me check requirements.
> > d) The I/O API needs to work with synchronous XHR.
>
> I'm not sure this is a requirement. In particular in light of
> http://infrequently.org/2013/05/the-case-against-synchronous-worker-apis-2/
> and synchronous being worker-only, it's not entirely clear to me this
> needs to be a requirement from the get-go.
>
> > e) Resources for already processed data should be able to be released
> > explicitly, as the user instructs.
>
> Can't this happen transparently?

Yes. The "read data is automatically released" model is simple and good. I thought the spec was clear about this, but sorry, it isn't. In the spec we should say that StreamReader invalidates consumed data in the Stream, and the buffer for the invalidated bytes will be released at that point. Right?

> > g) The I/O API should allow for skipping unnecessary data without
> > creating a new object for that.
>
> This would be equivalent to reading and discarding?

I wanted to understand clearly what you meant by "discard" in your posts. I wondered if you were suggesting that we have some method to skip incoming data without creating any object holding the received data. I.e. something like

s.skip(10);
s.readFrom(10);

not like

var useless_data_at_head_remaining = 256;
ondata = function(evt) {
  var bytes_received = evt.data.size();
  if (useless_data_at_head_remaining > bytes_received) {
    useless_data_at_head_remaining -= bytes_received;
    return;
  }
  processUsefulData(evt.data.slice(useless_data_at_head_remaining));
  useless_data_at_head_remaining = 0;
};

If you meant the latter, I'm OK. I'd also call the latter "reading and discarding".

> > Not a requirement:
> > h) Some people wanted Stream to behave not like an object that stores
> > the data, but like a kind of dam put between the response attribute and
> > XHR's internal buffer (and network stack), expecting that XHR doesn't
> > consume data from the network until a read operation is invoked on the
> > Stream object (i.e. Stream controls data flow in addition to callback
> > invocation timing). But it's no longer considered to be a requirement.
>
> I'm not sure what this means. It sounds like something that indeed should
> be transparent from an API point-of-view, but it's hard to tell.

In the thread, Glenn was discussing what's a consumer and what's a producer, IIRC. I supposed that the idea behind Stream is providing a flow control interface for XHR's internal buffer. When the internal buffer is full, XHR stops reading data from the network (e.g. a BSD socket). The buffer will be drained when, and only when, a read operation is made on the Stream object.

Stream has infinite length, but shouldn't have infinite capacity. It'll swell up if the consumer (e.g. a media stream?) is slow. Of course, browsers would set some limit, but that should be well discussed in the spec. Unless the limit is visible to scripts, they cannot know whether they can watch only the load event, or need to handle the progress event and consume arrived data progressively to process all data (see the sketch after this message).

> We also need to decide whether a stream supports multiple readers, or
> whether you need to explicitly clone a stream somehow. And as far as the
> API goes, we should study existing libraries.

What use cases do you have in your mind?
Your example in the thread was passing one to <video> but also accessing it manually using StreamReader. I think it's unknown at what timing and in what amounts <video> consumes data from the Stream, so it's really hard for the script to make such coordination successful. Are you thinking of a use case like mixing chat data and video contents in the same HTTP response body?
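A sketch of the producer-side throttling described above; bufferedAmount, ondrain, pause() and resume() are all invented names for illustration:

var HIGH_WATER_MARK = 1024 * 1024;   // hypothetical per-stream capacity
function maybeThrottle(source, stream) {
  if (stream.bufferedAmount >= HIGH_WATER_MARK) {  // invented attribute
    source.pause();                  // stop pulling from the network
    stream.ondrain = function() {    // invented event: reader caught up
      source.resume();
    };
  }
}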
Re: Overlap between StreamReader and FileReader
On Fri, May 17, 2013 at 12:09 PM, Takeshi Yoshino tyosh...@google.com wrote:
> I thought the spec was clear about this, but sorry, it isn't. In the spec
> we should say that StreamReader invalidates consumed data in the Stream,
> and the buffer for the invalidated bytes will be released at that point.
> Right?

I'm glad we're all getting on the same page now. I think there might be use cases for a Blob without size (i.e. where you do not discard the data after consuming), which is what Stream seems to be today, but I'm not sure we should call that Stream. And I think for XMLHttpRequest at least we want an API where data can be discarded once processed, so you do not have to keep multi-megabyte sound files on disk if all you want is to provide a (potentially post-processed) live stream.

> I wanted to understand clearly what you meant by "discard" in your posts.
> I wondered if you were suggesting that we have some method to skip
> incoming data without creating any object holding the received data. I.e.
> something like
>
> s.skip(10);
> s.readFrom(10);
>
> not like
>
> var useless_data_at_head_remaining = 256;
> ondata = function(evt) {
>   var bytes_received = evt.data.size();
>   if (useless_data_at_head_remaining > bytes_received) {
>     useless_data_at_head_remaining -= bytes_received;
>     return;
>   }
>   processUsefulData(evt.data.slice(useless_data_at_head_remaining));
>   useless_data_at_head_remaining = 0;
> };
>
> If you meant the latter, I'm OK. I'd also call the latter "reading and
> discarding".

Yeah, that seems about right.

> What use cases do you have in your mind? Your example in the thread was
> passing one to <video> but also accessing it manually using StreamReader.
> I think it's unknown at what timing and in what amounts <video> consumes
> data from the Stream, so it's really hard for the script to make such
> coordination successful. Are you thinking of a use case like mixing chat
> data and video contents in the same HTTP response body?

I haven't really thought about what I'd use it for, but I looked at e.g. Dart and it seems to have a concept of broadcasted streams. Maybe analyze the incoming bits in one function, and in another you'd process the incoming data and do something with it. Above all though, it needs to be clear what happens; and for IO streams where you do not want to keep all the data around (e.g. unlike the current Streams API), it's a question that needs answering.

-- 
http://annevankesteren.nl/
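A tee along the lines discussed (and of Dart's broadcast streams) might look like this; tee() and the sink shape are invented for illustration, and only data still unread at the time of the split gets duplicated:

function tee(stream, sinkA, sinkB) {
  var chunked = stream.readBinaryChunked();  // drain the source exactly once
  chunked.ondata = function(evt) {
    sinkA.ondata(evt.data);                  // same chunk to both consumers
    sinkB.ondata(evt.data);
  };
  chunked.onload = function() {
    sinkA.onclose();
    sinkB.onclose();
  };
}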
Re: Overlap between StreamReader and FileReader
On Fri, May 17, 2013 at 6:15 PM, Anne van Kesteren ann...@annevk.nl wrote:
> The main problem is that Stream per the Streams API is not what you expect
> from an IO stream, but is more what Blob should've been (a Blob without
> synchronous size). What we want, I think, is a real IO stream. Whether we
> also need a Blob without synchronous size is less clear to me.

Forgetting the File API completely, for example... how about a simple socket-like interface?

// Downloading big data

var FOO = 'f', BAR = 'b';  // example one-byte type codes
var remaining;
var type = null;
var payload = '';

// payloadSize(type) maps a type code to its payload length (defined elsewhere).
function processData(data) {
  var offset = 0;
  while (offset < data.length) {
    if (!type) {
      type = data.substr(offset, 1);
      offset += 1;
      remaining = payloadSize(type);
    } else if (remaining > 0) {
      var consume = Math.min(remaining, data.length - offset);
      payload += data.substr(offset, consume);
      offset += consume;
      remaining -= consume;
    } else if (remaining == 0) {
      if (type == FOO) {
        foo(payload);
      } else if (type == BAR) {
        bar(payload);
      }
      type = null;
      payload = '';
    }
  }
}

var client = new XMLHttpRequest();
client.onreadystatechange = function() {
  if (this.readyState == this.LOADING) {
    var responseStream = this.response;
    responseStream.setBufferSize(1024);
    responseStream.ondata = function(evt) {
      processData(evt.data);
      // Consumed data will be invalidated, and memory used for the data
      // will be released.
    };
    responseStream.onclose = function() {
      // Reached end of response body ...
    };
    responseStream.start();
    // Now responseStream starts forwarding events happening on XHR to its
    // callbacks.
  }
};
client.open('GET', '/foobar');
client.responseType = 'stream';
client.send();

// Uploading big data

var client = new XMLHttpRequest();
client.open('POST', '/foobar');
var requestStream = new WriteStream(1024);
var producer = new Producer();
producer.ondata = function(evt) {
  requestStream.send(evt.data);
};
producer.onclose = function() {
  requestStream.close();
};
client.send(requestStream);
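The Producer in the upload half is left undefined above; a purely illustrative stand-in (timer-driven dummy chunks, nothing more) could be:

function Producer() {
  var self = this;
  var sent = 0;
  var timer = setInterval(function() {
    self.ondata({ data: new Uint8Array(1024) });  // fixed-size dummy chunk
    if (++sent === 10) {
      clearInterval(timer);
      self.onclose();                             // signal end of input
    }
  }, 100);
}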
Re: Overlap between StreamReader and FileReader
I figured I should chime in with some ideas of my own because, well, why not :-)

First off, I definitely think the semantic model of a Stream shouldn't be "a Blob without a size", but rather "a Blob without a size that you can only read from once". I.e. the implementation should be able to discard data as it passes it to a reader.

That said, many Stream APIs support the concept of a "T". This enables splitting a Stream into two Streams, which enables having multiple consumers of the same data source. However, when a T is created, it only returns the data that has so far been unread from the original Stream. It does not return the data from the beginning of the stream, since that would prevent streams from discarding data as soon as it has been read.

If we are going to have a StreamReader API, then I don't think we should model it after FileReader. FileReader unfortunately followed the model of XMLHttpRequest (based on requests from several developers); however, this is a pretty terrible API, and I believe we can do much better. And obviously we should do something based on Futures :-)

For File reading I would now instead do something like

partial interface Blob {
  AbortableProgressFuture<ArrayBuffer> readBinary(BlobReadParams);
  AbortableProgressFuture<DOMString> readText(BlobReadTextParams);
  Stream readStream(BlobReadParams);
};

dictionary BlobReadParams {
  long long start;
  long long length;
};

dictionary BlobReadTextParams : BlobReadParams {
  DOMString encoding;
};

For Stream reading, I think I would do something like the following:

interface Stream {
  AbortableProgressFuture<ArrayBuffer> readBinary(optional unsigned long long size);
  AbortableProgressFuture<String> readText(optional unsigned long long size, optional DOMString encoding);
  AbortableProgressFuture<Blob> readBlob(optional unsigned long long size);
  ChunkedData readBinaryChunked(optional unsigned long long size);
  ChunkedData readTextChunked(optional unsigned long long size);
};

interface ChunkedData : EventTarget {
  attribute EventHandler ondata;
  attribute EventHandler onload;
  attribute EventHandler onerror;
};

For all of the above functions, if a size is not passed, the rest of the Stream is read. The ChunkedData interface allows incremental reading of a stream. I.e. as soon as there is data available, a "data" event is fired on the ChunkedData object, which contains the data since the last "data" event fired. Once we've reached the end of the stream, or the requested size, the "load" event is fired on the ChunkedData object.

So the read* functions allow a consumer to pull data, whereas the read*Chunked functions allow consumers to have the data pushed at them. There are also other potential functions we could add which allow hybrids, but that seems overly complex for now.

Other functions we could add are peekText and peekBinary, which allow looking into the stream to determine if you're able to consume the data that's there, or if you should pass the Stream to some other consumer.

We might also want to add an "eof" flag to the Stream interface, as well as an event which is fired when the end of the stream is reached (or should that be modeled using a Future?)

/ Jonas

On Fri, May 17, 2013 at 5:02 AM, Takeshi Yoshino tyosh...@google.com wrote:
> On Fri, May 17, 2013 at 6:15 PM, Anne van Kesteren ann...@annevk.nl wrote:
> > The main problem is that Stream per the Streams API is not what you
> > expect from an IO stream, but is more what Blob should've been (a Blob
> > without synchronous size). What we want, I think, is a real IO stream.
> > Whether we also need a Blob without synchronous size is less clear to me.
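To make the pull/push distinction concrete, a sketch against the proposed interface (assuming the futures chain like promises; handleMessage, appendText and documentComplete are hypothetical):

// Pull: read a 4-byte length prefix, then exactly that many body bytes.
stream.readBinary(4).then(function(header) {
  var len = new DataView(header).getUint32(0);
  return stream.readBinary(len);
}).then(function(body) {
  handleMessage(body);
});

// Push: let ChunkedData hand over text as it arrives.
var chunked = stream.readTextChunked();
chunked.ondata = function(evt) { appendText(evt.data); };
chunked.onload = function() { documentComplete(); };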
Re: Overlap between StreamReader and FileReader
On Fri, May 17, 2013 at 9:38 PM, Jonas Sicking jo...@sicking.cc wrote:
> For Stream reading, I think I would do something like the following:
>
> interface Stream {
>   AbortableProgressFuture<ArrayBuffer> readBinary(optional unsigned long long size);
>   AbortableProgressFuture<String> readText(optional unsigned long long size, optional DOMString encoding);
>   AbortableProgressFuture<Blob> readBlob(optional unsigned long long size);
>   ChunkedData readBinaryChunked(optional unsigned long long size);
>   ChunkedData readTextChunked(optional unsigned long long size);
> };
>
> interface ChunkedData : EventTarget {
>   attribute EventHandler ondata;
>   attribute EventHandler onload;
>   attribute EventHandler onerror;
> };

Actually, we could even get rid of the ChunkedData interface and do something like

interface Stream {
  AbortableProgressFuture<ArrayBuffer> readBinary(optional unsigned long long size);
  AbortableProgressFuture<String> readText(optional unsigned long long size, optional DOMString encoding);
  AbortableProgressFuture<Blob> readBlob(optional unsigned long long size);
  AbortableProgressFuture<void> readBinaryChunked(optional unsigned long long size);
  AbortableProgressFuture<void> readTextChunked(optional unsigned long long size);
};

where the ProgressFutures returned from readBinaryChunked/readTextChunked deliver the data in the progress notifications only, and no data is delivered when the future is actually resolved. Though this might be abusing Futures a bit?

/ Jonas
Re: Overlap between StreamReader and FileReader
On Thu, May 16, 2013 at 5:58 PM, Takeshi Yoshino tyosh...@google.com wrote:
> StreamReader proposed in the Streams API spec is almost the same as
> FileReader. By adding the maxSize argument to the readAs methods (new
> methods, or just adding it to the existing methods as an optional
> argument) and adding the readAsBlob method, FileReader can cover all that
> StreamReader provides. Has this already been discussed here? I heard that
> some people who had this concern discussed it briefly and were worrying
> about derailing File API standardization. We're planning to implement it
> on Chromium/Blink shortly.

The Streams API https://dvcs.w3.org/hg/streams-api/raw-file/tip/Overview.htm is no good as far as I can tell. We need something else for IO. (See various threads on this list by me.) Alex will tell you the same, so I doubt it'd get through Blink API review.

-- 
http://annevankesteren.nl/
RE: Overlap between StreamReader and FileReader
From: annevankeste...@gmail.com [mailto:annevankeste...@gmail.com]
> On Thu, May 16, 2013 at 5:58 PM, Takeshi Yoshino tyosh...@google.com wrote:
> > StreamReader proposed in the Streams API spec is almost the same as
> > FileReader. By adding the maxSize argument to the readAs methods (new
> > methods, or just adding it to the existing methods as an optional
> > argument) and adding the readAsBlob method, FileReader can cover all
> > that StreamReader provides. Has this already been discussed here? I
> > heard that some people who had this concern discussed it briefly and
> > were worrying about derailing File API standardization. We're planning
> > to implement it on Chromium/Blink shortly.
>
> The Streams API
> https://dvcs.w3.org/hg/streams-api/raw-file/tip/Overview.htm is no good
> as far as I can tell. We need something else for IO. (See various threads
> on this list by me.) Alex will tell you the same, so I doubt it'd get
> through Blink API review.

Since we have Streams implemented to some degree, I'd love to hear suggestions to improve it relative to IO. Anne, can you summarize the points you've made on the other various threads?
RE: Overlap between StreamReader and FileReader
From: annevankeste...@gmail.com [mailto:annevankeste...@gmail.com]
> On Thu, May 16, 2013 at 6:31 PM, Travis Leithead
> travis.leith...@microsoft.com wrote:
> > Since we have Streams implemented to some degree, I'd love to hear
> > suggestions to improve it relative to IO. Anne, can you summarize the
> > points you've made on the other various threads?
>
> I recommend reading through
> http://lists.w3.org/Archives/Public/public-webapps/2013JanMar/thread.html#msg569
>
> Problems:
>
> * Too much complexity for being a Blob without synchronous size.
> * The API is bad. The API for File is bad too, but we cannot change it;
>   this, however, is new. And I think we really want an IO API that's not
>   about incremental reads, but can actively discard incoming data once
>   it's processed.

Thanks, I'll review the threads and think about this a bit more.
RE: Overlap between StreamReader and FileReader
Can you please go into a bit more detail? I've read through the thread, and it mostly focuses on the details of how a Stream is received from XHR and what behaviors can be expected - it only lightly touches on how you can operate on a stream after it is received.

The StreamReader by design mimics the FileReader, in order to give a consistent experience to developers. If we agree the FileReader has some flaws and we want to take the opportunity to address them with StreamReader, or an alternative, then I think that is reasonable.

I do agree the API should allow for scenarios where data can be discarded, given that is an advantage of a Stream over a Blob.

That said, Anne, what is your suggestion for how Streams can be consumed?

Also, apologies for being a bit late to the conversation - I missed the conversations of the past month. I'm now hoping to solicit more feedback and update the Streams spec accordingly.

Date: Thu, 16 May 2013 18:41:21 +0100
From: ann...@annevk.nl
To: travis.leith...@microsoft.com
CC: tyosh...@google.com; slightly...@google.com; public-webapps@w3.org
Subject: Re: Overlap between StreamReader and FileReader

> On Thu, May 16, 2013 at 6:31 PM, Travis Leithead
> travis.leith...@microsoft.com wrote:
> > Since we have Streams implemented to some degree, I'd love to hear
> > suggestions to improve it relative to IO. Anne, can you summarize the
> > points you've made on the other various threads?
>
> I recommend reading through
> http://lists.w3.org/Archives/Public/public-webapps/2013JanMar/thread.html#msg569
>
> Problems:
>
> * Too much complexity for being a Blob without synchronous size.
> * The API is bad. The API for File is bad too, but we cannot change it;
>   this, however, is new. And I think we really want an IO API that's not
>   about incremental reads, but can actively discard incoming data once
>   it's processed.
>
> -- 
> http://annevankesteren.nl/
Re: Overlap between StreamReader and FileReader
I skimmed the thread before starting this and saw that you were pointing out some issues, but didn't think you were opposing so much.

Let me check requirements.

a) We don't want to introduce a completely new object for streaming HTTP read/write; we'll realize it by adding some extension to XHR.
b) The point connecting the I/O API and XHR should be only the send() method argument and the xhr.response attribute, if possible.
c) The semantics (attribute X is valid when the state is ..., etc.) should be kept the same as in the other modes.
d) The I/O API needs to work with synchronous XHR.
e) Resources for already processed data should be able to be released explicitly, as the user instructs.
f) Reading with a maxSize argument (don't read too much).
g) The I/O API should allow for skipping unnecessary data without creating a new object for that.

Not a requirement:
h) Some people wanted Stream to behave not like an object that stores the data, but like a kind of dam put between the response attribute and XHR's internal buffer (and network stack), expecting that XHR doesn't consume data from the network until a read operation is invoked on the Stream object (i.e. Stream controls data flow in addition to callback invocation timing). But it's no longer considered to be a requirement.

i) Reading with a size argument (invoke the callback only when data of the specified amount is ready; only data of the specified size at the head of the stream is passed to the handler).

On Fri, May 17, 2013 at 2:41 AM, Anne van Kesteren ann...@annevk.nl wrote:
> On Thu, May 16, 2013 at 6:31 PM, Travis Leithead
> travis.leith...@microsoft.com wrote:
> > Since we have Streams implemented to some degree, I'd love to hear
> > suggestions to improve it relative to IO. Anne, can you summarize the
> > points you've made on the other various threads?
>
> I recommend reading through
> http://lists.w3.org/Archives/Public/public-webapps/2013JanMar/thread.html#msg569
>
> Problems:
>
> * Too much complexity for being a Blob without synchronous size.
> * The API is bad. The API for File is bad too, but we cannot change it;
>   this, however, is new. And I think we really want an IO API that's not
>   about incremental reads, but can actively discard incoming data once
>   it's processed.
>
> -- 
> http://annevankesteren.nl/
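Requirements (f) and (i) differ only in when the callback may fire; a sketch with an invented options-object form of read() (sniffContentType and parseFixedSizeRecord are hypothetical):

// (f) maxSize: resolve as soon as anything is buffered, at most 256 bytes.
stream.read({ maxSize: 256 }).then(function(chunk) {
  sniffContentType(chunk);
});

// (i) size: resolve only once a full 256 bytes have been buffered.
stream.read({ size: 256 }).then(function(block) {
  parseFixedSizeRecord(block);
});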