Hi all,

I haven't found a general guide on how to properly use and implement streams, so here are some of my questions about the nodejs stream API (pipe, error handling) and the core classes that use it. I work on nodejs 0.10.36, but I usually checked the behavior on nodejs 0.12.0 too.


The use-case is:
http.IncomingMessage (from http.Server) > some Stream.Transform > Amazon AWS S3 upload

With Stream2, pipes work well, with backpressure, everything is great.
Except for error handling.
Some actions need to be taken when everything is fine and finished (send the HTTP response). That's easy: listen to the 'finish' event on the last stream in the pipeline.
Some actions need to be taken when an error happens: send an error HTTP response, but also abort the AWS S3 upload (otherwise pending MultiPart upload data is billed indefinitely). There has already been a discussion on this ML about error handling, and the recommended solution seems to be to listen to 'error' events on each stream of the pipe, and do a global cleanup.
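
To make the "listen on each stream, then do a global cleanup" approach concrete, here is a minimal sketch. The helper name watchPipeline and the names in the usage comment (parser, upload, abortUpload) are made up for illustration; they are not part of any core or AWS API. It only guarantees that the callback runs once, with the first error or after 'finish' on the last stream.

```javascript
// Hypothetical helper (not a core API): watches every stream of a pipeline
// and calls `done` exactly once, either with the first error seen on any
// stream, or with no argument when the last stream emits 'finish'.
function watchPipeline(streams, done) {
  var called = false;

  function once(err) {
    if (called) return; // ignore later errors and a late 'finish'
    called = true;
    done(err);
  }

  streams.forEach(function (s) {
    s.on('error', once);
  });

  // 'finish' is emitted by Writable streams, so only the last one is watched.
  streams[streams.length - 1].on('finish', function () {
    once();
  });
}

// Possible usage (all names below are assumptions for illustration):
//
//   req.pipe(parser).pipe(upload);
//   watchPipeline([req, parser, upload], function (err) {
//     if (err) {
//       abortUpload();          // stream-specific cleanup goes here
//       res.statusCode = 500;
//     }
//     res.end();
//   });
```

Note that this only centralizes the notification; it does not abort anything by itself, which is exactly the missing piece discussed below.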

However, there is no standard way to abort a stream: it's not part of the Stream2 (or Stream3) API.
Some streams have a 'destroy()' method to force the end of the stream.
- For example net.Socket: it emits the 'close' event with a boolean argument to specify whether an error caused this 'close' event (in which case an 'error' event was previously emitted). This 'destroy()' method in fact takes an optional error argument that will be emitted as 'error' after the destruction. This is not documented; it should be, or it should be removed.
- 'through2' (https://github.com/rvagg/through2/blob/0fa42c010aa58b562fd3eabfb53c0bc5ddaa60f7/through2.js#L12-L22) does something similar, although without the boolean argument to 'close'.
- 'http.ClientRequest' (which implements Stream.Writable) has no 'destroy()', but has an 'abort()' method. It only calls 'socket.destroy()' (if there is a socket), which (usually) emits 'close'; ClientRequest listens to it and finally emits 'close' itself, and in some cases 'aborted' too (which is not documented).

If there is no direct method to abort a stream (and maybe even if there is), the streams down the pipe from the one that failed need to finish. A solution is to call 'end()' on the first downstream stream (see http://www.bennadel.com/blog/2692-you-have-to-explicitly-end-streams-after-pipes-break-in-node-js.htm). But this doesn't signal to the downstream stream that the upstream stream failed, it just signals that it ended: all downstream streams should be properly aborted first (in my initial use-case I don't want to write incomplete data to AWS S3, which is what would happen if we just call .end()).

To sum up, there is no standard way to abort a stream, and when a stream does offer one, the implementations don't behave the same (they may emit 'error' and/or 'close', and maybe some specific events like 'aborted'). So it's hard to clean up a failed pipeline, and to deal with the consequences of the cleanup (the cleanup may itself emit some events, in a non-standardized way).

What is the recommended way to do this?


Related to that, http.IncomingMessage emits neither 'error' nor 'end' when the HTTP request finishes but is incomplete. From the Stream.Readable documentation on 'error' and 'end' events, I would expect one or both of 'error' and 'end' events in this case:
- the http request is incomplete, this is an error, the request should not be treated as successful: there is missing input data => should emit 'error'
- there will be no more data to read from the request object => should emit 'end'
In fact this reasoning could be applied to all fatal 'error' events: after them there will be no more data to read => both 'error' then 'end' should be emitted. If we don't want this behavior then the documentation should be changed to reflect this.

To come back to http.IncomingMessage: on an aborted request, 'aborted' then 'close' events are emitted.
'aborted' is not even documented; it should be, or it should be removed.
Without documentation we cannot rely on it, so we fall back to listening to 'close' and 'end', and detect the aborted request if 'end' was not emitted before 'close'. This may not be robust (as it's implementation dependent), and may even be wrong. ('close' even seems not to be emitted for normal requests, probably because of HTTP keepAlive.)
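
For reference, the fallback described above could look like the following sketch. The helper name trackRequest is made up; it relies only on the 'end' and 'close' events, and on the implementation-dependent assumption that 'close' without a prior 'end' means the request was cut short, so it inherits the robustness concerns mentioned above.

```javascript
// Hypothetical helper (not a core API): calls `callback` once, with null if
// the request emitted 'end', or with an Error if 'close' arrives first.
// Assumption: 'close' before 'end' means the request body is incomplete.
function trackRequest(req, callback) {
  var ended = false;
  var reported = false;

  function report(err) {
    if (reported) return; // never call back twice
    reported = true;
    callback(err);
  }

  req.on('end', function () {
    ended = true;
    report(null);
  });

  req.on('close', function () {
    if (!ended) {
      report(new Error('request closed before end'));
    }
    // If 'end' was already seen, 'close' is just the socket going away.
  });
}
```

If 'close' is never emitted (keepAlive), the callback still fires on 'end', so the normal path does not depend on 'close'.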

What is the recommended way to deal with an incomplete http.IncomingMessage? Shouldn't the current behavior be modified?


Finally, from a Stream.Writable implementer's point of view, how should errors be dealt with (internal ones, or external ones coming from an abort()/destroy() call)?
There are multiple ways to signal an error:
- emit the 'error' event
- call pending callbacks with an error (writable._write(chunk, encoding, callback), transform._flush(callback))
Calling the pending callbacks with an error will eventually also emit an 'error' event. But there is not always a pending callback. The 'error' event will notify upstream streams in the pipe, but how to notify downstream ones too?
Which one should be preferred?
Also, could it be useful to guarantee that 'close' is emitted after 'error' in case of aborts?
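
To illustrate the two options and the "'close' after 'error'" ordering, here is a rough sketch of a Writable with a non-standard abort(err) method. The class name AbortableSink, the abort() method and the 'aborted' flag are all assumptions for the sake of the example, not something the stream API provides or recommends.

```javascript
var stream = require('stream');
var util = require('util');

// Hypothetical Writable that buffers chunks in memory and exposes a
// non-standard abort(err) method (name and semantics are assumptions).
function AbortableSink(options) {
  stream.Writable.call(this, options);
  this.chunks = [];
  this.aborted = false;
}
util.inherits(AbortableSink, stream.Writable);

AbortableSink.prototype._write = function (chunk, encoding, callback) {
  if (this.aborted) {
    // A pending callback exists here, so the error is passed through it.
    return callback(new Error('write after abort'));
  }
  this.chunks.push(chunk);
  callback();
};

AbortableSink.prototype.abort = function (err) {
  if (this.aborted) return; // aborting twice is a no-op
  this.aborted = true;
  this.chunks = [];         // drop the partial data
  // No pending callback here, so the error is emitted directly.
  // As with any 'error' event, this throws if nobody listens for it.
  if (err) this.emit('error', err);
  this.emit('close');       // always after 'error' in this sketch
};
```

With this convention a caller can treat 'close' as the single "cleanup is done" signal, whether or not an 'error' came before it.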


Thank you,

Thomas
