Benjamin Mahler created MESOS-3830:
--------------------------------------

             Summary: Provide a means to do async data transfer with async 
back-pressure.
                 Key: MESOS-3830
                 URL: https://issues.apache.org/jira/browse/MESOS-3830
             Project: Mesos
          Issue Type: Improvement
          Components: libprocess
            Reporter: Benjamin Mahler


I had starting thinking about this while implementing http::Pipe and more 
recently when seeing the docker registry client streaming to a file and the 
fetcher cache refactoring work. This should still be seen as an active thought 
process and this description will be edited along the way.

The overall idea here is to provide a composable abstraction to support 
asynchronously streaming data in libprocess with asynchronous back-pressure. 
The following characteristics are desired:

{noformat}
(1) Ability to express both finite and infinite streams of data.
(2) Asynchronous zero-copy data transfer.
(3) Asynchronous back-pressure.
(4) Support for composition (prefer this to polymorphism seen in other 
implementations):
   (a) Allow data to flow down through a "pipeline" of transformations.
   (b) Allow backpressure to flow back up the "pipeline".
   (c) Allow failures and closures to flow back up the "pipeline".
{noformat}

The existing 
[http::Pipe|https://github.com/apache/mesos/blob/0.25.0/3rdparty/libprocess/include/process/http.hpp#L172]
 is part of the way there (needs enhancements for zero-copy, async 
back-pressure, and composition) and can be pulled up to become process::Pipe.

Composition allows us to create a pipeline of data transfer:

{code}
// Stream the contents of the url to a file.
Future<Nothing> complete = http::download(url) | io::fileWriter(file);

// Signatures.
Pipe::Reader http::download(const Url&);
Pipe::Writer io::writer(const Path&);
{code}

In this example, composition occurs by connecting the read end of pipe1 (from 
http::download) with the write end of pipe2 (from io::fileWriter). This 
composition terminates the pipeline (i.e. A | B | C is not allowed here) and 
returns a future for the caller to detect completion or trigger a discard to 
close down the pipeline.

Data will now flow through the pipeline without copies and backpressure from 
the file writing will slow down the consumption of data from the tcp socket 
downloading the url contents.

Another form of composition occurs when we want to apply a transformation:

{code}
// Stream the contents of the url to a file.
Future<Nothing> complete = http::download(url) | zip | io::fileWriter(file);

// TODO: Figure out 'zip' type signature to enable composition.
{code}

Here (A | B) returns another Pipe::Reader because B is a transformation rather 
than a Pipe::Writer. This enables A | B | C where composing with C terminates 
the pipeline once again.

Related work that is interesting to examine:

NodeJS's Stream: https://nodejs.org/api/stream.html
Reactive Streams: http://www.reactive-streams.org/
Netty's Channels / Java NIO ByteBuffer: 
http://seeallhearall.blogspot.com/2012/05/netty-tutorial-part-1-introduction-to.html

We may also want typed Pipes to capture the process::Stream<T> abstraction 
we've wanted in the past. process::Stream<T> and process:Pipe discussed here 
share much of the same functionality.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to