Re: [akka-user] Akka Streams Remote Materialization

James Roper Thu, 13 Jul 2017 18:55:39 -0700

I may have some use cases for this, so I've been thinking about what it 
could look like.  Here are my rough thoughts:

The problem with using Akka remoting is that it's not a guaranteed 
transport nor does it provide back pressure - you would have to serialize 
the requset packets, and then implement some form of dropped message 
detection, then implement a replay mechanism, etc - basically, you'd have 
to implement TCP.  That doesn't make sense to do, when you already have 
TCP. So, each remote materialisation should be a TCP connection.

We need some form a framing over the TCP connection. A lightweight binary 
protocol could do it, but an easier and probably good enough solution to 
start with is WebSockets with akka-http (to my knowledge akka-http doesn't 
yet provide client side WebSocket support, but it will some day).

So, to support remote materialization, we supply an Akka extension.  This 
extension opens an akka io or akka-http port that accepts materialization 
requests.  It provides a method of converting a graph of any shape into a 
GraphRef, the GraphRef will include the shape of the graph, the address to 
connect to, and a unique identifier for the graph.  This GraphRef can then 
be sent to other hosts.  Of course, this implementation will be coupled 
with an Akka remoting serializer that will serialize graphs of all shapes 
into GraphRefs, and then into a binary format that can be sent across the 
wire.  The manifest will probably describe whether it's a javadsl or 
scaladsl graph so it knows what type to instantiate on the other end.

The extension will need to track the graphs it creates GraphRefs for, which 
is a potential resource leak, so there will need to be an expiration 
strategy. I'd envision a configurable default, which can be overridden per 
stream with graph attributes.  Available strategies could be expire after a 
given deadline, or expire after first use (probably with a deadline for the 
first use too).  If the same graph was turned to a GraphRef twice, the 
extension would detect this and emit the same identifier.

To then materialize remotely, the extension will offer a method to turn the 
GraphRef back into a graph.  It wouldn't necessarily have to be the same 
*dsl as it came from, it would just have to be the same shape, though the 
Akka serializer would always produce the same *dsl.  The recreated remote 
graph would have attributes set on it to identify it as a remote stream, 
these would be used by the extension if the remote graph were turned back 
into a GraphRef, in that case the original GraphRef would be returned, so 
any connections to it would go back to the original node, and not need to 
be proxied by the current node.  It would also be able to tell if a 
GraphRef was for itself, then it would just return the stored graph 
directly.

When materialized, a new TCP or WebSocket is created to the node where the 
actual graph lives.  If using WebSockets, the graph ID will just be passed 
as part of the URI.  Other information might be passed including the 
expected graph shape etc, which can be validated before the two ends try to 
do anything that doesn't make sense.  An additional lightweight protocol on 
top of the frames is needed - Akka serialization can be used to 
encode/decode messages, but this requires the manifest and serializer id to 
also be sent.  Additionally, for correct error propogation, there needs to 
be some form of typing on frames, so that when the stream terminates with 
an error, that error can be serialized and sent down the stream.

When working with sinks, sources and flows, it's quite simple, just use the 
corresponding up/down channel on the socket, and upstream/downstream 
completion at different times on flows can be handled by half closing 
connections.  But if we want to allow bidi flows, then either we need two 
sockets (which will require some additional request identifier for the 
server to associate them to the same materialization) or the multiplexing 
of channels on a single socket.  Multiplexing of channels on a single 
socket will mean the channels both share the same backpressure, which could 
be a problem.  Another solution to this would be to use http2, which does 
support multiplexing with independent back pressure.  If we do go down the 
http2 path, then we open up the option of multiplexing multiple graph 
materializations down the one TCP connection.

Materialized values are an interesting thing to contend with. I see two 
approaches, one is to always materialize to NotUsed, and only allow streams 
that materialize to NotUsed to be converted to GraphRefs.  Of course, this 
can only be enforced at compile time, which means the Akka remoting 
serializer for this can at best warn/fail at runtime after the stream is 
materialized if it returns anything other than NotUsed.

The other option, which has similar constraints, is to always materialize 
to a Future/CompletionStage (depending on which dsl) of the remote 
materialized value.  If the remote materialized value is a future itself, 
it effectively gets flattened automatically. The materialized value can 
either be sent in the same TCP connection, or in some out of band 
mechanism, eg using Akka remoting.

Anyway, that's everything that I've thought of.  If anyone has some 
comments let me know.  It would certainly be an interesting project to 
implement.

On Saturday, April 9, 2016 at 2:30:45 AM UTC+10, Patrik Nordwall wrote:
>
> No news. We are currently not working on it. You can keep an eye on what 
> is done in Gearpump.
>
> http://www.gearpump.io/
>
>
> http://www.marketwired.com/press-release/lightbend-announces-collaboration-bring-low-latency-high-availability-big-data-2099511.htm
>
> On Wed, Apr 6, 2016 at 11:14 PM, César Aguilera <[email protected] 
> <javascript:>> wrote:
>
>> Hi all,
>>
>> any news concerning this request?
>>
>> Cheers 
>>
>>
>>
>> On Thursday, 28 May 2015 11:16:36 UTC+2, Oliver Winks wrote:
>>>
>>> Thanks for the info. I'm looking forward to future releases of Akka 
>>> Streams.
>>> Cheers
>>>
>>> On Wednesday, 27 May 2015 22:24:44 UTC+1, Akka Team wrote:
>>>>
>>>> Hi Oliver,
>>>> we do not (currently) support distributed materialization of streams.
>>>> The reason is that it will require implementing redelivery for stream 
>>>> messages and a number of related issues which need to be fixed, which has 
>>>> not happened yet.
>>>>
>>>> Currently we are focusing on getting the 1.0 out the door, which means 
>>>> API stability, we also need to work on in-memory performance as it has not 
>>>> yet been a focus,
>>>> and is a critical point for making Akka HTTP as performant as Spray - 
>>>> at which point we'll be happy to recommend using streams in production 
>>>> systems.
>>>> Please remember that 1.0 still means that streams are experimental.
>>>>
>>>> The distributed scenario is a very interesting one, but we do not have 
>>>> enough people/time to throw at that problem currently as other tasks are 
>>>> more urgent.
>>>> Hope this explains things a bit!
>>>>
>>>> -- konrad
>>>>
>>>> On Sat, May 23, 2015 at 3:18 AM, Oliver Winks <[email protected]> 
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> The way I understand materialisation in Akka Streams is that the 
>>>>> ActorFlowMaterializer will create a number of actors which are used to 
>>>>> process the flows within a stream. Is it possible to control the number 
>>>>> and 
>>>>> location of actors that get materialised when running a Flow? I'd like to 
>>>>> be able to create remote actors on several machines for processing my 
>>>>> FlowGraph.
>>>>>
>>>>> Thanks,
>>>>> Oli.
>>>>>
>>>>> -- 
>>>>> >>>>>>>>>> Read the docs: http://akka.io/docs/
>>>>> >>>>>>>>>> Check the FAQ: 
>>>>> http://doc.akka.io/docs/akka/current/additional/faq.html
>>>>> >>>>>>>>>> Search the archives: 
>>>>> https://groups.google.com/group/akka-user
>>>>> --- 
>>>>> You received this message because you are subscribed to the Google 
>>>>> Groups "Akka User List" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>>> an email to [email protected].
>>>>> To post to this group, send email to [email protected].
>>>>> Visit this group at http://groups.google.com/group/akka-user.
>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>
>>>>
>>>>
>>>>
>>>> -- 
>>>> Akka Team
>>>> Typesafe - Reactive apps on the JVM
>>>> Blog: letitcrash.com
>>>> Twitter: @akkateam
>>>>
>>> -- 
>> >>>>>>>>>> Read the docs: http://akka.io/docs/
>> >>>>>>>>>> Check the FAQ: 
>> http://doc.akka.io/docs/akka/current/additional/faq.html
>> >>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
>> --- 
>> You received this message because you are subscribed to the Google Groups 
>> "Akka User List" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected] <javascript:>.
>> To post to this group, send email to [email protected] 
>> <javascript:>.
>> Visit this group at https://groups.google.com/group/akka-user.
>>
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
>
> -- 
>
> Patrik Nordwall
> Akka Tech Lead
> Lightbend <http://www.lightbend.com/> -  Reactive apps on the JVM
> Twitter: @patriknw
>
>

-- 
>>>>>>>>>>      Read the docs: http://akka.io/docs/
>>>>>>>>>>      Check the FAQ: 
>>>>>>>>>> http://doc.akka.io/docs/akka/current/additional/faq.html
>>>>>>>>>>      Search the archives: https://groups.google.com/group/akka-user
--- 
You received this message because you are subscribed to the Google Groups "Akka 
User List" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/akka-user.
For more options, visit https://groups.google.com/d/optout.

Re: [akka-user] Akka Streams Remote Materialization

Reply via email to