Re: [akka-user] Grokking akka-stream and TcpStream

Endre Varga Fri, 13 Feb 2015 03:51:57 -0800

Hi Reid,

On Fri, Feb 13, 2015 at 11:59 AM, Reid Spencer <[email protected]> wrote:


> Hi Eric,
>
> Thanks for your response. It is a little comforting to know that I’m not
> the only one on this planet also struggling with how to do something real
> with akka-stream. I’m sure this thread will eventually lead to the clarity
> we both seek.
>

I should probably add that a common beginner mistake is trying to do
*everything* with streams. Currently this is doomed to fail, and even in
the future it will not be possible in all cases. Plain old actors are
always there to use, so use streams where they already shine right
now: for example a chain of decoding/encoding steps, or simple transformations
that can be expressed as an ask to an actor (you can use it together with
mapAsync and have your actor magically backpressured!). Also, graphs are
almost always needed for protocols.
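
A minimal sketch of the ask + mapAsync combination mentioned above. Please
note the assumptions: it is written against the later stable Akka Streams
API (Akka 2.6, where mapAsync takes an explicit parallelism argument and the
materializer is derived from the implicit ActorSystem), not against 1.0-M3,
and it needs akka-actor and akka-stream on the classpath. The Doubler actor
and the AskStage object are hypothetical names used only for illustration.

```scala
import akka.actor.{Actor, ActorSystem, Props}
import akka.pattern.ask
import akka.stream.scaladsl.{Sink, Source}
import akka.util.Timeout
import scala.concurrent.Await
import scala.concurrent.duration._

// Hypothetical actor: replies to each Int with its double.
class Doubler extends Actor {
  def receive: Receive = { case n: Int => sender() ! n * 2 }
}

object AskStage {
  // Runs every input through the actor as one stream stage and collects the replies.
  def doubleAll(inputs: Seq[Int]): Seq[Int] = {
    implicit val system: ActorSystem = ActorSystem("ask-stage-demo")
    implicit val timeout: Timeout = Timeout(3.seconds) // required by `?` (ask)
    try {
      val doubler = system.actorOf(Props(new Doubler))
      val done = Source(inputs.toList)
        // At most 4 asks are in flight at once; upstream is backpressured
        // until one of the reply Futures completes.
        .mapAsync(parallelism = 4)(n => (doubler ? n).mapTo[Int])
        .runWith(Sink.seq)
      Await.result(done, 5.seconds)
    } finally system.terminate()
  }
}
```

mapAsync emits results in the order of the incoming elements, so the actor
never sees more than the configured number of outstanding requests and the
downstream sees replies in the same order as the inputs.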


>
> Yes, I’ve seen this before but it led to my grumbling about the lack of an
> internal design document. That Flow.apply method creates a GraphBackedFlow.
>
>

These will be gone in M4. A Flow is just a collection of processing steps
(any graph layout) with exactly one input port and one output port that are
not wired to anything.


> When I first saw that I was encouraged because I realized I would just
> need to understand GraphBackedFlow to understand what
> Flow.apply(Sink[I],Source[O]) was giving me. But, the documentation for
> that object only says “INTERNAL API”. I’m fine with that lack of
> documentation for things that I don’t use directly, but this
> GraphBackedFlow is something returned by the Flow companion and would be an
> instance of something I use in my program. It needs user level
> documentation. Either that or the Flow.apply method should be much more
> explicit about what is returned.  All it says is "*Create a [[Flow]] from
> a seemingly disconnected [[Source]] and [[Sink]] pair*.” It would be very
> much appreciated if there was at least a line or two of what this
> GraphBackedFlow represented. I get that it backs the flow with some sort of
> graph of objects but that really doesn’t give me much information about its
> operation.
>

These things will go away with M4, see explanation below.


>
>
>    1. This seems to allow you to "handle" a TCP connection with a Flow
>    using independent Sink and Source. I am still in the process of trying to
>    construct a FlowGraph using ActorSubscriber and ActorPublisher to integrate
>    the TCP connection with application logic using actors, but finding it a
>    challenge to do so in a simple way and avoiding circular references. So I'm
>    still working on "grokking" it myself, but perhaps this discovery might
>    help.
>
>
> I will have to dive into the Working With Graphs
> <http://doc.akka.io/docs/akka-stream-and-http-experimental/1.0-M3/scala/stream-graphs.html>
> page in more detail. I’ve been giving it less attention because of this
> sentence:
>
> Graphs are needed whenever you want to perform any kind of fan-in
> ("multiple inputs") or fan-out ("multiple outputs") operations.
>
>
Well, I think that sentence really does not convey the importance of
graphs. It is very likely that you will need graphs for more complex
applications.

So, just to give you a taste of M4:

 - Internally, everything is just a "Module" which has input and output
ports
 - An atomic Module is directly translated to an executable entity
 - A composite module consists of atomic modules and possibly other
composite modules, with some output ports connected to some input ports
 - You can hierarchically nest these modules
 - A Flow is *really* nothing more than a DSL shim over a module that
has exactly one input and one output port, but may have an arbitrarily
complex nested graph layout inside
 - The same goes for Source and Sink

Now you can start imagining your application as a hierarchical setup of
"boxes" (modules) that represent services and have input and output ports.
You compose simpler services into more complex ones and wrap them in another
box, until you have a nice nested hierarchy.

And that's it :)
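
As a rough illustration of the module idea, here is a toy model in plain
Scala. It is only a sketch of the concept: the names Module, Atomic,
Composite, Shapes and ModuleDemo are made up for this example and are not
Akka's internal classes, and ports are modelled as plain strings.

```scala
// Toy model only; NOT Akka's internal implementation.
sealed trait Module {
  def inPorts: Set[String]  // input ports that are not wired to anything
  def outPorts: Set[String] // output ports that are not wired to anything
}

// An atomic module: a single processing step with named ports.
final case class Atomic(name: String, ins: Set[String], outs: Set[String]) extends Module {
  def inPorts: Set[String] = ins.map(p => s"$name.$p")
  def outPorts: Set[String] = outs.map(p => s"$name.$p")
}

// A composite module: nested modules plus wires (output port -> input port).
final case class Composite(parts: List[Module], wires: Map[String, String]) extends Module {
  // Ports consumed by an internal wire are no longer visible from outside.
  def inPorts: Set[String] = parts.flatMap(_.inPorts).toSet -- wires.values
  def outPorts: Set[String] = parts.flatMap(_.outPorts).toSet -- wires.keys
}

object Shapes {
  // The "shape" is decided purely by which ports are still open.
  def isFlow(m: Module): Boolean = m.inPorts.size == 1 && m.outPorts.size == 1
  def isSource(m: Module): Boolean = m.inPorts.isEmpty && m.outPorts.size == 1
  def isSink(m: Module): Boolean = m.inPorts.size == 1 && m.outPorts.isEmpty
}

object ModuleDemo {
  val decode = Atomic("decode", Set("in"), Set("out"))
  val handle = Atomic("handle", Set("in"), Set("out"))
  val encode = Atomic("encode", Set("in"), Set("out"))

  // decode ~> handle, wrapped as one box
  val inner = Composite(List(decode, handle), Map("decode.out" -> "handle.in"))
  // (decode ~> handle) ~> encode, wrapped as a bigger box
  val service = Composite(List(inner, encode), Map("handle.out" -> "encode.in"))
}
```

In this model ModuleDemo.service still has exactly one open input
("decode.in") and one open output ("encode.out"), so from the outside it
looks like a Flow no matter how deeply the boxes inside are nested.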


>
> I believe my needs are two linear “roads” not a network of junctions that
> fan-in and fan-out; but then again, I could be wrong.
>

This is hard to tell, since I don't know the exact problems you need to solve.


>
> Akka Team: Please don’t take any of this commentary as derogatory; I’m
> quite enthusiastic about using akka-stream and StreamTcp.   I understand
> you are scrambling to get this released in a couple of months and there is
> much to do and document. In the spirit of cooperation, I will suggest that
> if you can point Eric and me in the right direction (e.g. how to write an
> asynchronous Tcp protocol), then I will volunteer to add sample code and
> documentation to Akka that explains this part (once I understand it, of
> course!) :)
>

Well, just give me a very small example problem that you think is enough to
get started (please keep it minimal, we are short on resources) and I can
try to give some pointers.

-Endre


>
>
>    -- Eric
>
>
>
> On Thursday, February 12, 2015 at 11:56:03 AM UTC-5, Reid Spencer wrote:
>>
>> hAkkers,
>>
>> I've been unable to *grok <http://en.wikipedia.org/wiki/Grok>* how to
>> communicate with a TCP socket using akka-stream and StreamTcp extension.
>> At this point, I'm not sure the fault is entirely mine. :)  I'm building a
>> MongoDB driver <https://github.com/reactific/RxMongo> that uses Akka and I
>> have it working well with akka.io. Mongo requires asynchronous reading and
>> writing on a TCP socket. You can write requests to it as they happen and
>> you read responses as they can be satisfied by the server. Requests and
>> responses are matched with an ID number (i.e. each response indicates the
>> request ID to which it responds). This seems to be an ideal candidate for
>> akka-streams, at least on the surface. I'm now trying to transition my
>> design to use akka-streams and StreamTcp. After several days of fumbling
>> around, I'm still not able to grasp how to connect all the pieces. So, I'm
>> hoping the group can help and that this might be instructive for users of
>> akka-stream, or at least shine some light on needed documentation or
>> features.
>>
>> Just to address the obligatory:
>>
>>    - I've read the akka-stream (1.0-M3) documentation, many times, every
>>    page.
>>    - I've looked at the akka-stream code and discovered that without
>>    some sort of internal design document, much of it will be unintelligible
>>    because I don't have a conceptual model for how the pieces fit together
>>    (essentially a forest/trees issue).
>>    - I've read the (insufficient, IMO) API documentation.
>>    - I've built, tried and studied the TcpEcho sample program.
>>
>> That sample TcpEcho program is the source of most of my misunderstanding
>> as it is the only sample that relates to what I'm doing and I cannot
>> extrapolate from it to do what I want to do.  Here's the program, from
>> the documentation
>> <http://doc.akka.io/docs/akka-stream-and-http-experimental/1.0-M3/scala/stream-io.html#Connecting__REPL_Client>
>> :
>>
>>
>> val connection: OutgoingConnection = StreamTcp().outgoingConnection(localhost)
>>
>> val replParser = new PushStage[String, ByteString] {
>>   override def onPush(elem: String, ctx: Context[ByteString]): Directive = {
>>     elem match {
>>       case "q" ⇒ ctx.pushAndFinish(ByteString("BYE\n"))
>>       case _ ⇒ ctx.push(ByteString(s"$elem\n"))
>>     }
>>   }
>> }
>>
>> val repl = Flow[ByteString]
>>   .transform(() => RecipeParseLines.parseLines("\n", maximumLineBytes = 256))
>>   .map(text => println("Server: " + text))
>>   .map(_ => readLine("> "))
>>   .transform(() ⇒ replParser)
>>
>> connection.handleWith(repl)
>>
>>
>> Here are the things that I find confusing:
>>
>>    - The repl value is defined by invoking Flow[ByteString]. Now, I know
>>    the API well enough to know that Flow requires two type parameters:
>>    Flow[-In,+Out]
>>    <http://doc.akka.io/api/akka-stream-and-http-experimental/1.0-M3/?_ga=1.17864249.1697328887.1413587984#akka.stream.scaladsl.Flow$>.
>>    This is confusing because the Flow companion object's apply method
>>    takes only one type argument which it expands by duplicating. So
>>    Flow[ByteString] actually instantiates a Flow[ByteString,ByteString].
>>    I only note this because it took some digging around in the API before I
>>    understood how this worked and while it is handy, it is also not
>>    straightforward.
>>    - The documentation implies that a Flow is uni-directional
>>    <http://doc.akka.io/docs/akka-stream-and-http-experimental/1.0-M3/scala/stream-flows-and-basics.html#Defining_and_running_streams>.
>>    It says a Flow "connects its up- and downstreams by transforming the
>>    data elements flowing through it." That, to me, says "unidirectional". The
>>    use of a Flow[ByteString,ByteString] for the repl value indicates to
>>    me that a uni-directional "transformation" from ByteString to ByteString
>>    is occurring and yet this code implies that it is doing both reading and
>>    writing to the socket (i.e. it is bi-directional). How can that be?
>>    - I see repl as a Flow that does this: takes a ByteString as input,
>>    chunks it into \n terminated lines up to 256 bytes, prints those lines out
>>    prefixed by "Server: " and then discards that input and replaces it with a
>>    line read from the console which is then output with a newline appended
>>    unless the input was "q" in which case it is replaced by "BYE\n" and a
>>    termination signal. Okay, that's all great and it is all unidirectional
>>    writing data to the socket. So, now the questions:
>>       - Where is the reading from the server to get the original line(s)
>>       as input to the flow? I.e. where is the Source[ByteString]?
>>       - Assuming that connection.handleWith(repl) does some magic to set
>>       up the Source, how does the conversation get started? Is it assumed the
>>       server will send some data upon connection? The echo server example
>>       <http://doc.akka.io/docs/akka-stream-and-http-experimental/1.0-M3/scala/stream-io.html#Accepting_connections__Echo_Server>
>>       seems to have the same issue!
>>       - FYI: OutgoingConnection.handleWith's documentation
>>       <http://doc.akka.io/api/akka-stream-and-http-experimental/1.0-M3/?_ga=1.17864249.1697328887.1413587984#akka.stream.scaladsl.StreamTcp$$OutgoingConnection>
>>       says this:
>>
>>> "Handles the connection using the given flow. This method can be called
>>> several times, every call will materialize the given flow exactly once
>>> thereby triggering a new connection attempt to the remoteAddress. If
>>> the connection cannot be established the materialized stream will
>>> immediately be terminated with a akka.stream.StreamTcpException
>>> <http://doc.akka.io/api/akka-stream-and-http-experimental/1.0-M3/akka/stream/StreamTcpException.html>
>>> ."
>>
>>
>>       - It would be nice if there was a little more description around
>>       "Handles the connection". Handles it how? Does it set up a Source and
>>       Sink? Are asynchronous bi-directional reading from the socket and
>>       writing to the socket implied?
>>    - Am I to infer from all this that the StreamTcp.outgoingConnection
>>    creates a Source[ByteString] from the socket's input and a
>>    Sink[ByteString] for the socket's output and that the flow provided
>>    to handleWith is run between these two? In other words, the
>>    OutgoingConnection can really only transform its input to its output?
>>    If so, then:
>>       - How is that generally applicable?
>>       - This approach works fine for an echo client, but clearly there
>>       are protocols where the input and output can and should be processed
>>       independently, aren't there?
>>       - How would one do what Mongo needs and have an asynch flow of
>>       requests that is independent of an asynch flow of responses?
>>       - I noticed the Add Bi-directional Flow
>>       <https://github.com/akka/akka/issues/16416> issue that is slated
>>       for inclusion in 1.0-M4. Is this intended for solving this issue where
>>       two related flows are paired to do bi-directional input/output?
>>       - Am I just trying to implement my mongo driver before the
>>       required features are ready?
>>
>> The documentation says, about this code:
>>
>>> A resilient REPL client would be more sophisticated than this, for
>>> example it should split out the input reading into a separate mapAsync step
>>> and have a way to let the server write more data than one ByteString chunk
>>> at any given time, these improvements however are left as exercise for the
>>> reader.
>>
>>
>> I would like a "resilient client" and I think leaving this part as an
>> "exercise for the reader" is asking a bit much from the audience. We need
>> an example of how to do this as it is likely the typical case not the
>> exception (nobody needs another echo server/client). I suspect that the
>> answer to my confusion lies in the information intended but not stated by
>> this sentence from the documentation.
>>
>> Specifically, I do not comprehend how mapAsync (or mapAsyncUnordered)
>> helps to split out the input reading because it is NOT obvious to me where
>> this "input reading" is being done! If I used mapAsync to obtain the
>> request data from my driver's clients, it seems very obtuse to be setting
>> up numerous Futures as opposed to just allowing them to give me a
>> Source[Request] from which their requests are read and processed.
>>
>> Any help you can provide to prevent me from drowning in these waters
>> would be much appreciated!!
>>
>> Thanks,
>>
>> Reid.
>>
>
> --
> >>>>>>>>>> Read the docs: http://akka.io/docs/
> >>>>>>>>>> Check the FAQ:
> http://doc.akka.io/docs/akka/current/additional/faq.html
> >>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
> ---
> You received this message because you are subscribed to a topic in the
> Google Groups "Akka User List" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/akka-user/BQhnmseCyN0/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to
> [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/akka-user.
> For more options, visit https://groups.google.com/d/optout.
>
>
