Hi Sam,

On Fri, Jan 23, 2015 at 12:02 AM, Sam Halliday <[email protected]> wrote:
> On Wednesday, 21 January 2015 08:21:15 UTC, drewhk wrote:
>
>>> I believe it may be possible to use the current 1.0-M2 to address
>>> my bugbear by using the Actor integration to write an actor that
>>> has N instances behind a router, but it feels hacky to have to
>>> use an Actor at all. What is really missing is a Junction that
>>> multiplies the next Flow into N parallel parts that run on
>>> separate threads.
>>
>> http://doc.akka.io/docs/akka-stream-and-http-experimental/1.0-M2/scala/stream-cookbook.html#Balancing_jobs_to_a_fixed_pool_of_workers
>
> I actually missed this when reading the docs... it's a gem buried
> in a sea of examples! :-) Is there anything like this in the "top
> down" style documentation?

Currently no. The documentation is about explaining the elements, while the cookbook is a list of examples/patterns that can be used directly, modified, or simply taken as a source of inspiration. But yes, we should link from the main pages to the relevant cookbook sections.

> A convenience method to set up this sort of thing is exactly what
> I mean. I should imagine that fanning out a Flow for embarrassingly
> parallel processing is a common enough pattern that one would want
> to be able to do this in a one-liner. You note something about
> work in this area later on (quoted out of order):

If you take that recipe you have a one-liner :) Our main philosophy is not to provide too many prepackaged combinators, but instead to encourage flexible use of the basic ones. It is the difference between giving someone a fish and teaching them how to fish :) The idea is that if certain patterns turn out to be widely used, we promote them to library-provided combinators.

>> In the future we will allow users to add explicit markers where
>> the materializer needs to cut up chains/graphs into concurrent
>> entities.
>
> This sounds very good. Is there a ticket I can subscribe to for
> updates? Is there a working name for the materializer so that I
> know what to look out for?
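The "explicit markers" idea quoted above is about where a chain of stages gets cut into concurrent pieces. As a rough illustration, here is a toy model in plain Scala using only the standard library; `fuse` and `boundary` are hypothetical names invented for this sketch, not anything from the Akka Streams API:

```scala
import java.util.concurrent.ArrayBlockingQueue

// "Fused": both stages collapse into one synchronous function, so each
// element passes through f and g on the same thread, one after another.
def fuse[A, B, C](f: A => B, g: B => C): A => C = f.andThen(g)

// "Boundary": stage g runs on its own thread behind bounded queues, so
// f can work on element n+1 while g is still processing element n.
// The bounded queue is the backpressure: a fast upstream blocks on
// `put` when the downstream stage lags behind.
def boundary[B, C](g: B => C, bufferSize: Int): (B => Unit, () => C) = {
  val in  = new ArrayBlockingQueue[B](bufferSize)
  val out = new ArrayBlockingQueue[C](bufferSize)
  val worker = new Thread(() => while (true) out.put(g(in.take())))
  worker.setDaemon(true)
  worker.start()
  ((b: B) => in.put(b), () => out.take())
}

val fused = fuse((x: Int) => x + 1, (x: Int) => x * 2)
val (push, pull) = boundary((x: Int) => x * 2, bufferSize = 2)
push(3)
```

A fused chain trades pipelining for lower overhead; the boundary buys concurrency between the two stages at the cost of a queue hand-off per element.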
Not really, there are multiple tickets. The ActorBasedFlowMaterializer has an Optimizations parameter which is currently not documented. It will eagerly collapse entities into synchronous ones as much as possible, but currently there is no API to add boundaries to this collapse procedure (e.g. when you have two map stages that you *do* want to keep concurrent and pipelined). Also, it currently cannot collapse graph elements, stream-of-stream elements, mapAsync and the timed elements. And remember that this is about pipelining, which is different from the parallelization demonstrated in the cookbook.

>> Also, you can try mapAsync or mapAsyncUnordered for similar
>> tasks.
>
> It would be good to read some discussion about these that goes
> further than the API docs. Do they just create a Future and
> therefore have no way to tell a fast producer to slow down?

mapAsync/mapAsyncUnordered will only create a bounded number of these Futures at any given time, and will emit their results one by one as they complete, once the downstream is able to consume them. Only then is a new Future created, so at any given time there is a bounded number of uncompleted Futures. In other words, Future completion is the backpressure signal to the upstream.

> How does one achieve pushback from these? Pushback on evaluation
> of the result is essential, not on the ability to create/schedule
> the futures. I would very much like to see some documentation that
> explains where this has an advantage over plain ol' Scala Stream
> with a map{Future{...}}.

Doing the above on a Scala Stream will create arbitrarily many Futures, and it will not wait for the results of those Futures. Our mapAsync, on the other hand, waits for the results of the Futures and emits them (i.e. it is a flattening operation), and only keeps a limited number of open Futures at any given time.

>>> In general, I felt that the documentation was missing any
>>> commentary on concurrency and parallelisation.
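The mapAsync behaviour Endre describes above, a bounded window of uncompleted Futures whose completion acts as the backpressure signal, can be modeled in plain Scala. This is only a simplified, pull-based sketch using the standard library (`boundedMapAsync` is a hypothetical name; the real Akka operator is demand-driven), but it shows the contrast with `Stream.map(x => Future(...))`, which starts arbitrarily many Futures and never waits for them:

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration
import ExecutionContext.Implicits.global

def boundedMapAsync[A, B](parallelism: Int)(f: A => Future[B])(xs: Iterator[A]): Iterator[B] = {
  // Sliding window of at most `parallelism` uncompleted Futures. A new
  // Future is only started once the oldest result has been consumed
  // downstream: awaiting completion *is* the pushback on the upstream.
  val window = scala.collection.mutable.Queue.empty[Future[B]]
  new Iterator[B] {
    def hasNext: Boolean = window.nonEmpty || xs.hasNext
    def next(): B = {
      while (window.size < parallelism && xs.hasNext) window.enqueue(f(xs.next()))
      Await.result(window.dequeue(), Duration.Inf)
    }
  }
}

// Like mapAsync (not mapAsyncUnordered), results are emitted in input
// order even though up to `parallelism` Futures run concurrently.
val squares = boundedMapAsync(2)((i: Int) => Future(i * i))(Iterator(1, 2, 3, 4, 5)).toList
```

Note that the flattening Endre mentions is visible in the types: the caller gets back `B`s, not `Future[B]`s.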
>>> I was left wondering what thread everything was happening on.
>>
>> ... as the default, all stream processing elements are backed by
>> an actor ...
>
> The very fact that each component is backed by an Actor is news to
> me. This wasn't at all obvious from the docs, and actually the
> "integration with actors" section made me think that streams must
> be implemented completely differently if it needs an integration
> layer!

The internals might or might not be actors. It all depends on what materializer is used (currently there is only one kind), and even now we have elements not backed by an Actor (CompletedSource and friends, for example). This is completely internal, and we don't want to document all aspects of it. But yes, we can add more high-level info about it.

> Actually, the "actor integration" really means "low level
> streaming actors", doesn't it?

I am not sure what this means.

> I would strongly recommend making this a lot clearer as it helps
> to de-mystify the implementation.
>
> Now knowing that each vertex in the graph is backed by an actor, I
> wonder if in "balancing work to fixed pools" the Balance is just
> adding a router actor with a balance strategy?

No, it is completely different: routers are not backpressure aware, so they cannot be used here. We implemented these elements as actors, but it is not necessary to do so.

> The convenience method I suggested above could simply create a
> router to multiple instances of the same actor with a parallelism
> bound.

Well, that is what the recipe does with the Balance and Merge elements.

> I'm not entirely sure why one would need a Merge strategy for
> that, although the option to preserve input order at the output
> would be good for some circumstances (which could apply pushback
> in preference to growing the queue of out-of-order results).

This is why I kept it as a recipe.
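On Sam's order-preservation point: the worker-pool recipe's shape, a Balance fanning out to N workers and a Merge reassembling their outputs, can be modeled in plain Scala. This is a sketch only (the real Balance/Merge are backpressure-aware stream junctions, and the default Balance routes to whichever worker currently has demand, not strictly round-robin):

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration
import ExecutionContext.Implicits.global

def roundRobinPool[A, B](workerCount: Int)(work: A => B)(xs: List[A]): List[B] = {
  // "Balance": deal element i to worker i % workerCount.
  val perWorker: Map[Int, List[A]] =
    xs.zipWithIndex.groupMap(_._2 % workerCount)(_._1)
  // Each worker processes its share concurrently.
  val results: Map[Int, Future[List[B]]] =
    perWorker.map { case (w, as) => w -> Future(as.map(work)) }
  // "Merge": take results back in strict round-robin worker order,
  // which restores the original input order at the output.
  val iterators = (0 until workerCount).map { w =>
    Await.result(results.getOrElse(w, Future.successful(Nil)), Duration.Inf).iterator
  }
  val out = List.newBuilder[B]
  var live = true
  while (live) {
    live = false
    for (it <- iterators if it.hasNext) { out += it.next(); live = true }
  }
  out.result()
}

val pooled = roundRobinPool(3)((i: Int) => i * 10)((1 to 10).toList)
```

Because element i always goes to worker i % N, and the merge takes one result from each worker in the same fixed order, no out-of-order queue is needed.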
If you create a strict round-robin Balance and a strict round-robin Merge on the other side, the parallelization step will keep the sequence. But you might want to add dropping elements, or batching elements, or whatnot. The point of the recipe is that you can tailor it completely to your needs, because you see how it works instead of just calling a doMagicForMe() method :)

> In addition, this revelation helps me to understand the
> performance considerations of using akka-streams. Knowing this, it
> would only be appropriate for something I would've considered
> suitable (from a performance point of view) for hand-crafted flow
> control in akka before streams was released. The main advantage of
> akka-streams is therefore that it has dramatically lowered the
> barrier of entry for writing Actor code with flow control.

Exactly!

> Thanks for this explanation Endre, I hope to see even more
> innovation in the coming releases and improved documentation.

Cheers,
-Endre

> Best regards,
> Sam
>
> --
> >>>>>>>>>> Read the docs: http://akka.io/docs/
> >>>>>>>>>> Check the FAQ: http://doc.akka.io/docs/akka/current/additional/faq.html
> >>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
> ---
> You received this message because you are subscribed to the Google Groups "Akka User List" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/akka-user.
> For more options, visit https://groups.google.com/d/optout.
