Hi Sam,

On Fri, Jan 23, 2015 at 12:02 AM, Sam Halliday <[email protected]> wrote:
> On Wednesday, 21 January 2015 08:21:15 UTC, drewhk wrote:
>
>>> I believe it may be possible to use the current 1.0-M2 to address
>>> my bugbear by using the Actor integration to write an actor that
>>> has N instances behind a router, but it feels hacky to have to
>>> use an Actor at all. What is really missing is a Junction that
>>> multiplies the next Flow into N parallel parts that run on
>>> separate threads.
>>
>> http://doc.akka.io/docs/akka-stream-and-http-experimental/1.0-M2/scala/stream-cookbook.html#Balancing_jobs_to_a_fixed_pool_of_workers
>
> I actually missed this when reading the docs... it's a gem buried
> in a sea of examples! :-) Is there anything like this in the "top
> down" style documentation?

Currently no. The documentation is about explaining the elements, while the cookbook is a list of examples/patterns that can be used directly, modified, or simply taken as a source of inspiration. But yes, we should link from the main pages to the relevant cookbook sections.

> A convenience method to set up this sort of thing is exactly what
> I mean. I should imagine that fanning out a Flow for embarrassingly
> parallel processing is a common enough pattern that one would want
> to be able to do this in a one-liner. You note something about
> work in this area later on (quoted out of order):

If you take that recipe you have a one-liner :) Our main philosophy is not to provide too many prepackaged combinators, but instead to encourage flexible use of the basic ones. It is the difference between giving someone a fish and teaching them how to fish :) The idea is that if certain patterns turn out to be widely used, we promote them to library-provided combinators.

>> In the future we will allow users to add explicit markers where
>> the materializer needs to cut up chains/graphs into concurrent
>> entities.
>
> This sounds very good. Is there a ticket I can subscribe to for
> updates? Is there a working name for the materializer so that I
> know what to look out for?
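The "explicit markers" idea quoted above is about where a chain of stages gets cut into concurrent pieces. As a rough illustration, here is a toy model in plain Scala using only the standard library; `fuse` and `boundary` are hypothetical names invented for this sketch, not anything from the Akka Streams API:

```scala
import java.util.concurrent.ArrayBlockingQueue

// "Fused": both stages collapse into one synchronous function, so each
// element passes through f and g on the same thread, one after another.
def fuse[A, B, C](f: A => B, g: B => C): A => C = f.andThen(g)

// "Boundary": stage g runs on its own thread behind bounded queues, so
// f can work on element n+1 while g is still processing element n.
// The bounded queue is the backpressure: a fast upstream blocks on
// `put` when the downstream stage lags behind.
def boundary[B, C](g: B => C, bufferSize: Int): (B => Unit, () => C) = {
  val in  = new ArrayBlockingQueue[B](bufferSize)
  val out = new ArrayBlockingQueue[C](bufferSize)
  val worker = new Thread(() => while (true) out.put(g(in.take())))
  worker.setDaemon(true)
  worker.start()
  ((b: B) => in.put(b), () => out.take())
}

val fused = fuse((x: Int) => x + 1, (x: Int) => x * 2)
val (push, pull) = boundary((x: Int) => x * 2, bufferSize = 2)
push(3)
```

A fused chain trades pipelining for lower overhead; the boundary buys concurrency between the two stages at the cost of a queue hand-off per element.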
Not really, there are multiple tickets. The ActorBasedFlowMaterializer has an Optimizations parameter which is currently not documented. It will eagerly collapse entities into synchronous ones as much as possible, but currently there is no API to add boundaries to this collapse procedure (e.g. when you have two map stages that you *do* want to keep concurrent and pipelined). Also, it currently cannot collapse graph elements, stream-of-stream elements, mapAsync and the timed elements. And remember that this is about pipelining, which is different from the parallelization demonstrated in the cookbook.

>> Also, you can try mapAsync or mapAsyncUnordered for similar
>> tasks.
>
> It would be good to read some discussion about these that goes
> further than the API docs. Do they just create a Future and
> therefore have no way to tell a fast producer to slow down?

mapAsync/mapAsyncUnordered will only create a bounded number of these Futures at any given time, and will emit their results one by one as they complete, once the downstream is able to consume them. Only then is a new Future created, so at any given time there is a bounded number of uncompleted Futures. In other words, Future completion is the backpressure signal to the upstream.

> How does one achieve pushback from these? Pushback on evaluation
> of the result is essential, not on the ability to create/schedule
> the futures. I would very much like to see some documentation that
> explains where this has an advantage over plain ol' Scala Stream
> with a map{Future{...}}.

Doing the above on a Scala Stream will create arbitrarily many Futures, and it will not wait for the results of those Futures. Our mapAsync, on the other hand, waits for the results of the Futures and emits them (i.e. it is a flattening operation), and only keeps a limited number of open Futures at any given time.

>>> In general, I felt that the documentation was missing any
>>> commentary on concurrency and parallelisation.
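The mapAsync behaviour Endre describes above, a bounded window of uncompleted Futures whose completion acts as the backpressure signal, can be modeled in plain Scala. This is only a simplified, pull-based sketch using the standard library (`boundedMapAsync` is a hypothetical name; the real Akka operator is demand-driven), but it shows the contrast with `Stream.map(x => Future(...))`, which starts arbitrarily many Futures and never waits for them:

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration
import ExecutionContext.Implicits.global

def boundedMapAsync[A, B](parallelism: Int)(f: A => Future[B])(xs: Iterator[A]): Iterator[B] = {
  // Sliding window of at most `parallelism` uncompleted Futures. A new
  // Future is only started once the oldest result has been consumed
  // downstream: awaiting completion *is* the pushback on the upstream.
  val window = scala.collection.mutable.Queue.empty[Future[B]]
  new Iterator[B] {
    def hasNext: Boolean = window.nonEmpty || xs.hasNext
    def next(): B = {
      while (window.size < parallelism && xs.hasNext) window.enqueue(f(xs.next()))
      Await.result(window.dequeue(), Duration.Inf)
    }
  }
}

// Like mapAsync (not mapAsyncUnordered), results are emitted in input
// order even though up to `parallelism` Futures run concurrently.
val squares = boundedMapAsync(2)((i: Int) => Future(i * i))(Iterator(1, 2, 3, 4, 5)).toList
```

Note that the flattening Endre mentions is visible in the types: the caller gets back `B`s, not `Future[B]`s.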
>>> I was left wondering what thread everything was happening on.
>>
>> ... as the default, all stream processing elements are backed by
>> an actor ...
>
> The very fact that each component is backed by an Actor is news to
> me. This wasn't at all obvious from the docs, and actually the
> "integration with actors" section made me think that streams must
> be implemented completely differently if it needs an integration
> layer!

The internals might or might not be actors. It all depends on what materializer is used (currently there is only one kind), and even now we have elements not backed by an Actor (CompletedSource and friends, for example). This is completely internal, and we don't want to document all aspects of it. But yes, we can add more high-level info about it.

> Actually, the "actor integration" really means "low level
> streaming actors", doesn't it?

I am not sure what this means.

> I would strongly recommend making this a lot clearer as it helps
> to de-mystify the implementation.
>
> Now knowing that each vertex in the graph is backed by an actor, I
> wonder if in "balancing work to fixed pools" the Balance is just
> adding a router actor with a balance strategy?

No, it is completely different: routers are not backpressure aware, so they cannot be used here. We implemented these elements as actors, but it is not necessary to do so.

> The convenience method I suggested above could simply create a
> router to multiple instances of the same actor with a parallelism
> bound.

Well, that is what the recipe does with the Balance and Merge elements.

> I'm not entirely sure why one would need a Merge strategy for
> that, although the option to preserve input order at the output
> would be good for some circumstances (which could apply pushback
> in preference to growing the queue of out-of-order results).

This is why I kept it as a recipe.
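On Sam's order-preservation point: the worker-pool recipe's shape, a Balance fanning out to N workers and a Merge reassembling their outputs, can be modeled in plain Scala. This is a sketch only (the real Balance/Merge are backpressure-aware stream junctions, and the default Balance routes to whichever worker currently has demand, not strictly round-robin):

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration
import ExecutionContext.Implicits.global

def roundRobinPool[A, B](workerCount: Int)(work: A => B)(xs: List[A]): List[B] = {
  // "Balance": deal element i to worker i % workerCount.
  val perWorker: Map[Int, List[A]] =
    xs.zipWithIndex.groupMap(_._2 % workerCount)(_._1)
  // Each worker processes its share concurrently.
  val results: Map[Int, Future[List[B]]] =
    perWorker.map { case (w, as) => w -> Future(as.map(work)) }
  // "Merge": take results back in strict round-robin worker order,
  // which restores the original input order at the output.
  val iterators = (0 until workerCount).map { w =>
    Await.result(results.getOrElse(w, Future.successful(Nil)), Duration.Inf).iterator
  }
  val out = List.newBuilder[B]
  var live = true
  while (live) {
    live = false
    for (it <- iterators if it.hasNext) { out += it.next(); live = true }
  }
  out.result()
}

val pooled = roundRobinPool(3)((i: Int) => i * 10)((1 to 10).toList)
```

Because element i always goes to worker i % N, and the merge takes one result from each worker in the same fixed order, no out-of-order queue is needed.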
If you create a strict round-robin Balance and a strict round-robin Merge on the other side, the parallelization step will keep the sequence. But you might want to add dropping elements, or batching elements, or whatnot. The point of the recipe is that you can tailor it completely to your needs, because you see how it works instead of just calling a doMagicForMe() method :)

> In addition, this revelation helps me to understand the
> performance considerations of using akka-streams. Knowing this, it
> would only be appropriate for something I would've considered
> suitable (from a performance point of view) for hand-crafted flow
> control in akka before streams was released. The main advantage of
> akka-streams is therefore that it has dramatically lowered the
> barrier of entry for writing Actor code with flow control.

Exactly!

> Thanks for this explanation Endre, I hope to see even more
> innovation in the coming releases and improved documentation.

Cheers,
-Endre

> Best regards,
> Sam
>
> --
> >>>>>>>>>> Read the docs: http://akka.io/docs/
> >>>>>>>>>> Check the FAQ: http://doc.akka.io/docs/akka/current/additional/faq.html
> >>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
> ---
> You received this message because you are subscribed to the Google Groups "Akka User List" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/akka-user.
> For more options, visit https://groups.google.com/d/optout.
