The correct statement would be: “groupBy does not automatically introduce any per-key parallelism.”
There are several other combinators that can be used to introduce parallelism between processing stages (like viaAsync, mapAsync and potentially the flatMapX methods—if the provided sources declare async boundaries). We try to keep the combinators and concepts as orthogonal and composable as possible. Regards, Roland > 5 feb 2016 kl. 16:24 skrev Richard Rodseth <[email protected]>: > > In an effort to be more succinct :) Is this a true statement? > > "groupBy does not automatically introduce any *per-key* parallelism, unless > followed by mapAsync" > > On Thu, Feb 4, 2016 at 3:05 PM, Richard Rodseth <[email protected] > <mailto:[email protected]>> wrote: > I guess I'm still a bit confused by parallelism in akka streams, but let me > describe what I have. > > Tenants have Sites which have Channels which have Intervals (start end value) > > My root source is a stream of TenantSiteChannelInfo (obtained from a join of > channels with their sites and tenants) > > I am successfully writing files, one per channel-month arranged by tenant and > site, eg. > > /archive/<tenantid>/<siteid>/<channelid>/<yearmonth>_<channelid>.txt > > using a flow like this > > def channelToBytesWritten(db: JdbcBackend.DatabaseDef, batchSize: Int, > requestedRange:InstantRange): Flow[TenantSiteChannelInfo,(String,Long),Unit] > = { > > val result = Flow[TenantSiteChannelInfo] > > .via(Transformations.channelToChannelMonths(requestedRange)) // uses > flatMapConcat > > .via(Transformations.channelMonthToChannelMonthIntervals(db, > batchSize)) // uses flatMapConcat > > .via(Transformations.channelMonthIntervalToBytesWritten) // see below > > result > > } > > Transformations.channelMonthIntervalToBytesWritten does a > groupBy.prefixAndTail(1).mapAsync as discussed in a recent thread, where the > key is (year-month,channelid) > > The result is I see the files showing up in the Finder in parallel, which is > OK and fun to watch. The number of distinct keys is rather large but I could > presumably throttle the source channels if necessary to address that. > > But suppose I just/also wanted to process each *channel* (the very first type > of stream element) in parallel? Does inserting a groupBy(_channelId).viaAsync > at the beginning of this flow and merging substreams at the end achieve that, > or is running the latter half of the flow inside mapAsync the only/best way? > > If the groupBy wouldn't achieve per-channel parallelism, is it fair to say > that groupBy only has utility if followed by some sort of "folding", as in > the word count example or the file writing above, or if the receiving sink > (shared by all substreams, an actor perhaps) can key off the same key in some > way. > > Thanks. > > > > -- > >>>>>>>>>> Read the docs: http://akka.io/docs/ <http://akka.io/docs/> > >>>>>>>>>> Check the FAQ: > >>>>>>>>>> http://doc.akka.io/docs/akka/current/additional/faq.html > >>>>>>>>>> <http://doc.akka.io/docs/akka/current/additional/faq.html> > >>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user > >>>>>>>>>> <https://groups.google.com/group/akka-user> > --- > You received this message because you are subscribed to the Google Groups > "Akka User List" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to [email protected] > <mailto:[email protected]>. > To post to this group, send email to [email protected] > <mailto:[email protected]>. > Visit this group at https://groups.google.com/group/akka-user > <https://groups.google.com/group/akka-user>. > For more options, visit https://groups.google.com/d/optout > <https://groups.google.com/d/optout>. Dr. Roland Kuhn Akka Tech Lead Typesafe <http://typesafe.com/> – Reactive apps on the JVM. twitter: @rolandkuhn <http://twitter.com/#!/rolandkuhn> -- >>>>>>>>>> Read the docs: http://akka.io/docs/ >>>>>>>>>> Check the FAQ: >>>>>>>>>> http://doc.akka.io/docs/akka/current/additional/faq.html >>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user --- You received this message because you are subscribed to the Google Groups "Akka User List" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To post to this group, send email to [email protected]. Visit this group at https://groups.google.com/group/akka-user. For more options, visit https://groups.google.com/d/optout.
