Hi Richard,
On Fri, Feb 19, 2016 at 11:11 PM, Richard Rodseth <[email protected]> wrote: > Thought I'd start a new thread for my latest stumbling block, while I > explore some options that don't feel great. > > Short version: > flatMapMerge has a "breadth" parameter which limits the number of > substreams in flight. groupBy() does not. > Yes, it does: http://doc.akka.io/api/akka/2.4.2/index.html#akka.stream.scaladsl.Flow@groupBy[K](maxSubstreams:Int,f:Out= >K):akka.stream.scaladsl.SubFlow[Out,Mat,FlowOps.this.Repr,FlowOps.this.Closed] Are you on 2.4.2? > If maxSubstreams is exceeded the stream will fail. > Ah, I see what you mean. There is no way to backpressure the upstream based on group count since how would you know whether the next element belongs to a new group or not? E.g. you limited your group count to 4, and then suddenly receive something that belongs to group 5. Your use case implicitly implies that you *already* have the groups available you don't need to dynamically compute them. I am not sure if groupBy is the best option in that case. > I am grouping stream elements and writing each group to a file. How can I > limit the number of open files? > You are asking the impossible here :) Think a bit through and you will realize that it is mathematically impossible to - dynamically group to potentially M groups where every new upstream element can belong to an arbitrary group - at the same time limit the number of groups to N < M without failing the stream or dropping elements So either you cannot limit the number of groups, or you need to make sure you feed your groups sequentially, in which case a simple splitWhen/splitAfter might be a better fit (with less parallelization opportunities) -Endre > Background: > In my case the stream elements are channels, channel-months, > channel-month-intervaldata. > I've coded up a possible solution in which I used grouped(n) to get a > batch of channel-months, then mapAsync(1) to run a separate stream that > gets the intervals for that batch of channel-months, groups them and > writes to files. > > def writeChannelMonths(channelMonths: List[ChannelAndInstantRange]): > Future[Seq[FileWritten]] > So the size of n in the upfront grouped(n) will have the effect of > limiting files open at a time. > > But I'm wondering if there's something more elegant. Not sure if this > ticket discussed elsewhere will help: > https://github.com/akka/akka/issues/18969 > I think not, unless a max open files was built into the sink stage, and > caused backpressure if exceeded. > I sort of feel like a groupBy with two limits (max distinct groups and max > active groups) is what I need. > > > -- > >>>>>>>>>> Read the docs: http://akka.io/docs/ > >>>>>>>>>> Check the FAQ: > http://doc.akka.io/docs/akka/current/additional/faq.html > >>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user > --- > You received this message because you are subscribed to the Google Groups > "Akka User List" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to [email protected]. > To post to this group, send email to [email protected]. > Visit this group at https://groups.google.com/group/akka-user. > For more options, visit https://groups.google.com/d/optout. > -- Akka Team Typesafe - Reactive apps on the JVM Blog: letitcrash.com Twitter: @akkateam -- >>>>>>>>>> Read the docs: http://akka.io/docs/ >>>>>>>>>> Check the FAQ: >>>>>>>>>> http://doc.akka.io/docs/akka/current/additional/faq.html >>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user --- You received this message because you are subscribed to the Google Groups "Akka User List" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To post to this group, send email to [email protected]. Visit this group at https://groups.google.com/group/akka-user. For more options, visit https://groups.google.com/d/optout.
