Hi Richard,


On Fri, Feb 19, 2016 at 11:11 PM, Richard Rodseth <[email protected]>
wrote:

> Thought I'd start a new thread for my latest stumbling block, while I
> explore some options that don't feel great.
>
> Short version:
> flatMapMerge has a "breadth" parameter which limits the number of
> substreams in flight. groupBy() does not.
>

Yes, it does:
http://doc.akka.io/api/akka/2.4.2/index.html#akka.stream.scaladsl.Flow@groupBy[K](maxSubstreams:Int,f:Out=>K):akka.stream.scaladsl.SubFlow[Out,Mat,FlowOps.this.Repr,FlowOps.this.Closed]

Are you on 2.4.2?



> If maxSubstreams is exceeded the stream will fail.
>

Ah, I see what you mean. There is no way to backpressure the upstream based
on group count, since how would you know whether the next element belongs to
a new group or not? E.g. you limit your group count to 4 and then suddenly
receive something that belongs to group 5. Your use case implies that you
*already* have the groups available and don't need to compute them
dynamically. I am not sure groupBy is the best option in that case.
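
To make the failure mode concrete, here is a rough plain-Scala model of a
grouping step with a cap on the number of groups. It does not use Akka at
all: groupWithLimit and its Either result are hypothetical names made up for
this illustration, and the real groupBy fails the stream rather than
returning a value. The point is only that the decision has to be made per
element, after the element has already arrived.

```scala
object GroupLimitSketch {
  // Hypothetical model, NOT the Akka API: fold over the elements and keep
  // one bucket per key. Returns Left(key) for the first key that would need
  // bucket number maxSubstreams + 1, otherwise Right(all buckets).
  def groupWithLimit[K, V](elems: Seq[V], maxSubstreams: Int)(
      key: V => K): Either[K, Map[K, Vector[V]]] = {
    val init: Either[K, Map[K, Vector[V]]] = Right(Map.empty)
    elems.foldLeft(init) {
      case (Right(groups), e) =>
        val k = key(e)
        if (groups.contains(k)) Right(groups.updated(k, groups(k) :+ e))
        else if (groups.size < maxSubstreams) Right(groups.updated(k, Vector(e)))
        else Left(k) // a new group is needed, but the cap is already reached
      case (failed, _) => failed // once failed, stay failed
    }
  }
}
```

With a cap of 4, GroupLimitSketch.groupWithLimit(Seq(1, 2, 3, 4, 5), 4)(identity)
ends in Left(5): by the time we learn that 5 starts a fifth group, it is
too late to ask the upstream not to send it.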



> I am grouping stream elements and writing each group to a file. How can I
> limit the number of open files?
>

You are asking the impossible here :) Think it through and you will
realize that it is mathematically impossible to
 - dynamically group into potentially M groups, where every new upstream
element can belong to an arbitrary group
 - at the same time limit the number of groups to N < M without failing the
stream or dropping elements

So either you cannot limit the number of groups, or you need to make sure
you feed your groups sequentially, in which case a simple
splitWhen/splitAfter might be a better fit (with fewer parallelization
opportunities).
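
As a sketch of what "feeding groups sequentially" buys you, here is a
plain-Scala model (again no Akka; splitOnKeyChange is a hypothetical helper,
not a library function). It assumes the input is already ordered by key and
starts a new group whenever the key changes, which is roughly the shape of
a splitWhen-style split.

```scala
object SequentialSplitSketch {
  // Hypothetical model, NOT the Akka API: assumes elems is already ordered
  // by key. A new group starts whenever the key differs from the key of the
  // group currently being filled, so only the last group is ever "open".
  def splitOnKeyChange[K, V](elems: Seq[V])(key: V => K): Vector[Vector[V]] =
    elems.foldLeft(Vector.empty[Vector[V]]) { (acc, e) =>
      acc.lastOption match {
        case Some(cur) if key(cur.head) == key(e) =>
          acc.init :+ (cur :+ e) // same key: extend the open group
        case _ =>
          acc :+ Vector(e) // key changed (or first element): open a new group
      }
    }
}
```

Because an earlier group is finished as soon as the key changes, you would
only need one file open at a time. The price is that unordered input gets
split into more groups than there are distinct keys, and there is nothing
left to run in parallel.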

-Endre


> Background:
> In my case the stream elements are channels, channel-months,
> channel-month-intervaldata.
> I've coded up a possible solution in which I used grouped(n) to get a
> batch of channel-months, then mapAsync(1) to run a separate stream that
> gets the intervals for that batch of channel-months, groups them and
> writes to files.
>
> def writeChannelMonths(channelMonths: List[ChannelAndInstantRange]):
> Future[Seq[FileWritten]]
> So the size of n in the upfront grouped(n) will have the effect of
> limiting files open at a time.
>
> But I'm wondering if there's something more elegant. Not sure if this
> ticket discussed elsewhere will help:
> https://github.com/akka/akka/issues/18969
> I think not, unless a max open files was built into the sink stage, and
> caused backpressure if exceeded.
> I sort of feel like a groupBy with two limits (max distinct groups and max
> active groups) is what I need.
>
>
> --
> >>>>>>>>>> Read the docs: http://akka.io/docs/
> >>>>>>>>>> Check the FAQ:
> http://doc.akka.io/docs/akka/current/additional/faq.html
> >>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
> ---
> You received this message because you are subscribed to the Google Groups
> "Akka User List" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/akka-user.
> For more options, visit https://groups.google.com/d/optout.
>



-- 
Akka Team
Typesafe - Reactive apps on the JVM
Blog: letitcrash.com
Twitter: @akkateam
