To be clear, I didn't mean concurrent writes to the same file, I meant
writing multiple per-key files simultaneously. And I get that the example
given achieves that because of the mapAsync performed on the head of the
substream.
As far as general semantics of groupBy, I think I get it. Per Roland, the
different substreams use the same interpreter, so in the word count example
groupBy(...).fold(...), the folds are not executed concurrently, but as you
suggest, if one wraps the fold in a reusable flow, then
groupBy(...).viaAsync(MyFlows.fold..) introduces concurrency.
Loose ends: The documentation for groupBy talks about emitting a group,
which seems like a hold-over from pre-SubFlow days.
Observation: In a "noop" groupBy
val x2 =Source(List("one", "five", "three", "two", "four")).groupBy(10000,
_.length).mergeSubstreams.to(Sink.foreach { e => println(e) }).run()
the untransformed elements are emitted in their original order
Bug? If I replace mergeSubStreams with concatSubstreams in the above, the
program hangs.
Thanks for bearing with me on this.
On Wed, Feb 3, 2016 at 10:17 AM, Viktor Klang <[email protected]>
wrote:
> Put in async boundaries where you want to have them. And writing to file
> concurrently is likely not faster, but as always needs to be measured.
>
> On Wed, Feb 3, 2016 at 6:55 PM, Richard Rodseth <[email protected]>
> wrote:
>
>> Write sub streams to files as fast as possible. But this latest was just
>> me trying to understand groupBy. I'm unclear whether the substreams are
>> processed concurrently (in the case where there is no mapAsync). In other
>> words if I call to() to pipe all substreams to the same actor will the
>> actor receive interleaved items? And if not will the outputs at least be
>> computed in parallel?
>>
>> Sent from my phone - will be brief
>>
>> On Feb 3, 2016, at 9:42 AM, Viktor Klang <[email protected]> wrote:
>>
>> I don't understand the question: What are you trying to achieve?
>>
>> On Wed, Feb 3, 2016 at 5:55 PM, Richard Rodseth <[email protected]>
>> wrote:
>>
>>> Ok. I suppose I should examine the GroupBy or SubFlow source code, but
>>> if I understand correctly different stages will run concurrently (if fusing
>>> is off or async boundaries have been added), but there's not a separate
>>> actor for each substream in a SubFlow?
>>>
>>> On Wed, Feb 3, 2016 at 7:35 AM, Francesco Di Muccio <
>>> [email protected]> wrote:
>>>
>>>>
>>>> Il giorno mercoledì 3 febbraio 2016 15:30:29 UTC+1, rrodseth ha scritto:
>>>>>
>>>>> Thanks very much. Actually, I would argue this is preferable to what's
>>>>> in the template now, and both deserve a juicy comment!
>>>>>
>>>>> Does groupBy alone introduce any parallelism? With/without fusing? In
>>>>> this example, if there were n log levels rather than 5, would more than 5
>>>>> files be written concurrently?
>>>>>
>>>>> In this example the 5 (or n) files are written concurrently, but one
>>>> problem of this approach is that if mapAsync parallelism is less than 5 it
>>>> can deadlock the whole stream.
>>>>
>>>>
>>>>> With the proposed new two-input stage in akka, would parallel file
>>>>> writing be internal and configurable?
>>>>>
>>>>> This might be a good time to say that I am extremely impressed with
>>>>> the thought and ingenuity that has gone into this library, and love using
>>>>> it. Thanks!
>>>>>
>>>>> On Wed, Feb 3, 2016 at 2:56 AM, Roland Kuhn <[email protected]> wrote:
>>>>>
>>>>>> Yes, this is exactly what I was referring to, and I hope it is clear
>>>>>> that we don’t want to show this approach to users in the Activator
>>>>>> template—we need a better solution :-)
>>>>>>
>>>>>> Thanks Francesco!
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Roland
>>>>>>
>>>>>> 3 feb 2016 kl. 10:31 skrev Francesco Di Muccio <
>>>>>> [email protected]>:
>>>>>>
>>>>>>
>>>>>>
>>>>>> Il giorno mercoledì 3 febbraio 2016 01:54:55 UTC+1, rrodseth ha
>>>>>> scritto:
>>>>>>>
>>>>>>> No worries. I wish I had the time and expertise to help.
>>>>>>>
>>>>>>> I don't mean to be a pest, but since my credibility with management
>>>>>>> is at stake [ :) ] can anyone suggest any ways I can tackle the
>>>>>>> problem of
>>>>>>> groupBy followed by write to file?
>>>>>>>
>>>>>>> I might be able to get away with draining each substream to a
>>>>>>> collection and then using mapAsync to run a separate stream (per-key) to
>>>>>>> write the file, returning a Future as required by mapAsync. Not unlike
>>>>>>> where the activator template has ended up, but I'd really rather not.
>>>>>>>
>>>>>>> But I also find myself wondering if I could do something with fold,
>>>>>>> where the initial value contains the Sink for that substream.
>>>>>>>
>>>>>>> Roland also alluded to a hack(?) using prefixAndTail(0), but I don't
>>>>>>> quite see what that would look like.
>>>>>>>
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I can try to help using prefixAndTail, I came up with this so far:
>>>>>>
>>>>>> FileIO.fromFile(logFile).
>>>>>> // parse chunks of bytes into lines
>>>>>> via(Framing.delimiter(ByteString(System.lineSeparator),
>>>>>> maximumFrameLength = 512, allowTruncation = true)).
>>>>>> map(_.utf8String).
>>>>>> map {
>>>>>> case line@LoglevelPattern(level) => (level, line)
>>>>>> case line@other => ("OTHER", line)
>>>>>> }.
>>>>>> // group them by log level
>>>>>> groupBy(5, _._1).
>>>>>> prefixAndTail(1).
>>>>>> // write lines of each group to a separate file
>>>>>> mapAsync(5) {
>>>>>> case (Seq((level, line)), source) =>
>>>>>> Source.single(line)
>>>>>> .concat(source.map(_._2))
>>>>>> .map(line => ByteString(line + "\n"))
>>>>>> .runWith(FileIO.toFile(new
>>>>>> java.io.File(s"target/log-$level.txt")))
>>>>>>
>>>>>> // Not sure that this will ever happen
>>>>>> case _ =>
>>>>>> Future.successful(0)
>>>>>> }.
>>>>>> mergeSubstreams.
>>>>>> runWith(Sink.onComplete { _ =>
>>>>>> system.shutdown()
>>>>>> })
>>>>>>
>>>>>> What prefixAndTail does (if I got it right) emits a pair where the
>>>>>> first value is a sequence of n elements (prefix), and
>>>>>> the second value is the rest (tail) lifted to a source. Here I'm
>>>>>> using prefixAndTail(1) because I need to extract the level
>>>>>> from the first element to calculate the filename.
>>>>>>
>>>>>> Hope it helps,
>>>>>> Francesco
>>>>>>
>>>>>>
>>>>>>> Clutching at straws!
>>>>>>>
>>>>>>> On Tue, Feb 2, 2016 at 12:27 AM, Roland Kuhn <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Yes, indeed, I didn’t have the time to get that link yesterday. If
>>>>>>>> anyone wants to work on that: contributions are *always* welcome!
>>>>>>>> :-)
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>>
>>>>>>>> Roland
>>>>>>>>
>>>>>>>> 1 feb 2016 kl. 23:59 skrev Richard Rodseth <[email protected]>:
>>>>>>>>
>>>>>>>> For anyone following along, I believe this is the issue Roland
>>>>>>>> refers to
>>>>>>>>
>>>>>>>> https://github.com/akka/akka/issues/18969
>>>>>>>>
>>>>>>>> On Mon, Feb 1, 2016 at 2:28 PM, Richard Rodseth <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Ouch. Thanks.
>>>>>>>>>
>>>>>>>>> On Mon, Feb 1, 2016 at 1:49 PM, Roland Kuhn <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Richard,
>>>>>>>>>>
>>>>>>>>>> this is not yet solved, and we have an issue tracking this in
>>>>>>>>>> akka/akka as well. It is not certain that we will be able to fix this
>>>>>>>>>> before 2.4.2 comes out or whether the API addition that is necessary
>>>>>>>>>> will
>>>>>>>>>> have to wait until 2.4.3.
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>>
>>>>>>>>>> Roland
>>>>>>>>>>
>>>>>>>>>> 1 feb 2016 kl. 22:34 skrev Richard Rodseth <[email protected]>:
>>>>>>>>>>
>>>>>>>>>> I'm concerned that this might fall through the cracks since the
>>>>>>>>>> GitHub issue is written against the Activator template rather than
>>>>>>>>>> akka
>>>>>>>>>> itself.
>>>>>>>>>> If it's already solved in 2.4.x please let me know.
>>>>>>>>>> If anyone can offer any guidance for a workaround, I'd appreciate
>>>>>>>>>> it. In the linked discussion Roland mentioned something about
>>>>>>>>>> prefixAndTail
>>>>>>>>>> but said you wouldn't want that in a tutorial.
>>>>>>>>>> I also think it would be helpful for those who have seen both the
>>>>>>>>>> pre-SubFlow and post-SubFlow templates if the reason for this
>>>>>>>>>> unfortunate
>>>>>>>>>> regression was stated in the activator template.
>>>>>>>>>>
>>>>>>>>>> On Sun, Jan 31, 2016 at 6:23 PM, Richard Rodseth <
>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> I have run into this issue
>>>>>>>>>>>
>>>>>>>>>>> https://github.com/typesafehub/activator-akka-stream-scala/issues/37
>>>>>>>>>>>
>>>>>>>>>>> I want to group a stream and write each substream to a separate
>>>>>>>>>>> file. A pretty common use case, I'd imagine.
>>>>>>>>>>>
>>>>>>>>>>> The old version of the GroupLog example showed a groupBy()
>>>>>>>>>>> followed by a to()
>>>>>>>>>>>
>>>>>>>>>>> Because of the introduction of SubFlow, the new version had to
>>>>>>>>>>> introduce an unacceptable fold() into a strict collection
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> https://github.com/typesafehub/activator-akka-stream-scala/blob/c63afb5eecf50726c4417157efd26d697b621e99/src/main/scala/sample/stream/GroupLogFile.scala
>>>>>>>>>>>
>>>>>>>>>>> What are my options? ETA for a fix?
>>>>>>>>>>> Thanks
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> >>>>>>>>>> Read the docs: http://akka.io/docs/
>>>>>>>>>> >>>>>>>>>> Check the FAQ:
>>>>>>>>>> http://doc.akka.io/docs/akka/current/additional/faq.html
>>>>>>>>>> >>>>>>>>>> Search the archives:
>>>>>>>>>> https://groups.google.com/group/akka-user
>>>>>>>>>> ---
>>>>>>>>>> You received this message because you are subscribed to the
>>>>>>>>>> Google Groups "Akka User List" group.
>>>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>>>> send an email to [email protected].
>>>>>>>>>> To post to this group, send email to [email protected].
>>>>>>>>>> Visit this group at https://groups.google.com/group/akka-user.
>>>>>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> *Dr. Roland Kuhn*
>>>>>>>>>> *Akka Tech Lead*
>>>>>>>>>> Typesafe <http://typesafe.com/> – Reactive apps on the JVM.
>>>>>>>>>> twitter: @rolandkuhn
>>>>>>>>>> <http://twitter.com/#!/rolandkuhn>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> >>>>>>>>>> Read the docs: http://akka.io/docs/
>>>>>>>>>> >>>>>>>>>> Check the FAQ:
>>>>>>>>>> http://doc.akka.io/docs/akka/current/additional/faq.html
>>>>>>>>>> >>>>>>>>>> Search the archives:
>>>>>>>>>> https://groups.google.com/group/akka-user
>>>>>>>>>> ---
>>>>>>>>>> You received this message because you are subscribed to the
>>>>>>>>>> Google Groups "Akka User List" group.
>>>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>>>> send an email to [email protected].
>>>>>>>>>> To post to this group, send email to [email protected].
>>>>>>>>>> Visit this group at https://groups.google.com/group/akka-user.
>>>>>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> >>>>>>>>>> Read the docs: http://akka.io/docs/
>>>>>>>> >>>>>>>>>> Check the FAQ:
>>>>>>>> http://doc.akka.io/docs/akka/current/additional/faq.html
>>>>>>>> >>>>>>>>>> Search the archives:
>>>>>>>> https://groups.google.com/group/akka-user
>>>>>>>> ---
>>>>>>>> You received this message because you are subscribed to the Google
>>>>>>>> Groups "Akka User List" group.
>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>> send an email to [email protected].
>>>>>>>> To post to this group, send email to [email protected].
>>>>>>>> Visit this group at https://groups.google.com/group/akka-user.
>>>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> *Dr. Roland Kuhn*
>>>>>>>> *Akka Tech Lead*
>>>>>>>> Typesafe <http://typesafe.com/> – Reactive apps on the JVM.
>>>>>>>> twitter: @rolandkuhn
>>>>>>>> <http://twitter.com/#!/rolandkuhn>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> >>>>>>>>>> Read the docs: http://akka.io/docs/
>>>>>>>> >>>>>>>>>> Check the FAQ:
>>>>>>>> http://doc.akka.io/docs/akka/current/additional/faq.html
>>>>>>>> >>>>>>>>>> Search the archives:
>>>>>>>> https://groups.google.com/group/akka-user
>>>>>>>> ---
>>>>>>>> You received this message because you are subscribed to the Google
>>>>>>>> Groups "Akka User List" group.
>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>> send an email to [email protected].
>>>>>>>> To post to this group, send email to [email protected].
>>>>>>>> Visit this group at https://groups.google.com/group/akka-user.
>>>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> --
>>>>>> >>>>>>>>>> Read the docs: http://akka.io/docs/
>>>>>> >>>>>>>>>> Check the FAQ:
>>>>>> http://doc.akka.io/docs/akka/current/additional/faq.html
>>>>>> >>>>>>>>>> Search the archives:
>>>>>> https://groups.google.com/group/akka-user
>>>>>> ---
>>>>>> You received this message because you are subscribed to the Google
>>>>>> Groups "Akka User List" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>> send an email to [email protected].
>>>>>> To post to this group, send email to [email protected].
>>>>>> Visit this group at https://groups.google.com/group/akka-user.
>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> *Dr. Roland Kuhn*
>>>>>> *Akka Tech Lead*
>>>>>> Typesafe <http://typesafe.com/> – Reactive apps on the JVM.
>>>>>> twitter: @rolandkuhn
>>>>>> <http://twitter.com/#!/rolandkuhn>
>>>>>>
>>>>>> --
>>>>>> >>>>>>>>>> Read the docs: http://akka.io/docs/
>>>>>> >>>>>>>>>> Check the FAQ:
>>>>>> http://doc.akka.io/docs/akka/current/additional/faq.html
>>>>>> >>>>>>>>>> Search the archives:
>>>>>> https://groups.google.com/group/akka-user
>>>>>> ---
>>>>>> You received this message because you are subscribed to the Google
>>>>>> Groups "Akka User List" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>> send an email to [email protected].
>>>>>> To post to this group, send email to [email protected].
>>>>>> Visit this group at https://groups.google.com/group/akka-user.
>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>
>>>>>
>>>>> --
>>>> >>>>>>>>>> Read the docs: http://akka.io/docs/
>>>> >>>>>>>>>> Check the FAQ:
>>>> http://doc.akka.io/docs/akka/current/additional/faq.html
>>>> >>>>>>>>>> Search the archives:
>>>> https://groups.google.com/group/akka-user
>>>> ---
>>>> You received this message because you are subscribed to the Google
>>>> Groups "Akka User List" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to [email protected].
>>>> To post to this group, send email to [email protected].
>>>> Visit this group at https://groups.google.com/group/akka-user.
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>>>
>>> --
>>> >>>>>>>>>> Read the docs: http://akka.io/docs/
>>> >>>>>>>>>> Check the FAQ:
>>> http://doc.akka.io/docs/akka/current/additional/faq.html
>>> >>>>>>>>>> Search the archives:
>>> https://groups.google.com/group/akka-user
>>> ---
>>> You received this message because you are subscribed to the Google
>>> Groups "Akka User List" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To post to this group, send email to [email protected].
>>> Visit this group at https://groups.google.com/group/akka-user.
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>>
>>
>> --
>> Cheers,
>> √
>>
>> --
>> >>>>>>>>>> Read the docs: http://akka.io/docs/
>> >>>>>>>>>> Check the FAQ:
>> http://doc.akka.io/docs/akka/current/additional/faq.html
>> >>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
>> ---
>> You received this message because you are subscribed to the Google Groups
>> "Akka User List" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To post to this group, send email to [email protected].
>> Visit this group at https://groups.google.com/group/akka-user.
>> For more options, visit https://groups.google.com/d/optout.
>>
>> --
>> >>>>>>>>>> Read the docs: http://akka.io/docs/
>> >>>>>>>>>> Check the FAQ:
>> http://doc.akka.io/docs/akka/current/additional/faq.html
>> >>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
>> ---
>> You received this message because you are subscribed to the Google Groups
>> "Akka User List" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To post to this group, send email to [email protected].
>> Visit this group at https://groups.google.com/group/akka-user.
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
>
> --
> Cheers,
> √
>
> --
> >>>>>>>>>> Read the docs: http://akka.io/docs/
> >>>>>>>>>> Check the FAQ:
> http://doc.akka.io/docs/akka/current/additional/faq.html
> >>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
> ---
> You received this message because you are subscribed to the Google Groups
> "Akka User List" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/akka-user.
> For more options, visit https://groups.google.com/d/optout.
>
--
>>>>>>>>>> Read the docs: http://akka.io/docs/
>>>>>>>>>> Check the FAQ:
>>>>>>>>>> http://doc.akka.io/docs/akka/current/additional/faq.html
>>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
---
You received this message because you are subscribed to the Google Groups "Akka
User List" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/akka-user.
For more options, visit https://groups.google.com/d/optout.