[akka-user] Varying result when running akka flow in parallel

subopt1 Wed, 14 Dec 2016 14:11:42 -0800

Hi,

I'm working with a flow that downloads data, parses JSON, and adds ids to a 
set (dedupe). It works fine, but when I modify the flow to run in parallel, 
I get different results.


Here's my graph:

val graph: RunnableGraph[Future[HashSet[Long]]] =
  Source.fromGraph(new MinuteSource(firstMinuteYesterday, firstMinuteYesterday.plusDays(1)))
  .via(dsl(parallelize = 4))
  .toMat(Sink.fold(new HashSet[Long]())((accSet, set) => {
    accSet ++ set
  }))(Keep.right)


val deduped: Set[Long] = Await.result(graph.run(), Duration.Inf)

println(s"seq size is ${deduped.size} in ${new Duration(start, new DateTime()).toString}")


The dsl looks like

def dsl(parallelize: Int) = Flow.fromGraph(GraphDSL.create() { implicit builder =>
  import GraphDSL.Implicits._

  val dispatcher = builder.add(Balance[DateTime](parallelize))
  val merger = builder.add(Merge[Set[Long]](parallelize))

  for (i <- 0 until parallelize) {
    dispatcher.out(i) ~> consumptionFlow.async ~> merger.in(i)
  }

  FlowShape(dispatcher.in, merger.out)
})


Here are the results for different parallelize values:


// parallelize 1 -> seq size is 48560 in 175
// parallelize 2 -> seq size is 48531 in 117
// parallelize 4 -> seq size is 48481 in 107


The resulting set size varies with the parallelize value. What's 
interesting is that the size for a given value is consistent across runs. 
Does this make sense to anyone? Thanks!
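
As a sanity check on the fold itself, here is a minimal sketch using plain Scala collections, with no Akka involved. The id batches are made up and only stand in for whatever consumptionFlow emits; UnionOrderCheck and dedupe are hypothetical names. It checks that the same set union gives the same result for every arrival order the Merge stage could produce:

```scala
import scala.collection.immutable.HashSet

object UnionOrderCheck {
  // Hypothetical per-minute id sets; real data would come from consumptionFlow.
  val batches: List[Set[Long]] =
    List(Set(1L, 2L, 3L), Set(3L, 4L), Set(5L), Set(2L, 5L, 6L))

  // Same accumulation as the Sink.fold in the graph: union every batch into one set.
  def dedupe(xs: Seq[Set[Long]]): HashSet[Long] =
    xs.foldLeft(HashSet.empty[Long])((accSet, set) => accSet ++ set)

  def main(args: Array[String]): Unit = {
    val inOrder = dedupe(batches)
    // Try every possible arrival order and compare against the in-order result.
    val orderIndependent = batches.permutations.forall(p => dedupe(p) == inOrder)
    println(s"size=${inOrder.size}, order-independent=$orderIndependent")
  }
}
```

Set union is commutative and associative, so the order in which batches reach the fold should not, by itself, change the final size.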

Andrew
