RDD groupBy

kvvt Mon, 23 Feb 2015 10:41:06 -0800

In the snippet below,

graph.edges.groupBy[VertexId](f1).foreach {
  edgesBySrc => {
    f2(edgesBySrc).foreach {
      vertexId => {
        println(vertexId)
      }
    }
  }
}


"f1" is a function that determines how to group the edges (in my case, it
groups them by source vertex).
"f2" is another function that does some computation on each group of edges.
It returns an Iterable[VertexId].
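
A minimal sketch of the shapes these two functions would need to have, assuming Spark GraphX is on the classpath. The Int edge attribute, the object name GroupFns, and the body of f2 are assumptions for illustration only; the post does not show the real definitions.

```scala
import org.apache.spark.graphx.{Edge, VertexId}

object GroupFns {
  // Assumption: edges carry an Int attribute; the post does not say.
  // f1 is the grouping key for an edge, here its source vertex.
  def f1(e: Edge[Int]): VertexId = e.srcId

  // groupBy yields (key, group) pairs, so f2 receives that pair.
  // Hypothetical body: return the destination vertex of every edge in the group.
  def f2(group: (VertexId, Iterable[Edge[Int]])): Iterable[VertexId] =
    group._2.map(_.dstId)
}
```

With these shapes, graph.edges.groupBy[VertexId](f1) has type RDD[(VertexId, Iterable[Edge[Int]])], which is what f2 consumes in the snippet above.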

*Questions:*

1. The problem is that "println(vertexId)" doesn't print anything. I have
made sure that "f2" doesn't return an empty iterable. I am not sure what I
am missing here.

2. I am assuming that "f2" is called for each group in parallel. Is this
correct? If not, what is the correct way to operate on each group in
parallel?


Thanks!


