I'm a Scala / Spark / GraphX newbie, so I may be missing something obvious.
I have a set of edges that I read into a graph. For an iterative
community-detection algorithm, I want to assign each vertex to a community
with the name of the vertex. Intuitively it seems like I should be able to
pull the vertexID out of the VertexRDD and build a new VertexRDD with 2 Int
attributes. Unfortunately I'm not finding the recipe to unpack the
VertexRDD into the vertexID and attribute pieces.
The code snippet that builds the graph looks like this:
import org.apache.spark._
import org.apache.spark.graphx._
import org.apache.spark.rdd.RDD
val G = GraphLoader.edgeListFile(sc,"[[...]]clique_5_2_3.edg")
Poking at G to see what it looks like, I see
scala> :type G.vertices
org.apache.spark.graphx.VertexRDD[Int]
scala> G.vertices.collect()
res1: Array[(org.apache.spark.graphx.VertexId, Int)] = Array((10002,1),
(4,1), (10001,1), (10000,1), (0,1), (1,1), (10003,1), (3,1), (10004,1),
(2,1))
I've tried several ways to pull out just the first element of each tuple
into a new variable, with no success.
scala> var (x: Int) = G.vertices
<console>:21: error: type mismatch;
found : org.apache.spark.graphx.VertexRDD[Int]
required: Int
var (x: Int) = G.vertices
^
scala> val x: Int = G.vertices._1
<console>:21: error: value _1 is not a member of
org.apache.spark.graphx.VertexRDD[Int]
val x: Int = G.vertices._1
^
What am I missing?
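
One reading of the two errors above: G.vertices is a distributed collection of (VertexId, Int) pairs rather than a single pair, so the pieces have to be pulled out per element, and VertexId is an alias for Long, not Int. Below is a minimal sketch of that idea, not a definitive recipe. It assumes the spark-shell `sc` and uses a tiny in-memory graph `g` as a stand-in for G. The names `edges`, `g`, `ids`, `attrs` and `seeded` are made up for illustration.

```scala
import org.apache.spark.graphx._
import org.apache.spark.rdd.RDD

// Hypothetical stand-in for G: a 3-vertex graph built in memory (assumes `sc`).
// Each vertex gets the default attribute 1, matching the collect() output above.
val edges: RDD[(VertexId, VertexId)] =
  sc.parallelize(Seq((0L, 1L), (1L, 2L), (2L, 0L)))
val g: Graph[Int, Int] = Graph.fromEdgeTuples(edges, 1)

// A VertexRDD[Int] is an RDD[(VertexId, Int)], so map over its elements
// and pattern-match each tuple to get one piece or the other.
val ids: RDD[VertexId] = g.vertices.map { case (id, _) => id }
val attrs: RDD[Int] = g.vertices.map { case (_, attr) => attr }

// Seed every vertex's community with its own ID, keeping the old attribute.
// The new vertex attribute is a (VertexId, Int) pair, i.e. (Long, Int).
val seeded: Graph[(VertexId, Int), Int] =
  g.mapVertices((id, attr) => (id, attr))

seeded.vertices.collect().foreach(println)
```

mapVertices keeps the result a Graph, which seems the better fit for an iterative algorithm. The plain map calls return ordinary RDDs and drop the graph structure.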