[ https://issues.apache.org/jira/browse/TINKERPOP-1655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15936962#comment-15936962 ]
Matthew Stahl commented on TINKERPOP-1655:
------------------------------------------
Yikes. Not a bug then, but a *big* limitation.
Since my tree has the vertex ids, I suppose I can post-process it to look up
the vertex properties. Not ideal, but it should work: OLAP does the bulk of
the work, and OLTP decorates the results.
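A minimal sketch of that post-processing step, assuming the OLAP result can be modelled as a tree of bare vertex ids. The names `TreeDecoration`, `IdNode`, `PropNode` and `fetch` are hypothetical and not part of TinkerPop; `fetch` stands in for whatever OLTP lookup is available and is called once with the full set of ids.

```scala
object TreeDecoration {
  // Hypothetical shapes: the OLAP output is a tree that carries only vertex ids.
  final case class IdNode(id: Any, children: List[IdNode] = Nil)
  final case class PropNode(id: Any, props: Map[String, Any], children: List[PropNode])

  // Collect every distinct id in the tree so the OLTP side is queried once.
  def collectIds(node: IdNode): Set[Any] =
    node.children.foldLeft(Set[Any](node.id))((acc, c) => acc ++ collectIds(c))

  // Rebuild the tree, attaching the properties found for each id.
  // Ids missing from `props` get an empty map rather than failing.
  def decorate(node: IdNode, props: Map[Any, Map[String, Any]]): PropNode =
    PropNode(
      node.id,
      props.getOrElse(node.id, Map.empty[String, Any]),
      node.children.map(decorate(_, props))
    )

  // `fetch` is an assumption: a function that resolves ids to property maps,
  // for example by running a traversal without withComputer.
  def decorateTree(root: IdNode,
                   fetch: Set[Any] => Map[Any, Map[String, Any]]): PropNode =
    decorate(root, fetch(collectIds(root)))
}
```

In practice `fetch` would probably wrap a plain (non-computer) traversal over the collected ids. If it does, it may be worth guarding against an empty id set, since `V()` with no ids matches every vertex.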
Alternatives?
> SparkGraphComputer returns vertices without properties
> ------------------------------------------------------
>
> Key: TINKERPOP-1655
> URL: https://issues.apache.org/jira/browse/TINKERPOP-1655
> Project: TinkerPop
> Issue Type: Bug
> Affects Versions: 3.3.0
> Environment: /usr/lib/spark/jars/spark-core_2.11-2.0.2.jar
> Reporter: Matthew Stahl
> Attachments: process-tree.json
>
>
> Spark 2.0 + tinkerpop-3.3.0
> A simple program that pulls out the first vertex in the grateful-dead.kryo
> dataset and prints its property keys works with the standard computer, but
> when run with the SparkGraphComputer, the set of keys is empty.
> {code}
> // prerequisite:
> // sudo -u zeppelin hadoop fs -copyFromLocal /tmp/grateful-dead.kryo grateful-dead.kryo
>
> val inputHdfsLocation = "grateful-dead.kryo"
> val props = Map[String, String](
> "gremlin.graph" ->
> "org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph"
> , "gremlin.hadoop.graphReader" ->
> "org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoInputFormat"
> , "gremlin.hadoop.inputLocation" -> inputHdfsLocation
> , "gremlin.hadoop.outputLocation" -> "output"
> , "gremlin.hadoop.jarsInDistributedCache" -> "true"
> , "spark.master" -> "local[1]"
> , "spark.executor.memory" -> "1g"
> , "spark.serializer" ->
> "org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoSerializer"
> // , "spark.kryo.registrator" -> "org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoRegistrator"
> )
>
> import org.apache.commons.configuration._
>
> val conf = new BaseConfiguration()
> props.foreach( kv => conf.addProperty(kv._1, kv._2))
>
> import org.apache.tinkerpop.gremlin.process.computer._
> import org.apache.tinkerpop.gremlin.spark.process.computer._
> import org.apache.tinkerpop.gremlin.structure.util._
> val graph = GraphFactory.open(conf)
> val v = graph.traversal().V().next(1).get(0)
> printf("vertex id = %s, keys = %s\n", v.id, v.keys())
>
> val computer = Computer.compute(classOf[SparkGraphComputer])
> val v2 = graph.traversal().withComputer(computer).V().next(1).get(0)
> printf("vertex id = %s, keys = %s\n", v2.id, v2.keys())
> {code}
> Above produces:
> {code}
> inputHdfsLocation: String = grateful-dead.kryo
> props: scala.collection.immutable.Map[String,String] = Map(spark.serializer
> -> org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoSerializer,
> gremlin.hadoop.inputLocation -> grateful-dead.kryo,
> gremlin.hadoop.jarsInDistributedCache -> true, gremlin.hadoop.graphReader ->
> org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoInputFormat,
> gremlin.graph -> org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph,
> gremlin.hadoop.outputLocation -> output, spark.master -> local[1],
> spark.executor.memory -> 1g)
> import org.apache.commons.configuration._
> conf: org.apache.commons.configuration.BaseConfiguration =
> org.apache.commons.configuration.BaseConfiguration@1849d0b7
> import org.apache.tinkerpop.gremlin.process.computer._
> import org.apache.tinkerpop.gremlin.spark.process.computer._
> import org.apache.tinkerpop.gremlin.structure.util._
> graph: org.apache.tinkerpop.gremlin.structure.Graph =
> hadoopgraph[gryoinputformat->no-writer]
> v: org.apache.tinkerpop.gremlin.structure.Vertex = v[1]
> vertex id = 1, keys = [name, songType, performances]
> computer: org.apache.tinkerpop.gremlin.process.computer.Computer =
> sparkgraphcomputer
> v2: org.apache.tinkerpop.gremlin.structure.Vertex = v[1]
> vertex id = 1, keys = []
> {code}
> Notice the empty set of keys when run with the SparkGraphComputer, versus the
> correct set when using the standard computer.