[ https://issues.apache.org/jira/browse/TINKERPOP3-925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983129#comment-14983129 ]
ASF GitHub Bot commented on TINKERPOP3-925:
-------------------------------------------
Github user RussellSpitzer commented on a diff in the pull request:
https://github.com/apache/incubator-tinkerpop/pull/129#discussion_r43542704
--- Diff: spark-gremlin/src/main/java/org/apache/tinkerpop/gremlin/spark/process/computer/SparkGraphComputer.java ---
@@ -130,12 +142,13 @@ public GraphComputer config(final String key, final Object value) {
                 try {
                     graphRDD = hadoopConfiguration.getClass(Constants.GREMLIN_SPARK_GRAPH_INPUT_RDD, InputFormatRDD.class, InputRDD.class)
                             .newInstance()
-                            .readGraphRDD(this.sparkConfiguration, sparkContext)
-                            .setName("graphRDD")
+                            .readGraphRDD(apacheConfiguration, sparkContext)
+                            .setName(sparkConfiguration.get(Constants.GREMLIN_HADOOP_OUTPUT_LOCATION, "graphRDD"))
                             .cache();
--- End diff --
I believe this "cache" saves the initial state of the graph as read from the
source. That helps when two separate DAGs are built from the same graphRDD
base, but we need to be sure to unpersist it in all cases.
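
To illustrate the "unpersist in all cases" concern, here is a minimal sketch in plain Java. It is not the code from the pull request and it does not use Spark: FakeRdd and withCached are hypothetical stand-ins that only model the persisted flag, so the example runs without any dependencies. With a real Spark RDD, the corresponding calls would be cache() and unpersist().

```java
import java.util.function.Function;

// Plain-Java model of the pattern; nothing here is a Spark class.
final class CachedGraphSketch {

    /** Hypothetical stand-in for an RDD that only tracks whether it is persisted. */
    static final class FakeRdd {
        private boolean persisted;

        FakeRdd cache() {
            persisted = true;
            return this;
        }

        FakeRdd unpersist() {
            persisted = false;
            return this;
        }

        boolean isPersisted() {
            return persisted;
        }
    }

    /** Caches the RDD, runs the job, and releases the cache even if the job throws. */
    static <T> T withCached(FakeRdd rdd, Function<FakeRdd, T> job) {
        rdd.cache();
        try {
            return job.apply(rdd);
        } finally {
            // Runs on both the normal and the exceptional path.
            rdd.unpersist();
        }
    }

    public static void main(String[] args) {
        FakeRdd graphRDD = new FakeRdd();
        String result = withCached(graphRDD, rdd -> "done");
        System.out.println(result + ", persisted=" + graphRDD.isPersisted());
    }
}
```

The try/finally block is the point of the sketch: the cache is released whether the job built on top of graphRDD succeeds or fails.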
> Use persisted SparkContext to persist an RDD across Spark jobs.
> ---------------------------------------------------------------
>
> Key: TINKERPOP3-925
> URL: https://issues.apache.org/jira/browse/TINKERPOP3-925
> Project: TinkerPop 3
> Issue Type: Improvement
> Components: hadoop
> Affects Versions: 3.0.2-incubating
> Reporter: Marko A. Rodriguez
> Assignee: Marko A. Rodriguez
> Fix For: 3.1.0-incubating
>
>
> If a provider is using Spark, they are currently forced to use HDFS to store
> intermediate RDD data. However, if they plan on using that data in a
> {{GraphComputer}} "job chain," then they should be able to look up a
> cached ({{.cache()}}) RDD by name.
> Create {{inputGraphRDD.name}} and {{outputGraphRDD.name}} settings so
> that the configuration references {{SparkContext.getPersistentRDDs()}}.
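
The lookup-by-name idea can be sketched as follows. Spark's SparkContext.getPersistentRDDs() returns a map keyed by RDD id, so finding an RDD by name means scanning the values and comparing names. The sketch below models that scan in plain Java so it runs without Spark; NamedRdd and findByName are hypothetical names, not part of Spark or TinkerPop, and the map stands in for the one Spark would return.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Plain-Java model of a name-based lookup over an id-keyed map of persisted RDDs.
final class PersistedRddLookupSketch {

    /** Hypothetical stand-in for a persisted RDD; the name may be null if never set. */
    static final class NamedRdd {
        final int id;
        final String name;

        NamedRdd(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    /** Returns the first persisted entry whose name equals the requested name, if any. */
    static Optional<NamedRdd> findByName(Map<Integer, NamedRdd> persisted, String name) {
        return persisted.values().stream()
                .filter(rdd -> name.equals(rdd.name)) // null-safe for unnamed entries
                .findFirst();
    }

    public static void main(String[] args) {
        Map<Integer, NamedRdd> persisted = new LinkedHashMap<>();
        persisted.put(1, new NamedRdd(1, "graphRDD"));
        persisted.put(2, new NamedRdd(2, null));
        System.out.println(findByName(persisted, "graphRDD").isPresent());
        System.out.println(findByName(persisted, "missing").isPresent());
    }
}
```

Returning an Optional keeps the "no RDD with that name is persisted" case explicit, which is where a job chain would presumably fall back to reading from the original source.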