[
https://issues.apache.org/jira/browse/TINKERPOP-1023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15047101#comment-15047101
]
ASF GitHub Bot commented on TINKERPOP-1023:
-------------------------------------------
GitHub user okram opened a pull request:
https://github.com/apache/incubator-tinkerpop/pull/173
TINKERPOP-1023: Add a spark variable in SparkGremlinPlugin like we do hdfs
for HadoopGremlinPlugin
https://issues.apache.org/jira/browse/TINKERPOP-1023
Like `hdfs`, there is now a `spark` variable that lets users manage their
persisted contexts. In essence, the Spark server looks like a file system of
(named) RDDs. For instance, you can call `spark.ls()`, `spark.rm()`, and
`spark.describe()`. I added a `SparkGremlinPluginTest` which ensures that all
the proper imports/etc. work in the Console. I also added this information to
the reference docs and published them so people can see it in action:
http://tinkerpop.apache.org/docs/3.1.1-SNAPSHOT/reference/#sparkgraphcomputer
(scroll down to "Using A Persisted Context" section)
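To make the "file system of named RDDs" idea concrete, here is a minimal toy
sketch in Python. It is not the TinkerPop or Spark API: the class
`NamedRDDStore`, its `persist` method, and the sample data are all hypothetical,
and plain lists stand in for RDDs. It only illustrates what listing, describing,
and removing named datasets could look like.

```python
# Toy model only: NOT the TinkerPop or Spark API. All names are hypothetical.
class NamedRDDStore:
    """In-memory stand-in for a persisted context that holds named RDDs."""

    def __init__(self):
        self._rdds = {}  # maps an RDD name to its list of records

    def persist(self, name, records):
        """Register (or overwrite) a dataset under the given name."""
        self._rdds[name] = list(records)

    def ls(self):
        """Return the names of all persisted datasets, sorted like a listing."""
        return sorted(self._rdds)

    def rm(self, name):
        """Drop a dataset by name; return True if it existed."""
        return self._rdds.pop(name, None) is not None

    def describe(self, name):
        """Summarize one dataset by name and record count."""
        return f"{name} [{len(self._rdds[name])} records]"


store = NamedRDDStore()
store.persist("graphRDD", [("marko", 29), ("vadas", 27)])
store.persist("output", [("marko", 0.15)])

listing = store.ls()                  # names currently persisted
summary = store.describe("graphRDD")  # one-line summary of a dataset
removed = store.rm("output")          # remove a dataset by name
remaining = store.ls()                # names left after the removal
```

The point of the sketch is only that named datasets outlive a single job and
can be inspected and cleaned up by name, much like files in a directory.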
VOTE +1. (`mvn clean install` and Spark integration tests)
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/apache/incubator-tinkerpop TINKERPOP-1023
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-tinkerpop/pull/173.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #173
----
commit 9d6467c8f48cd34e83270d3f3eabbcce9ce74f05
Author: Marko A. Rodriguez <[email protected]>
Date: 2015-12-08T15:00:47Z
first push on a Spark object for managing persisted RDDs. Not finished yet.
commit 4d1d8c90cead1aac4ca61b4510ea533c39b1ad7a
Author: Marko A. Rodriguez <[email protected]>
Date: 2015-12-08T15:29:26Z
Merge branch 'master' into TINKERPOP-1023
commit f8fabe20108f4cec8f4c50c7f7bf6523c112acac
Author: Marko A. Rodriguez <[email protected]>
Date: 2015-12-08T16:28:19Z
added Spark persisted RDD utility that can be spark.ls(), spark.head(),
spark.rm(), spark.describe(), etc. in the Console. Really cool. Added a
SparkGremlinPluginTest that verifies everything works as expected. Updated docs
explaining the new tool.
commit debea174494d9c22fbcfca1cd505328b9a998e08
Author: Marko A. Rodriguez <[email protected]>
Date: 2015-12-08T17:12:01Z
added spark RDD utility to docs.
commit 9be5a6d35e023e921017ecb44b80d767095f916a
Author: Marko A. Rodriguez <[email protected]>
Date: 2015-12-08T17:16:51Z
minor section rename.
commit 97828f10550dff00fdb4474d2b36bff30472182c
Author: Marko A. Rodriguez <[email protected]>
Date: 2015-12-08T17:24:39Z
removed debugging work in SparkTest.
----
> Add a spark variable in SparkGremlinPlugin like we do hdfs for
> HadoopGremlinPlugin
> ----------------------------------------------------------------------------------
>
> Key: TINKERPOP-1023
> URL: https://issues.apache.org/jira/browse/TINKERPOP-1023
> Project: TinkerPop
> Issue Type: Improvement
> Components: hadoop
> Affects Versions: 3.1.0-incubating
> Reporter: Marko A. Rodriguez
> Assignee: Marko A. Rodriguez
> Fix For: 3.1.1-incubating
>
>
> It would be good if from the Gremlin Console we could do things like this:
> {code}
> gremlin> spark.getRDDs()
> gremlin> spark.removeRDD("graphRDD")
> gremlin> spark.getMaster()
> gremlin> spark.isPersisted()
> {code}
> With the ability to have persisted contexts, it's confusing as to what is
> persisted and what is not. Having a {{spark}} variable, like we have with
> {{hdfs}}, will make this clearer.