Marko A. Rodriguez created TINKERPOP-1072:
---------------------------------------------
Summary: Allow the user to set persistence options using
StorageLevel.valueOf()
Key: TINKERPOP-1072
URL: https://issues.apache.org/jira/browse/TINKERPOP-1072
Project: TinkerPop
Issue Type: Improvement
Components: hadoop
Affects Versions: 3.1.0-incubating
Reporter: Marko A. Rodriguez
Assignee: Marko A. Rodriguez
Fix For: 3.1.1-incubating
I always thought there was a Spark option to say stuff like
{{default.persist=DISK_SER_1}}, but I can't seem to find it.
If no such option exists, then we should add it to Spark-Gremlin. For instance:
{code}
gremlin.spark.storageLevel=DISK_ONLY
{code}
See: http://spark.apache.org/docs/latest/programming-guide.html#rdd-persistence
Then we would need to go through the code and change every {{...cache()}} call to
{{...persist(StorageLevel.valueOf(conf.get("gremlin.spark.storageLevel","MEMORY_ONLY")))}}.
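As a rough illustration of that lookup, here is a minimal sketch that compiles without Spark on the classpath. The class {{StorageLevelOption}}, the {{resolve}} helper, and the {{Level}} enum are hypothetical stand-ins rather than Spark-Gremlin code, and {{java.util.Properties}} stands in for the real configuration object. In Spark itself the string-to-level lookup is {{StorageLevel.fromString(...)}}, and the {{MEMORY_ONLY}} default matches what a plain {{cache()}} call does.

```java
import java.util.Properties;

final class StorageLevelOption {
    // Key proposed in this issue.
    static final String KEY = "gremlin.spark.storageLevel";
    // cache() is equivalent to persist(MEMORY_ONLY), so this default keeps current behavior.
    static final String DEFAULT_LEVEL = "MEMORY_ONLY";

    // Hypothetical stand-in for Spark's StorageLevel so the sketch runs without Spark.
    enum Level {
        NONE, DISK_ONLY, DISK_ONLY_2,
        MEMORY_ONLY, MEMORY_ONLY_2, MEMORY_ONLY_SER, MEMORY_ONLY_SER_2,
        MEMORY_AND_DISK, MEMORY_AND_DISK_2, MEMORY_AND_DISK_SER, MEMORY_AND_DISK_SER_2,
        OFF_HEAP
    }

    // Reads the configured level, falling back to the default when the key is absent.
    static Level resolve(Properties conf) {
        String name = conf.getProperty(KEY, DEFAULT_LEVEL).trim();
        try {
            return Level.valueOf(name);
        } catch (IllegalArgumentException e) {
            // Fail with a message that names the offending key and value.
            throw new IllegalArgumentException("Unknown value for " + KEY + ": " + name, e);
        }
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(KEY, "DISK_ONLY");
        System.out.println(resolve(conf));             // prints DISK_ONLY
        System.out.println(resolve(new Properties())); // prints MEMORY_ONLY
    }
}
```

With real Spark types, the resolved level would be passed to {{persist(...)}} wherever {{cache()}} is called today. An unrecognized name is rejected up front instead of silently falling back to the default.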
The question then becomes whether we give the user the flexibility to set the
program's caching separately from the persisted RDD's caching :| ... Too many
configuration options suck.