[ 
https://issues.apache.org/jira/browse/TINKERPOP-1786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16188613#comment-16188613
 ] 

ASF GitHub Bot commented on TINKERPOP-1786:
-------------------------------------------

Github user vtslab commented on a diff in the pull request:

    https://github.com/apache/tinkerpop/pull/721#discussion_r142223415
  
    --- Diff: hadoop-gremlin/conf/hadoop-gryo.properties ---
    @@ -29,8 +29,8 @@ gremlin.hadoop.outputLocation=output
     spark.master=local[4]
     spark.executor.memory=1g
     
     spark.serializer=org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoSerializer
    +gremlin.spark.persistContext=true
    --- End diff --
    
    Good question, I had not justified this yet. My original reason was that 
stopping both the SparkContext and the Gremlin Console, as happens during docs 
generation, can lead to race conditions in spark-yarn, with random connection 
exceptions showing up in the console output in the docs. As a bonus, follow-up 
OLAP queries are answered much faster because they skip the overhead of 
acquiring resources from YARN. Apache Zeppelin, the Spark shell and similar 
tools do the same.
    
    The alternative is to set the property in the console together with the 
other properties. This would require more explanation to the recipe users and 
more configuration work from them afterwards, but would leave the properties 
file untouched. I like the current proposal better, but I am fine with both.
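    
    A minimal sketch of that alternative, for illustration only: the defaults 
are loaded as they stand in the properties file and the extra property is set 
in memory, so the file itself is not modified. It uses `java.util.Properties` 
as a stand-in for whatever configuration object the console session works 
with, which is an assumption; the class name `PersistContextOverride` and the 
method `withPersistContext` are hypothetical, and the base values are copied 
from the diff above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

public class PersistContextOverride {

    // Stand-in for the relevant lines of hadoop-gryo.properties
    // (values copied from the diff; the real file contains more entries).
    static final String BASE_PROPERTIES = String.join("\n",
            "spark.master=local[4]",
            "spark.executor.memory=1g",
            "spark.serializer=org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoSerializer");

    // Loads the file-based defaults, then applies the override in memory only.
    // Hypothetical helper: in a console session the equivalent would be a
    // setProperty call on the loaded configuration object (assumption).
    static Properties withPersistContext(String baseProperties) throws IOException {
        Properties conf = new Properties();
        conf.load(new StringReader(baseProperties));
        conf.setProperty("gremlin.spark.persistContext", "true");
        return conf;
    }

    public static void main(String[] args) throws IOException {
        Properties conf = withPersistContext(BASE_PROPERTIES);
        // Print the effective configuration; iteration order is not guaranteed.
        conf.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

    The trade-off is the one described above: every recipe user has to repeat 
the override in their own session, whereas the properties-file change applies 
it once for everyone.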


> Recipe and missing manifest items for Spark on Yarn
> ---------------------------------------------------
>
>                 Key: TINKERPOP-1786
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1786
>             Project: TinkerPop
>          Issue Type: Improvement
>          Components: hadoop
>    Affects Versions: 3.3.0, 3.1.8, 3.2.6
>         Environment: gremlin-console
>            Reporter: Marc de Lignie
>            Priority: Minor
>             Fix For: 3.2.7, 3.3.1
>
>
> Thorough documentation for running OLAP queries on Spark on Yarn has been 
> missing, keeping some users from getting the benefits of this nice feature of 
> the TinkerPop stack and resulting in a significant number of questions on the 
> gremlin-users list.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
