[jira] [Created] (TINKERPOP-2081) PersistedOutputRDD materialises rdd lazily with Spark 2.x
Artem Aliev created TINKERPOP-2081:
-----------------------------------

             Summary: PersistedOutputRDD materialises rdd lazily with Spark 2.x
                 Key: TINKERPOP-2081
                 URL: https://issues.apache.org/jira/browse/TINKERPOP-2081
             Project: TinkerPop
          Issue Type: Bug
    Affects Versions: 3.3.4
            Reporter: Artem Aliev

PersistedOutputRDD does not actually persist the RDD in Spark memory but only marks it for lazy caching in the future. Caching appears to have been eager in Spark 1.6, but in Spark 2.0 it is lazy. Lazy caching looks wrong for this case: the source graph could be changed after the snapshot is created, and the snapshot should not be affected by those changes. The fix itself is simple: PersistedOutputRDD should call any Spark action to trigger eager caching, for example count().
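The proposed fix amounts to forcing materialisation with an action right after marking the RDD for caching. A minimal sketch in Java against a generic Spark pair RDD (the class and method names here are illustrative, not TinkerPop's actual code):

{code}
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.storage.StorageLevel;

public final class EagerPersistSketch {
    // persist() only marks the RDD in Spark 2.x; an action such as count()
    // forces the cache to be populated now, so later mutations of the
    // source graph cannot leak into the snapshot.
    public static <K, V> JavaPairRDD<K, V> persistEagerly(final JavaPairRDD<K, V> rdd) {
        rdd.persist(StorageLevel.MEMORY_ONLY());
        rdd.count();
        return rdd;
    }
}
{code}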
[GitHub] tinkerpop issue #789: TINKERPOP-1870 Made VertexTraverserSet more generic to...
Github user artem-aliev commented on the issue:

    https://github.com/apache/tinkerpop/pull/789

    +1. I reran my small perf test and got the same result with the new structure, but it looks more solid and in line with the rest of the code.

---
[GitHub] tinkerpop issue #781: Tinkerpop 1870 master
Github user artem-aliev commented on the issue: https://github.com/apache/tinkerpop/pull/781 rebased ---
[GitHub] tinkerpop issue #778: TINKERPOP-1870: Extends TraverserSet to have Vertex in...
Github user artem-aliev commented on the issue: https://github.com/apache/tinkerpop/pull/778 rebased ---
[GitHub] tinkerpop issue #780: improve performance by not handling exceptions
Github user artem-aliev commented on the issue:

    https://github.com/apache/tinkerpop/pull/780

    retargeted/rebased to the latest tp32

---
[GitHub] tinkerpop pull request #778: TINKERPOP-1870: Extends TraverserSet to have Ve...
Github user artem-aliev commented on a diff in the pull request:

    https://github.com/apache/tinkerpop/pull/778#discussion_r162065650

    --- Diff: gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/structure/io/gryo/GryoVersion.java ---
    @@ -305,6 +307,7 @@ public String getVersion() {
         add(GryoTypeReg.of(RangeGlobalStep.RangeBiOperator.class, 114));
         add(GryoTypeReg.of(OrderGlobalStep.OrderBiOperator.class, 118, new JavaSerializer()));
         add(GryoTypeReg.of(ProfileStep.ProfileBiOperator.class, 119));
    +    add(GryoTypeReg.of(VertexTraverserSet.class, 173));
    --- End diff --

    master PR: https://github.com/apache/tinkerpop/pull/781
    I have made the tp32 and tp33 ids the same, so I skipped 171 and 172 in tp32.

---
[GitHub] tinkerpop pull request #781: Tinkerpop 1870 master
GitHub user artem-aliev opened a pull request:

    https://github.com/apache/tinkerpop/pull/781

    Tinkerpop 1870 master

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/artem-aliev/tinkerpop TINKERPOP-1870-master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tinkerpop/pull/781.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #781

commit a6ac037f0582187a47186264e59a1e42a14a7d1f
Author: artemaliev
Date: 2018-01-15T15:23:49Z

    TINKERPOP-1870: Extends TraverserSet to have Vertex index for remote traversers
    That replaces the linear search in the reversal traversal iterator with a hashtable lookup.

commit ff05f07f082c87c788827143fc0d64ec38073005
Author: artemaliev
Date: 2018-01-17T14:17:47Z

    Merge branch 'TINKERPOP-1870' into TINKERPOP-1870-master

---
[GitHub] tinkerpop pull request #780: improve performance by not handling exceptions
GitHub user artem-aliev opened a pull request:

    https://github.com/apache/tinkerpop/pull/780

    improve performance by not handling exceptions

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/artem-aliev/tinkerpop TINKERPOP-1871

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tinkerpop/pull/780.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #780

commit 7197a89e696a9cb9ff93081510de17e5d016b858
Author: artemaliev
Date: 2017-12-29T09:49:28Z

    improve performance by not handling exceptions

---
[jira] [Created] (TINKERPOP-1871) Exception handling is slow in ReferenceElement creation
Artem Aliev created TINKERPOP-1871:
-----------------------------------

             Summary: Exception handling is slow in ReferenceElement creation
                 Key: TINKERPOP-1871
                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1871
             Project: TinkerPop
          Issue Type: Improvement
    Affects Versions: 3.3.1
            Reporter: Artem Aliev

The following exception happens for each vertex in OLAP and takes ~10% of execution time:
[https://github.com/apache/tinkerpop/blob/master/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/structure/util/reference/ReferenceElement.java#L48]
The exception is always thrown for the ComputerGraph.ComputerAdjacentVertex class, so a check for that case could be added to improve performance. This is a 3.3.x-only issue; 3.2 does not have this problem.
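The proposed improvement is the classic "test, don't throw" pattern: on a per-vertex hot path, an instanceof check is far cheaper than constructing and unwinding an exception. A self-contained Java sketch of the pattern (the types here are illustrative stand-ins, not TinkerPop's actual classes):

{code}
// Illustrative stand-ins for an element type that cannot expose a host.
interface HostedElement { Object host(); }

final class AdjacentElement implements HostedElement {
    public Object host() { throw new UnsupportedOperationException("no host"); }
}

final class HostLookup {
    static Object hostOrNull(final HostedElement e) {
        // Before: try { return e.host(); } catch (UnsupportedOperationException ex) { return null; }
        // After: check the known special case up front, avoiding a throw
        // (and its fillInStackTrace cost) for every single element.
        if (e instanceof AdjacentElement) return null;
        return e.host();
    }
}
{code}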
[jira] [Updated] (TINKERPOP-1870) n^2 synchronous operation in OLAP WorkerExecutor.execute() method
[ https://issues.apache.org/jira/browse/TINKERPOP-1870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Artem Aliev updated TINKERPOP-1870:
-----------------------------------
    Affects Version/s: 3.2.7
                       3.3.1

> n^2 synchronous operation in OLAP WorkerExecutor.execute() method
> ------------------------------------------------------------------
>
>                 Key: TINKERPOP-1870
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1870
[jira] [Commented] (TINKERPOP-1870) n^2 synchronous operation in OLAP WorkerExecutor.execute() method
[ https://issues.apache.org/jira/browse/TINKERPOP-1870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16326941#comment-16326941 ]

Artem Aliev commented on TINKERPOP-1870:
----------------------------------------

I wrapped the block into a findVertexTraverser() method to see its timing in the profiler; see the attached screenshots. It takes 20-30% of execution time in a single 6-core executor. Performance was greatly improved on my 10k-vertex graph.

Before the fix:
{code}
gremlin> g.V().count()
==>1
gremlin> g.E().count()
==>16
gremlin> clock(1) {g.V().emit().repeat(both().dedup()).count().next()}
==>52349.640981
gremlin> clock(1) {g.V().emit().repeat(both().dedup()).count().next()}
==>53800.89875495
gremlin> clock(1) {g.V().emit().repeat(both().dedup()).count().next()}
==>50643.744645
{code}

After the fix:
{code}
gremlin> clock(1) {g.V().emit().repeat(both().dedup()).count().next()}
==>42062.945477
gremlin> clock(1) {g.V().emit().repeat(both().dedup()).count().next()}
==>38419.46317196
gremlin> clock(1) {g.V().emit().repeat(both().dedup()).count().next()}
==>34336.707208
{code}

{code}
>mvn clean install
[INFO] BUILD SUCCESS
{code}

> n^2 synchronous operation in OLAP WorkerExecutor.execute() method
> ------------------------------------------------------------------
>
>                 Key: TINKERPOP-1870
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1870
[jira] [Updated] (TINKERPOP-1870) n^2 synchronous operation in OLAP WorkerExecutor.execute() method
[ https://issues.apache.org/jira/browse/TINKERPOP-1870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Artem Aliev updated TINKERPOP-1870:
-----------------------------------
    Attachment: findTraverserFixed.png
                findTraverser2.png
                findTraverser1.png

> n^2 synchronous operation in OLAP WorkerExecutor.execute() method
> ------------------------------------------------------------------
>
>                 Key: TINKERPOP-1870
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1870
[jira] [Commented] (TINKERPOP-1870) n^2 synchronous operation in OLAP WorkerExecutor.execute() method
[ https://issues.apache.org/jira/browse/TINKERPOP-1870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16326356#comment-16326356 ]

Artem Aliev commented on TINKERPOP-1870:
----------------------------------------

The fix I provided could be simplified. VertexRemoteSet extends RemoteSet for backward compatibility, just in case someone uses it directly in VertexPrograms. If it were a fully internal structure, it could become a simple synchronous multi-value hash map. The map preserves traverser order for each vertex.

> n^2 synchronous operation in OLAP WorkerExecutor.execute() method
> ------------------------------------------------------------------
>
>                 Key: TINKERPOP-1870
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1870
[GitHub] tinkerpop pull request #778: TINKERPOP-1870: Extends TraverserSet to have Ve...
GitHub user artem-aliev opened a pull request:

    https://github.com/apache/tinkerpop/pull/778

    TINKERPOP-1870: Extends TraverserSet to have Vertex index for remote …

…traversers

That replaces the linear search in the reversal traversal iterator with a hashtable lookup.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/artem-aliev/tinkerpop TINKERPOP-1870

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tinkerpop/pull/778.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #778

commit f1dd251174dd1f04b6ed722b1956d5c3291a842a
Author: artemaliev
Date: 2018-01-15T15:23:49Z

    TINKERPOP-1870: Extends TraverserSet to have Vertex index for remote traversers
    That replaces the linear search in the reversal traversal iterator with a hashtable lookup.

---
[jira] [Created] (TINKERPOP-1870) n^2 synchronous operation in OLAP WorkerExecutor.execute() method
Artem Aliev created TINKERPOP-1870:
-----------------------------------

             Summary: n^2 synchronous operation in OLAP WorkerExecutor.execute() method
                 Key: TINKERPOP-1870
                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1870
             Project: TinkerPop
          Issue Type: Improvement
            Reporter: Artem Aliev

[https://github.com/apache/tinkerpop/blob/master/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/computer/traversal/WorkerExecutor.java#L80-L93]

This block of code iterates over all remote traversers to select the one related to the current vertex and remove it. The operation is repeated for the next vertex, and so on. For the following example query this means n^2 operations (n is the number of vertices), all of them inside a synchronized block, so a multi-core Spark executor performs these operations serially.
{code}
g.V().emit().repeat(both().dedup()).count().next()
{code}
See the jvisualvm screenshot.
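The indexing idea behind the eventual fix can be sketched in a few lines of Java. This is an illustration only (TinkerPop's actual class is the VertexTraverserSet of the linked PRs): a secondary hash map from host vertex to its traversers replaces the O(n) scan with an O(1) lookup while preserving per-vertex insertion order.

{code}
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical sketch: index traversers by their host object so the
// per-vertex lookup is a hash probe instead of a linear scan over the
// whole remote traverser set.
final class IndexedTraverserSet<T> {
    private final Map<Object, Queue<T>> index = new HashMap<>();

    synchronized void add(final Object host, final T traverser) {
        index.computeIfAbsent(host, k -> new ArrayDeque<>()).add(traverser);
    }

    // Remove and return all traversers for this host in one O(1) lookup.
    synchronized Queue<T> drain(final Object host) {
        final Queue<T> q = index.remove(host);
        return q == null ? new ArrayDeque<>() : q;
    }
}
{code}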
[GitHub] tinkerpop issue #734: TINKERPOP-1801: fix profile() timing in OLAP by adding...
Github user artem-aliev commented on the issue:

    https://github.com/apache/tinkerpop/pull/734

    I have fixed the test failures. The TinkerPop graph computer does not call the vertex program's execute methods if a split has no vertices. For example, the modern graph has 6 vertices; if the computer has 8 cores, there will be two empty splits. TraversalVertexProgram uses the execute step to set up the next profiling step, so side effects are not set up properly for empty splits. That is why the tests did not fail in Docker but failed on a computer with more than 6 cores. The fix adds a check that the profile side effects were registered properly before they are used.

---
[GitHub] tinkerpop issue #734: TINKERPOP-1801: fix profile() timing in OLAP by adding...
Github user artem-aliev commented on the issue:

    https://github.com/apache/tinkerpop/pull/734

    Something strange happened during the rebases. I'll fix it.

---
[GitHub] tinkerpop issue #734: TINKERPOP-1801: fix profile() timing in OLAP by adding...
Github user artem-aliev commented on the issue:

    https://github.com/apache/tinkerpop/pull/734

The fix adds iteration time to the appropriate step. Iteration time is the time between the workerIterationStart() and workerIterationEnd() callbacks. Before the fix the timing looked like this:
```
gremlin> g.V().out().out().count().profile()
==>Traversal Metrics
Step                      Count  Traversers  Time (ms)  % Dur
=============================================================
GraphStep(vertex,[])       9962        9962     70.873  48.95
VertexStep(OUT,vertex)  1012657        3745     37.132  25.65
VertexStep(OUT,edge)    2101815        6192     36.751  25.39
CountGlobalStep               1           1      0.018   0.01
                        >TOTAL           -    144.775      -
```
while the query actually runs for tens of seconds. After the fix:
```
gremlin> g.V().out().out().count().profile()
==>Traversal Metrics
Step                      Count  Traversers  Time (ms)   % Dur
==============================================================
GraphStep(vertex,[])       9962        9962  14186.809   97.43
VertexStep(OUT,vertex)  1012657        3745    340.051    2.34
VertexStep(OUT,edge)    2101815        6192     33.684    0.23
CountGlobalStep               1           1      0.004    0.00
                        >TOTAL           -  14560.549       -
```
That shows that most of the time for this OLAP query was spent in the initial iteration (actually the star-graph creation). There could still be some inconsistencies because:
1. No computer-specific setup time is measured.
2. Spark has a lot of lazy stuff, so most of the Spark RDD setup is counted as the first step.
3. The algorithm could fail to assign timing to the right step in case of sophisticated queries.

By the way, the new timing is pretty close to the wall-clock time.

---
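For clarity, the measurement idea reduces to a few lines. A sketch under assumed names (StepMetrics and addDuration() are illustrative stand-ins, not TinkerPop's actual profiler API): the wall-clock span between the two worker-iteration callbacks is credited to the step that drove the iteration.

```java
// Hypothetical sketch of the timing approach: record the span between the
// worker-iteration callbacks and add it to that iteration's step metrics.
final class IterationTimer {
    private long startNanos;

    void workerIterationStart() {
        startNanos = System.nanoTime();
    }

    // "metrics" stands in for the profiler's per-step metrics object.
    void workerIterationEnd(final StepMetrics metrics) {
        metrics.addDuration(System.nanoTime() - startNanos);
    }
}

interface StepMetrics { void addDuration(long nanos); }
```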
[GitHub] tinkerpop issue #733: TINKERPOP-1801: fix profile() timing in OLAP by adding...
Github user artem-aliev commented on the issue: https://github.com/apache/tinkerpop/pull/733 See #734 ---
[GitHub] tinkerpop pull request #733: TINKERPOP-1801: fix profile() timing in OLAP by...
Github user artem-aliev closed the pull request at: https://github.com/apache/tinkerpop/pull/733 ---
[GitHub] tinkerpop issue #733: TINKERPOP-1801: fix profile() timing in OLAP by adding...
Github user artem-aliev commented on the issue:

    https://github.com/apache/tinkerpop/pull/733

    That should be against the tp32 branch.

---
[GitHub] tinkerpop pull request #734: TINKERPOP-1801: fix profile() timing in OLAP by...
GitHub user artem-aliev opened a pull request:

    https://github.com/apache/tinkerpop/pull/734

    TINKERPOP-1801: fix profile() timing in OLAP by adding worker iterati…

…on timings to step metrics

This is a simple fix that does not change any API.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/artem-aliev/tinkerpop TINKERPOP-1801

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tinkerpop/pull/734.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #734

commit 827ea9cfd57202612518e5e6bcff18f601dd2018
Author: artemaliev
Date: 2017-10-17T18:00:31Z

    TINKERPOP-1801: fix profile() timing in OLAP by adding worker iteration timings to step metrics
    This is a simple fix that does not change any API.

---
[jira] [Commented] (TINKERPOP-1801) OLAP profile() step returns incorrect timing
[ https://issues.apache.org/jira/browse/TINKERPOP-1801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208065#comment-16208065 ]

Artem Aliev commented on TINKERPOP-1801:
----------------------------------------

That is a simple way to fix it without a new API; let's discuss better approaches. I did not add new tests: I found a set of them in the test suite. I still have my own test, but it is unstable because of timings.

> OLAP profile() step returns incorrect timing
> --------------------------------------------
>
>                 Key: TINKERPOP-1801
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1801
[GitHub] tinkerpop pull request #733: TINKERPOP-1801: fix profile() timing in OLAP by...
GitHub user artem-aliev opened a pull request:

    https://github.com/apache/tinkerpop/pull/733

    TINKERPOP-1801: fix profile() timing in OLAP by adding worker iterati…

…on timings to step metrics

This is a simple fix that does not change any API.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/artem-aliev/tinkerpop TINKERPOP-1801

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tinkerpop/pull/733.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #733

commit 827ea9cfd57202612518e5e6bcff18f601dd2018
Author: artemaliev
Date: 2017-10-17T18:00:31Z

    TINKERPOP-1801: fix profile() timing in OLAP by adding worker iteration timings to step metrics
    This is a simple fix that does not change any API.

---
[jira] [Created] (TINKERPOP-1801) OLAP profile() step returns incorrect timing
Artem Aliev created TINKERPOP-1801:
-----------------------------------

             Summary: OLAP profile() step returns incorrect timing
                 Key: TINKERPOP-1801
                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1801
             Project: TinkerPop
          Issue Type: Bug
    Affects Versions: 3.2.6, 3.3.0
            Reporter: Artem Aliev

ProfileStep calculates the time of next()/hasNext() calls, expecting recursion, but GraphComputer uses message passing/RDD joins. So next() does not recursively call the next steps; a message is generated instead, and most of the time is taken by the message passing (the RDD join). Thus on the graph computer the time between ProfileSteps should be measured, not the time inside them.

The other approach is to get Spark statistics with a SparkListener and add the Spark stage timings to the profiler metrics. That would work only for Spark, but it would give a better representation of step costs.

The simple fix is to measure the time between OLAP iterations and add it to the profiler step. This will not take the computer setup time into account, but it will be precise enough for long-running queries.

To reproduce (TinkerPop 3.2.6 gremlin):
{code}
plugin activated: tinkerpop.server
plugin activated: tinkerpop.utilities
plugin activated: tinkerpop.spark
plugin activated: tinkerpop.tinkergraph
gremlin> graph = GraphFactory.open('conf/hadoop/hadoop-grateful-gryo.properties')
gremlin> g = graph.traversal().withComputer(SparkGraphComputer)
==>graphtraversalsource[hadoopgraph[gryoinputformat->gryooutputformat], sparkgraphcomputer]
gremlin> g.V().out().out().count().profile()
==>Traversal Metrics
Step                     Count  Traversers  Time (ms)  % Dur
============================================================
GraphStep(vertex,[])       808         808      2.025  18.35
VertexStep(OUT,vertex)    8049         562      4.430  40.14
VertexStep(OUT,edge)    327370        7551      4.581  41.50
CountGlobalStep              1           1      0.001   0.01
                       >TOTAL           -     11.038      -
gremlin> clock(1){g.V().out().out().count().next() }
==>3421.92758
gremlin>
{code}
[jira] [Commented] (TINKERPOP-1783) PageRank gives incorrect results for graphs with sinks
[ https://issues.apache.org/jira/browse/TINKERPOP-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16167499#comment-16167499 ]

Artem Aliev commented on TINKERPOP-1783:
----------------------------------------

The workaround I proposed is incorrect. The correct behaviour is that the user jumps to a random vertex from the sink vertex.

> PageRank gives incorrect results for graphs with sinks
> ------------------------------------------------------
>
>                 Key: TINKERPOP-1783
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1783
[jira] [Created] (TINKERPOP-1783) PageRank gives incorrect results for graphs with sinks
Artem Aliev created TINKERPOP-1783:
-----------------------------------

             Summary: PageRank gives incorrect results for graphs with sinks
                 Key: TINKERPOP-1783
                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1783
             Project: TinkerPop
          Issue Type: Bug
    Affects Versions: 3.2.6, 3.1.8, 3.3.0
            Reporter: Artem Aliev

{quote}
Sink vertices (those with no outgoing edges) should evenly distribute their rank to the entire graph, but in the current implementation it is simply lost.
{quote}
Wiki: https://en.wikipedia.org/wiki/PageRank#Simplified_algorithm
{quote}
In the original form of PageRank, the sum of PageRank over all pages was the total number of pages on the web at that time.
{quote}
I found the issue while comparing results with Spark GraphX, so this is a copy of https://issues.apache.org/jira/browse/SPARK-18847

How to reproduce:
{code}
gremlin> graph = TinkerFactory.createModern()
gremlin> g = graph.traversal().withComputer()
gremlin> g.V().pageRank(0.85).times(40).by('pageRank').values('pageRank').sum()
==>1.318625
gremlin> g.V().pageRank(0.85).times(1).by('pageRank').values('pageRank').sum()
==>3.4497
# initial values:
gremlin> g.V().pageRank(0.85).times(0).by('pageRank').values('pageRank').sum()
==>6.0
{code}
GraphX fixed the issue by normalising the values after each step. The other way to fix it is to send the message to itself (stay on the same page). To work around the problem, just add self-pointing edges:
{code}
gremlin> g.V().as('B').addE('knows').from('B')
{code}
Then you'll always get the correct sum, but I'm not sure it is a proper assumption.
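For reference, here is a self-contained sketch of the sink-handling idea the ticket describes: rank arriving at vertices with no outgoing edges is redistributed evenly over the whole graph rather than dropped. This is an illustration of the algorithm, not TinkerPop's PageRankVertexProgram; with every vertex initialised to rank 1.0, the total rank stays equal to the vertex count across iterations.

{code}
final class PageRankSketch {
    // One PageRank iteration where rank that would be lost at sinks is
    // spread evenly instead. outEdges[v] lists the out-neighbours of v.
    static double[] step(final double[] rank, final int[][] outEdges, final double alpha) {
        final int n = rank.length;
        final double[] next = new double[n];
        double sinkMass = 0.0;
        for (int v = 0; v < n; v++) {
            if (outEdges[v].length == 0) {
                sinkMass += rank[v];                        // would otherwise vanish
            } else {
                final double share = rank[v] / outEdges[v].length;
                for (final int w : outEdges[v]) next[w] += share;
            }
        }
        for (int v = 0; v < n; v++)                         // teleport term plus damped flow
            next[v] = (1 - alpha) + alpha * (next[v] + sinkMass / n);
        return next;
    }
}
{code}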
[GitHub] tinkerpop pull request #696: TINKERPOP-1754: Do not add non deserialisable e...
GitHub user artem-aliev opened a pull request:

    https://github.com/apache/tinkerpop/pull/696

    TINKERPOP-1754: Do not add non deserialisable exception to ScriptReco…

…rdReader IOExceptions

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/artem-aliev/tinkerpop TINKERPOP-1754

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tinkerpop/pull/696.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #696

commit 6144dd044b85ae897af39fae5e72524281944c4f
Author: artemaliev
Date: 2017-08-18T12:21:23Z

    TINKERPOP-1754: Do not add non deserialisable exception to ScriptRecordReader IOExceptions

---
[jira] [Created] (TINKERPOP-1754) Spark can not deserialise some ScriptRecordReader parse exceptions
Artem Aliev created TINKERPOP-1754:
-----------------------------------

             Summary: Spark can not deserialise some ScriptRecordReader parse exceptions
                 Key: TINKERPOP-1754
                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1754
             Project: TinkerPop
          Issue Type: Bug
          Components: hadoop
    Affects Versions: 3.3.0
            Reporter: Artem Aliev
            Priority: Minor

ScriptException refers to a Groovy exception that can point to a "Script" class that is not available to the system class loader. Spark cannot deserialise the exception, so the user never sees the parse error. To fix the problem, ScriptRecordReader should not try to propagate the whole chain of cause exceptions but only the message with the parse error.

Spark output:
{code}
WARN [task-result-getter-0] 2017-08-16 11:11:41,777 TaskEndReason.scala:192 - Task exception could not be deserialized
java.lang.ClassNotFoundException: Script1
	at java.net.URLClassLoader.findClass(URLClassLoader.java:381) ~[na:1.8.0_40]
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424) ~[na:1.8.0_40]
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331) ~[na:1.8.0_40]
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ~[na:1.8.0_40]
	at java.lang.Class.forName0(Native Method) ~[na:1.8.0_40]
	at java.lang.Class.forName(Class.java:348) ~[na:1.8.0_40]
	at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67) ~[spark-core_2.11-2.2.0.0-bb4c2a9.jar:2.2.0.0-bb4c2a9]
	at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613) [na:1.8.0_40]
	at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518) [na:1.8.0_40]
	at java.io.ObjectInputStream.readClass(ObjectInputStream.java:1484) [na:1.8.0_40]
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1334) [na:1.8.0_40]
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993) [na:1.8.0_40]
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918) [na:1.8.0_40]
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801) [na:1.8.0_40]
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351) [na:1.8.0_40]
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993) [na:1.8.0_40]
	at java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:501) [na:1.8.0_40]
	at java.lang.Throwable.readObject(Throwable.java:914) ~[na:1.8.0_40]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_40]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_40]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_40]
	at java.lang.reflect.Method.invoke(Method.java:497) ~[na:1.8.0_40]
	at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017) [na:1.8.0_40]
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1896) [na:1.8.0_40]
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801) [na:1.8.0_40]
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351) [na:1.8.0_40]
	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371) [na:1.8.0_40]
	at org.apache.spark.ThrowableSerializationWrapper.readObject(TaskEndReason.scala:193) ~[spark-core_2.11-2.2.0.0-bb4c2a9.jar:2.2.0.0-bb4c2a9]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_40]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_40]
	at sun.reflect.DelegatingMethodAccessorI
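The proposed fix boils down to surfacing only the parse message instead of the cause chain. A minimal sketch (the method name is illustrative, not TinkerPop's actual code):

{code}
import java.io.IOException;
import javax.script.ScriptException;

final class ParseErrors {
    // Wrapping the ScriptException itself (new IOException(e)) would drag the
    // non-deserialisable Groovy Script class along to the driver; keeping only
    // the text preserves the information the user actually needs.
    static IOException toSerializableError(final ScriptException e) {
        return new IOException("Script parse error: " + e.getMessage());
    }
}
{code}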
[GitHub] tinkerpop pull request #678: TINKERPOP-1715: update spark version to 2.2
Github user artem-aliev commented on a diff in the pull request:

    https://github.com/apache/tinkerpop/pull/678#discussion_r127710343

    --- Diff: gremlin-groovy/pom.xml ---
    @@ -93,6 +93,10 @@ limitations under the License.
         ${project.version} test
    +
    --- End diff --

    Hm, it was needed for the build. Now I cannot reproduce the failure, so I'll roll it back.

---
[jira] [Commented] (TINKERPOP-1715) Bump to Spark 2.2
[ https://issues.apache.org/jira/browse/TINKERPOP-1715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16089697#comment-16089697 ]

Artem Aliev commented on TINKERPOP-1715:
----------------------------------------

Tests passed:
{code}
spark-gremlin$>mvn clean install -DskipIntegrationTests=false
...
[INFO] Results:
[INFO]
[WARNING] Tests run: 1085, Failures: 0, Errors: 0, Skipped: 133
...
{code}

> Bump to Spark 2.2
> -----------------
>
>                 Key: TINKERPOP-1715
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1715
>             Project: TinkerPop
>          Issue Type: Improvement
>          Components: hadoop
>    Affects Versions: 3.2.5
>            Reporter: Marko A. Rodriguez
>
> Bump to the latest version of Spark.
[GitHub] tinkerpop pull request #678: TINKERPOP-1715: update spark version to 2.2
GitHub user artem-aliev opened a pull request:

    https://github.com/apache/tinkerpop/pull/678

    TINKERPOP-1715: update spark version to 2.2

That required:
- more exclusions of conflicting Spark dependencies in pom.xml
- registration of more Spark and Scala classes in TinkerPop Gryo

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/artem-aliev/tinkerpop TINKERPOP-1715

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tinkerpop/pull/678.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #678

commit 9e8cc5a1b2893d9aab9da76dee61e3604ce7aadc
Author: artemaliev
Date: 2017-07-17T10:55:45Z

    TINKERPOP-1715: update spark version to 2.2
    That required:
    - more exclusions of conflicting Spark dependencies in pom.xml
    - registration of more Spark and Scala classes in TinkerPop Gryo

---
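The serializer side of such a bump generally means registering the Spark and Scala types that now cross the wire. A sketch using Spark's public Kryo registration hook; the class list is illustrative only, since the actual registrations live in TinkerPop's Gryo serializer:

```java
import org.apache.spark.SparkConf;

public final class KryoRegistrationSketch {
    public static SparkConf configure() {
        return new SparkConf()
                .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
                // scala.Tuple2 stands in for the Spark/Scala classes whose
                // serialization the move to Spark 2.2 newly exercised.
                .registerKryoClasses(new Class<?>[]{scala.Tuple2.class, java.util.ArrayList.class});
    }
}
```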
[jira] [Commented] (TINKERPOP-1271) SparkContext should be restarted if Killed and using Persistent Context
[ https://issues.apache.org/jira/browse/TINKERPOP-1271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15851300#comment-15851300 ]

Artem Aliev commented on TINKERPOP-1271:
----------------------------------------

mvn clean install -DskipIntegrationTests=false passed, but I see the following output; is it ok?
{code}
...
Running org.apache.tinkerpop.gremlin.spark.SparkGremlinGryoSerializerTest
[ERROR] org.apache.tinkerpop.gremlin.AbstractGremlinSuite - The SparkGremlinSuite will run for this Graph as it is testing a Gremlin flavor but the Graph does not publicly acknowledged it yet with the @OptIn annotation.
Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 13.273 sec - in org.apache.tinkerpop.gremlin.spark.SparkGremlinGryoSerializerTest
...
{code}
{code}
...
[WARN] org.apache.tinkerpop.gremlin.hadoop.groovy.plugin.HadoopGremlinPlugin - Be sure to set the environmental variable: HADOOP_GREMLIN_LIBS
[WARN] org.apache.tinkerpop.gremlin.hadoop.process.computer.AbstractHadoopGraphComputer - /Users/artemaliev/git/tinkerpop.ali/spark-gremlin/target/test-case-data/HadoopGremlinPluginCheck/shouldGracefullyHandleBadGremlinHadoopLibs/ does not reference a valid directory -- proceeding regardless
...
{code}

> SparkContext should be restarted if Killed and using Persistent Context
> ------------------------------------------------------------------------
>
>                 Key: TINKERPOP-1271
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1271
[jira] [Commented] (TINKERPOP-1271) SparkContext should be restarted if Killed and using Persistent Context
[ https://issues.apache.org/jira/browse/TINKERPOP-1271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15850209#comment-15850209 ]

Artem Aliev commented on TINKERPOP-1271:
----------------------------------------

"mvn install" tests passed. I have tested it manually on master with Spark 2.0 and a back-ported SPARK-19362, to check that stop works.

> SparkContext should be restarted if Killed and using Persistent Context
> ------------------------------------------------------------------------
>
>                 Key: TINKERPOP-1271
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1271
[GitHub] tinkerpop pull request #555: TINKERPOP-1271: Refactor SparkContext creation ...
GitHub user artem-aliev opened a pull request:

    https://github.com/apache/tinkerpop/pull/555

    TINKERPOP-1271: Refactor SparkContext creation and handling of sc.stop()

org.apache.tinkerpop.gremlin.spark.structure.Spark is a SparkContext holder for SparkGraphComputer. It was refactored to detect external stop calls and recreate the SparkContext in that case. The context creation process was reordered to make all configuration options take effect, and the Spark.create() methods now return the created context. Handling the external stop also requires the SPARK-18751 fix, which was integrated into Spark 2.1. The refactoring and the configuration-loading part benefit all versions, though.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/artem-aliev/tinkerpop TINKERPOP-1271

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tinkerpop/pull/555.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #555

commit 6bea8a562a7d2d2a940a5cb7db3f2a4ce09f3dac
Author: artemaliev
Date: 2017-02-02T12:15:04Z

    TINKERPOP-1271: Refactor SparkContext creation and handling of external sc.stop()
    org.apache.tinkerpop.gremlin.spark.structure.Spark is a SparkContext holder for SparkGraphComputer. It was refactored to detect external stop calls and recreate the SparkContext in that case. The context creation process was reordered to make all configuration options take effect, and the Spark.create() methods now return the created context. Handling the external stop also requires the SPARK-18751 fix, which was integrated into Spark 2.1. The refactoring and the configuration-loading part benefit all versions, though.

---
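The recovery logic can be sketched as a guarded holder. This is an illustration of the approach, not the actual Spark.java code; as the PR notes, it relies on SPARK-18751 (Spark 2.1+) so that getOrCreate() hands back a live context after an external stop:

```java
import org.apache.spark.SparkConf;
import org.apache.spark.SparkContext;

public final class SparkContextHolderSketch {
    private static SparkContext context;

    public static synchronized SparkContext getOrCreate(final SparkConf conf) {
        // If the cached context was stopped externally (e.g. killed from the
        // Spark UI), drop it and build a fresh one instead of failing later
        // with "Cannot call methods on a stopped SparkContext".
        if (context == null || context.isStopped())
            context = SparkContext.getOrCreate(conf);
        return context;
    }
}
```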
[jira] [Commented] (TINKERPOP-1271) SparkContext should be restarted if Killed and using Persistent Context
[ https://issues.apache.org/jira/browse/TINKERPOP-1271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15837700#comment-15837700 ]

Artem Aliev commented on TINKERPOP-1271:
----------------------------------------

I filed a Spark bug for it, SPARK-19362, but then found it was already fixed by https://issues.apache.org/jira/browse/SPARK-18751 in Spark 2.1.

> SparkContext should be restarted if Killed and using Persistent Context
> ------------------------------------------------------------------------
>
>                 Key: TINKERPOP-1271
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1271
>             Project: TinkerPop
>          Issue Type: Bug
>          Components: hadoop
>    Affects Versions: 3.2.0-incubating, 3.1.2-incubating
>            Reporter: Russell Spitzer
>
> If the persisted Spark Context is killed by the user via the Spark UI or is terminated for some other error, the Gremlin Console/Server is left with a stopped Spark Context. This could be caught and the Spark context recreated. Oddly enough, if you simply wait, the context will "reset" itself or possibly get GC'd out of the system and everything works again.
>
> ## Repro
> {code}
> gremlin> g.V().count()
> WARN org.apache.tinkerpop.gremlin.spark.process.computer.SparkGraphComputer - HADOOP_GREMLIN_LIBS is not set -- proceeding regardless
> ==>6
> gremlin> ERROR org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend - Application has been killed. Reason: Master removed our application: KILLED
> ERROR org.apache.spark.scheduler.TaskSchedulerImpl - Lost executor 0 on 10.150.0.180: Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages.
> // Driver has been killed here via the Master UI
> gremlin> graph = GraphFactory.open('conf/hadoop/hadoop-gryo.properties')
> ==>hadoopgraph[gryoinputformat->gryooutputformat]
> gremlin> g.V().count()
> WARN org.apache.tinkerpop.gremlin.spark.process.computer.SparkGraphComputer - HADOOP_GREMLIN_LIBS is not set -- proceeding regardless
> java.lang.IllegalStateException: Cannot call methods on a stopped SparkContext.
> This stopped SparkContext was created at:
> org.apache.spark.SparkContext.getOrCreate(SparkContext.scala)
> org.apache.tinkerpop.gremlin.spark.structure.Spark.create(Spark.java:53)
> org.apache.tinkerpop.gremlin.spark.structure.io.SparkContextStorage.open(SparkContextStorage.java:60)
> org.apache.tinkerpop.gremlin.spark.process.computer.SparkGraphComputer.lambda$submitWithExecutor$1(SparkGraphComputer.java:122)
> java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> java.lang.Thread.run(Thread.java:745)
> The currently active SparkContext was created at:
> org.apache.spark.SparkContext.getOrCreate(SparkContext.scala)
> org.apache.tinkerpop.gremlin.spark.structure.Spark.create(Spark.java:53)
> org.apache.tinkerpop.gremlin.spark.structure.io.SparkContextStorage.open(SparkContextStorage.java:60)
> org.apache.tinkerpop.gremlin.spark.process.computer.SparkGraphComputer.lambda$submitWithExecutor$1(SparkGraphComputer.java:122)
> java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> java.lang.Thread.run(Thread.java:745)
> {code}
> Full trace from TP
> {code}
> at org.apache.spark.SparkContext.org$apache$spark$SparkContext$$assertNotStopped(SparkContext.scala:106)
> at org.apache.spark.SparkContext$$anonfun$newAPIHadoopRDD$1.apply(SparkContext.scala:1130)
> at org.apache.spark.SparkContext$$anonfun$newAPIHadoopRDD$1.apply(SparkContext.scala:1129)
> at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
> at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
> at org.apache.spark.SparkContext.withScope(SparkContext.scala:714)
> at org.apache.spark.SparkContext.newAPIHadoopRDD(SparkContext.scala:1129)
> at org.apache.spark.api.java.JavaSparkContext.newAPIHadoopRDD(JavaSparkContext.scala:507)
> at org.apache.tinkerpop.gremlin.spark.structure.io.InputFormatRDD.readGraphRDD(InputFormatRDD.java:42)
> at org.apache.tinkerpop.gremlin.spark.process.computer.SparkGraphComputer.lambda$submitWithExecutor$1(SparkGraph