[jira] [Commented] (TINKERPOP-1525) Plug VertexProgram iteration leak on empty Spark RDD partitions
[ https://issues.apache.org/jira/browse/TINKERPOP-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15603693#comment-15603693 ] ASF GitHub Bot commented on TINKERPOP-1525: --- GitHub user dalaro opened a pull request: https://github.com/apache/tinkerpop/pull/462 TINKERPOP-1525 Avoid starting VP worker iterations that never end SparkExecutor.executeVertexProgramIteration was written in such a way that an empty RDD partition would cause it to invoke VertexProgram.workerIterationStart without ever invoking VertexProgram.workerIterationEnd. This seems like a contract violation. I have at least one VP that relies on workerIterationStart|End to allocate and release resources. Failing to invoke End like this causes a leak in that VP, as it would for any VP that uses that resource management pattern. You can merge this pull request into a Git repository by running: $ git pull https://github.com/dalaro/incubator-tinkerpop TINKERPOP-1525-tp32 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/tinkerpop/pull/462.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #462 commit 1fc6ec5d7c20d6163c61f84198e3e68457e23eae Author: Dan LaRocqueDate: 2016-10-21T21:04:30Z Avoid starting VP worker iterations that never end SparkExecutor.executeVertexProgramIteration was written in such a way that an empty RDD partition would cause it to invoke VertexProgram.workerIterationStart without ever invoking VertexProgram.workerIterationEnd. This seems like a contract violation. I have at least one VP that relies on workerIterationStart|End to allocate and release resources. Failing to invoke End like this causes a leak in that VP, as it would for any VP that uses that resource management pattern. > Plug VertexProgram iteration leak on empty Spark RDD partitions > --- > > Key: TINKERPOP-1525 > URL: https://issues.apache.org/jira/browse/TINKERPOP-1525 > Project: TinkerPop > Issue Type: Bug >Affects Versions: 3.2.3, 3.3.0 >Reporter: Dan LaRocque > > If SparkExecutor gets an RDD with empty partitions, it can invoke > {{VertexProgram.workerIterationStart}} without ever invoking > {{VertexProgram.workerIterationEnd}}. > For vertex programs that allocate and release meaningful resources in the > start/end methods, this can lead to resource leaks. > I already tested a fix that I made against the 3.2 series. I will submit PRs > momentarily. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] tinkerpop pull request #462: TINKERPOP-1525 Avoid starting VP worker iterati...
GitHub user dalaro opened a pull request: https://github.com/apache/tinkerpop/pull/462 TINKERPOP-1525 Avoid starting VP worker iterations that never end SparkExecutor.executeVertexProgramIteration was written in such a way that an empty RDD partition would cause it to invoke VertexProgram.workerIterationStart without ever invoking VertexProgram.workerIterationEnd. This seems like a contract violation. I have at least one VP that relies on workerIterationStart|End to allocate and release resources. Failing to invoke End like this causes a leak in that VP, as it would for any VP that uses that resource management pattern. You can merge this pull request into a Git repository by running: $ git pull https://github.com/dalaro/incubator-tinkerpop TINKERPOP-1525-tp32 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/tinkerpop/pull/462.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #462 commit 1fc6ec5d7c20d6163c61f84198e3e68457e23eae Author: Dan LaRocqueDate: 2016-10-21T21:04:30Z Avoid starting VP worker iterations that never end SparkExecutor.executeVertexProgramIteration was written in such a way that an empty RDD partition would cause it to invoke VertexProgram.workerIterationStart without ever invoking VertexProgram.workerIterationEnd. This seems like a contract violation. I have at least one VP that relies on workerIterationStart|End to allocate and release resources. Failing to invoke End like this causes a leak in that VP, as it would for any VP that uses that resource management pattern. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (TINKERPOP-932) Add ability to cancel script execution associated with a Gremlin Server Session
[ https://issues.apache.org/jira/browse/TINKERPOP-932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15603167#comment-15603167 ] ASF GitHub Bot commented on TINKERPOP-932: -- GitHub user spmallette opened a pull request: https://github.com/apache/tinkerpop/pull/461 TINKERPOP-932 Added "force" option on session close. https://issues.apache.org/jira/browse/TINKERPOP-932 Allows session close requests to interrupt a long run job at the cost of not closing transactions. Tested with: `mvn clean install && mvn verify -pl gremlin-server -DskipIntegrationTests=false -DincludeNeo4j` VOTE +1 You can merge this pull request into a Git repository by running: $ git pull https://github.com/apache/tinkerpop TINKERPOP-932 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/tinkerpop/pull/461.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #461 > Add ability to cancel script execution associated with a Gremlin Server > Session > > > Key: TINKERPOP-932 > URL: https://issues.apache.org/jira/browse/TINKERPOP-932 > Project: TinkerPop > Issue Type: Improvement > Components: server >Affects Versions: 3.0.2-incubating >Reporter: Zachary Kurey >Assignee: stephen mallette > Fix For: 3.2.4 > > > Currently with a {{SessionedClient}} there is no way to cancel a long running > script and the client has to depend on Gremlin Server side configured > timeouts before they can execute another script associated with the same > session id. > There is a way we can forcefully close a session from the client side, or > just close the entire Gremlin client. But it would be useful for client side > applications to be able to cancel script execution, have its intermediate > effects rolled back, and be able to continue interacting with the session > without losing session variable state maintained on the Gremlin server side. > Unsure where this should live at an API level, since canceling by session id > isn't relevant for all {{Client}} implementations. If somehow when the > {{CompletableFuture}} returned by {{Client.submitAsync}} could do > this when the {{Future}} is canceled, that would be a nice way to bridge > implementations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] tinkerpop pull request #461: TINKERPOP-932 Added "force" option on session c...
GitHub user spmallette opened a pull request: https://github.com/apache/tinkerpop/pull/461 TINKERPOP-932 Added "force" option on session close. https://issues.apache.org/jira/browse/TINKERPOP-932 Allows session close requests to interrupt a long run job at the cost of not closing transactions. Tested with: `mvn clean install && mvn verify -pl gremlin-server -DskipIntegrationTests=false -DincludeNeo4j` VOTE +1 You can merge this pull request into a Git repository by running: $ git pull https://github.com/apache/tinkerpop TINKERPOP-932 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/tinkerpop/pull/461.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #461 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
Re: PathRetractionStrategy and TraverserRequirement.PATH
Ok apologies. I thought I spotted the difference and simplified the gremlin too much to highlight what I thought I saw. The above mentioned queries are returning the same result in Sqlg as TinkerGraph. Here is what is not working. final TinkerGraph g = TinkerFactory.createModern(); GraphTraversal> traversal = g.traversal() .V().as("a") .repeat(both()).times(3).emit().as("b") . group().by(select("a")); printTraversalForm(traversal); while (traversal.hasNext()) { Map vertexMap = traversal.next(); for (Vertex vertex : vertexMap.keySet()) { Collection coll = vertexMap.get(vertex); System.out.println("key: " + vertex.value("name") + ", value: " + coll.size()); } } For this Sqlg has the same result as TinkerGraph. TinkerGraph post-strategy:[TinkerGraphStep(vertex,[])@[a], RepeatStep([VertexStep(BOTH,vertex), RepeatEndStep],until(loops(3)),emit(true))@[b], GroupStep([SelectOneStep(a), NoOpBarrierStep(2500)],[FoldStep])] Sqlg post-strategy:[SqlgGraphStepCompiled(vertex,[])@[sqlgPathFakeLabel], GroupStep([SelectOneStep(a)],[FoldStep])] key: marko, value: 27 key: vadas, value: 11 key: lop, value: 27 key: josh, value: 27 key: ripple, value: 11 key: peter, value: 11 Adding in the extra by() final TinkerGraph g = TinkerFactory.createModern(); GraphTraversal > traversal = g.traversal() .V().as("a") .repeat(both()).times(3).emit().as("b") . group().by(select("a")) .by(select("b").dedup().order().by(T.id).fold()); printTraversalForm(traversal); while (traversal.hasNext()) { Map vertexMap = traversal.next(); for (Vertex vertex : vertexMap.keySet()) { Collection coll = vertexMap.get(vertex); System.out.println("key: " + vertex.value("name") + ", value: " + coll.size()); } } TinkerGraph prints post-strategy:[TinkerGraphStep(vertex,[])@[a], RepeatStep([VertexStep(BOTH,vertex), RepeatEndStep],until(loops(3)),emit(true))@[b], GroupStep([SelectOneStep(a), NoOpBarrierStep(2500)],[SelectOneStep(b), DedupGlobalStep, OrderGlobalStep([[id, incr]]), FoldStep])] key: marko, value: 6 key: vadas, value: 6 key: lop, value: 6 key: josh, value: 6 key: ripple, value: 6 key: peter, value: 6 and Sqlg post-strategy:[SqlgGraphStepCompiled(vertex,[])@[sqlgPathFakeLabel], GroupStep([SelectOneStep(a)],[SelectOneStep(b), DedupGlobalStep, OrderGlobalStep([[id, incr]]), FoldStep])] key: marko, value: 0 key: ripple, value: 0 key: peter, value: 0 key: lop, value: 0 key: josh, value: 0 key: vadas, value: 0 The difference being the NoOpBarrierStep but I am not sure if that is the culprit or not. Thanks Pieter On 24/10/2016 21:31, Marko Rodriguez wrote: > Hi Pieter, > > What are the two answers --- TinkerGraph and Sqlg for the respective test > traversal? > > (I suspect the test is bad because group() pushes traversers through with > bulks and all so the test might just add to a collection without adding > respecting bulks. Probably should change that test regardless to do like a > count or something instead). > > Marko. > > http://markorodriguez.com > > > >> On Oct 24, 2016, at 12:57 PM, pieter-gmail wrote: >> >> Hi, >> >> This is on 3.2.3 >> >> I have been investigating why >> `DedupTest.g_V_asXaX_repeatXbothX_timesX3X_emit_asXbX_group_byXselectXaXX_byXselectXbX_dedup_order_byXidX_foldX_selectXvaluesX_unfold_dedup` >> fails on Sqlg. It is a fairly recently added test. >> >> My investigation so far has narrowed the problem to the >> `PathRetractionStrategy` >> >> On the modern graph, >> >>GraphTraversal > >> traversal = g.traversal() >>.V().as("a") >>.out().as("b") >>. group().by(select("a")) >>.by(select("b")); >>printTraversalForm(traversal); >> >> Outputs the following on TinkerGraph >> >> pre-strategy:[GraphStep(vertex,[])@[a], VertexStep(OUT,vertex)@[b], >> GroupStep([SelectOneStep(a)],[SelectOneStep(b)])] >> post-strategy:[TinkerGraphStep(vertex,[])@[a], >> VertexStep(OUT,vertex)@[b], GroupStep([SelectOneStep(a), >> NoOpBarrierStep(2500)],[SelectOneStep(b), NoOpBarrierStep(2500)])] >> >> And on Sqlg >> pre-strategy:[GraphStep(vertex,[])@[a], VertexStep(OUT,vertex)@[b], >> GroupStep([SelectOneStep(a)],[SelectOneStep(b)])] >> post-strategy:[SqlgGraphStepCompiled(vertex,[])@[b], >> GroupStep([SelectOneStep(a)],[SelectOneStep(b)])] >> >> The difference being that Sqlg does not have the `NoOpBarrierStep` inserted. >> >> For TinkerGraph the `NoOpBarrierStep` is being inserted in the >> `PathRetractionStrategy`
Re: PathRetractionStrategy and TraverserRequirement.PATH
Hi Pieter, What are the two answers --- TinkerGraph and Sqlg for the respective test traversal? (I suspect the test is bad because group() pushes traversers through with bulks and all so the test might just add to a collection without adding respecting bulks. Probably should change that test regardless to do like a count or something instead). Marko. http://markorodriguez.com > On Oct 24, 2016, at 12:57 PM, pieter-gmailwrote: > > Hi, > > This is on 3.2.3 > > I have been investigating why > `DedupTest.g_V_asXaX_repeatXbothX_timesX3X_emit_asXbX_group_byXselectXaXX_byXselectXbX_dedup_order_byXidX_foldX_selectXvaluesX_unfold_dedup` > fails on Sqlg. It is a fairly recently added test. > > My investigation so far has narrowed the problem to the > `PathRetractionStrategy` > > On the modern graph, > >GraphTraversal > > traversal = g.traversal() >.V().as("a") >.out().as("b") >. group().by(select("a")) >.by(select("b")); >printTraversalForm(traversal); > > Outputs the following on TinkerGraph > > pre-strategy:[GraphStep(vertex,[])@[a], VertexStep(OUT,vertex)@[b], > GroupStep([SelectOneStep(a)],[SelectOneStep(b)])] > post-strategy:[TinkerGraphStep(vertex,[])@[a], > VertexStep(OUT,vertex)@[b], GroupStep([SelectOneStep(a), > NoOpBarrierStep(2500)],[SelectOneStep(b), NoOpBarrierStep(2500)])] > > And on Sqlg > pre-strategy:[GraphStep(vertex,[])@[a], VertexStep(OUT,vertex)@[b], > GroupStep([SelectOneStep(a)],[SelectOneStep(b)])] > post-strategy:[SqlgGraphStepCompiled(vertex,[])@[b], > GroupStep([SelectOneStep(a)],[SelectOneStep(b)])] > > The difference being that Sqlg does not have the `NoOpBarrierStep` inserted. > > For TinkerGraph the `NoOpBarrierStep` is being inserted in the > `PathRetractionStrategy` on line 113 > However this does not happen for Sqlg as Sqlg's GraphStep has > `TraverRequirement.PATH` as a requirement which prevents > `PathRetractionStrategy` from doing what it does. > > Is this a bug of sorts? Should Sqlg be adding in the `NoOpBarrierStep`? > > Thanks > Pieter
[jira] [Commented] (TINKERPOP-1474) API Alignment Between Java Gremlin Graph Structure and GLVs
[ https://issues.apache.org/jira/browse/TINKERPOP-1474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15601813#comment-15601813 ] Andy Tolbert commented on TINKERPOP-1474: - I think that is a key point that isn't really clear to me. So is deserializing gremlin bytecode into objects (vertex, edge, vertexproperty, etc.) out of scope of gremlin-python then? > API Alignment Between Java Gremlin Graph Structure and GLVs > --- > > Key: TINKERPOP-1474 > URL: https://issues.apache.org/jira/browse/TINKERPOP-1474 > Project: TinkerPop > Issue Type: Bug > Components: io >Affects Versions: 3.2.2 >Reporter: Adam Holmberg > > The current Java GraphSON implementation and that in the Python GLV leave > some question about what *should* be returned from a simple traversal like > `g.V()`. > The java implementation presently assumes that properties could be present > and returns a DetachedVertex: > https://github.com/apache/tinkerpop/blob/master/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/structure/io/graphson/GraphSONSerializersV2d0.java#L420-L433 > The python implementation assumes no such thing and returns something more > reminiscent of a ReferenceVertex: > https://github.com/apache/tinkerpop/blob/master/gremlin-python/src/main/jython/gremlin_python/structure/io/graphson.py#L238-L242 > Is the java version overreaching, and should not expect properties unless a > step calls for them (e.g. ` g.V().valueMap()` or `g.V().values('name')`, or > should the Python version be expanded? > Is there something we can do to establish guidelines for this, and align > these APIs? -- This message was sent by Atlassian JIRA (v6.3.4#6332)