[jira] [Commented] (TINKERPOP-1525) Plug VertexProgram iteration leak on empty Spark RDD partitions

2016-10-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TINKERPOP-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15603693#comment-15603693
 ] 

ASF GitHub Bot commented on TINKERPOP-1525:
---

GitHub user dalaro opened a pull request:

https://github.com/apache/tinkerpop/pull/462

TINKERPOP-1525 Avoid starting VP worker iterations that never end

SparkExecutor.executeVertexProgramIteration was written in such a way
that an empty RDD partition would cause it to invoke
VertexProgram.workerIterationStart without ever invoking
VertexProgram.workerIterationEnd.  This seems like a contract
violation.  I have at least one VP that relies on
workerIterationStart|End to allocate and release resources.  Failing
to invoke End like this causes a leak in that VP, as it would for any
VP that uses that resource management pattern.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dalaro/incubator-tinkerpop TINKERPOP-1525-tp32

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/tinkerpop/pull/462.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #462


commit 1fc6ec5d7c20d6163c61f84198e3e68457e23eae
Author: Dan LaRocque 
Date:   2016-10-21T21:04:30Z

Avoid starting VP worker iterations that never end

SparkExecutor.executeVertexProgramIteration was written in such a way
that an empty RDD partition would cause it to invoke
VertexProgram.workerIterationStart without ever invoking
VertexProgram.workerIterationEnd.  This seems like a contract
violation.  I have at least one VP that relies on
workerIterationStart|End to allocate and release resources.  Failing
to invoke End like this causes a leak in that VP, as it would for any
VP that uses that resource management pattern.




> Plug VertexProgram iteration leak on empty Spark RDD partitions
> ---
>
> Key: TINKERPOP-1525
> URL: https://issues.apache.org/jira/browse/TINKERPOP-1525
> Project: TinkerPop
>  Issue Type: Bug
>Affects Versions: 3.2.3, 3.3.0
>Reporter: Dan LaRocque
>
> If SparkExecutor gets an RDD with empty partitions, it can invoke 
> {{VertexProgram.workerIterationStart}} without ever invoking 
> {{VertexProgram.workerIterationEnd}}.
> For vertex programs that allocate and release meaningful resources in the 
> start/end methods, this can lead to resource leaks.
> I already tested a fix that I made against the 3.2 series.  I will submit PRs 
> momentarily.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] tinkerpop pull request #462: TINKERPOP-1525 Avoid starting VP worker iterati...

2016-10-24 Thread dalaro
GitHub user dalaro opened a pull request:

https://github.com/apache/tinkerpop/pull/462

TINKERPOP-1525 Avoid starting VP worker iterations that never end

SparkExecutor.executeVertexProgramIteration was written in such a way
that an empty RDD partition would cause it to invoke
VertexProgram.workerIterationStart without ever invoking
VertexProgram.workerIterationEnd.  This seems like a contract
violation.  I have at least one VP that relies on
workerIterationStart|End to allocate and release resources.  Failing
to invoke End like this causes a leak in that VP, as it would for any
VP that uses that resource management pattern.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dalaro/incubator-tinkerpop TINKERPOP-1525-tp32

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/tinkerpop/pull/462.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #462


commit 1fc6ec5d7c20d6163c61f84198e3e68457e23eae
Author: Dan LaRocque 
Date:   2016-10-21T21:04:30Z

Avoid starting VP worker iterations that never end

SparkExecutor.executeVertexProgramIteration was written in such a way
that an empty RDD partition would cause it to invoke
VertexProgram.workerIterationStart without ever invoking
VertexProgram.workerIterationEnd.  This seems like a contract
violation.  I have at least one VP that relies on
workerIterationStart|End to allocate and release resources.  Failing
to invoke End like this causes a leak in that VP, as it would for any
VP that uses that resource management pattern.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (TINKERPOP-932) Add ability to cancel script execution associated with a Gremlin Server Session

2016-10-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TINKERPOP-932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15603167#comment-15603167
 ] 

ASF GitHub Bot commented on TINKERPOP-932:
--

GitHub user spmallette opened a pull request:

https://github.com/apache/tinkerpop/pull/461

TINKERPOP-932 Added "force" option on session close.

https://issues.apache.org/jira/browse/TINKERPOP-932

Allows session close requests to interrupt a long run job at the cost of 
not closing transactions.

Tested with: `mvn clean install && mvn verify -pl gremlin-server 
-DskipIntegrationTests=false -DincludeNeo4j`

VOTE +1

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/apache/tinkerpop TINKERPOP-932

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/tinkerpop/pull/461.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #461






> Add ability to cancel script execution associated with a Gremlin Server 
> Session 
> 
>
> Key: TINKERPOP-932
> URL: https://issues.apache.org/jira/browse/TINKERPOP-932
> Project: TinkerPop
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.0.2-incubating
>Reporter: Zachary Kurey
>Assignee: stephen mallette
> Fix For: 3.2.4
>
>
> Currently with a {{SessionedClient}} there is no way to cancel a long running 
> script and the client has to depend on Gremlin Server side configured 
> timeouts before they can execute another script associated with the same 
> session id.
> There is a way we can forcefully close a session from the client side, or 
> just close the entire Gremlin client.  But it would be useful for client side 
> applications to be able to cancel script execution, have its intermediate 
> effects rolled back, and be able to continue interacting with the session 
> without losing session variable state maintained on the Gremlin server side.
> Unsure where this should live at an API level, since canceling by session id 
> isn't relevant for all {{Client}} implementations.  If somehow when the 
> {{CompletableFuture}} returned by {{Client.submitAsync}} could do 
> this when the {{Future}} is canceled, that would be a nice way to bridge 
> implementations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] tinkerpop pull request #461: TINKERPOP-932 Added "force" option on session c...

2016-10-24 Thread spmallette
GitHub user spmallette opened a pull request:

https://github.com/apache/tinkerpop/pull/461

TINKERPOP-932 Added "force" option on session close.

https://issues.apache.org/jira/browse/TINKERPOP-932

Allows session close requests to interrupt a long run job at the cost of 
not closing transactions.

Tested with: `mvn clean install && mvn verify -pl gremlin-server 
-DskipIntegrationTests=false -DincludeNeo4j`

VOTE +1

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/apache/tinkerpop TINKERPOP-932

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/tinkerpop/pull/461.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #461






---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: PathRetractionStrategy and TraverserRequirement.PATH

2016-10-24 Thread pieter-gmail
Ok apologies. I thought I spotted the difference and simplified the
gremlin too much to highlight what I thought I saw. The above mentioned
queries are returning the same result in Sqlg as TinkerGraph.

Here is what is not working.

final TinkerGraph g = TinkerFactory.createModern();
GraphTraversal>
traversal = g.traversal()
.V().as("a")
.repeat(both()).times(3).emit().as("b")
.group().by(select("a"));
printTraversalForm(traversal);
while (traversal.hasNext()) {
Map vertexMap = traversal.next();
for (Vertex vertex : vertexMap.keySet()) {
Collection coll = vertexMap.get(vertex);
System.out.println("key: " + vertex.value("name") + ",
value: " + coll.size());
}
}

For this Sqlg has the same result as TinkerGraph.

TinkerGraph

post-strategy:[TinkerGraphStep(vertex,[])@[a],
RepeatStep([VertexStep(BOTH,vertex),
RepeatEndStep],until(loops(3)),emit(true))@[b],
GroupStep([SelectOneStep(a), NoOpBarrierStep(2500)],[FoldStep])]

Sqlg

post-strategy:[SqlgGraphStepCompiled(vertex,[])@[sqlgPathFakeLabel],
GroupStep([SelectOneStep(a)],[FoldStep])]

key: marko, value: 27
key: vadas, value: 11
key: lop, value: 27
key: josh, value: 27
key: ripple, value: 11
key: peter, value: 11

Adding in the extra by()

final TinkerGraph g = TinkerFactory.createModern();
GraphTraversal>
traversal = g.traversal()
.V().as("a")
.repeat(both()).times(3).emit().as("b")
.group().by(select("a"))
.by(select("b").dedup().order().by(T.id).fold());
printTraversalForm(traversal);
while (traversal.hasNext()) {
Map vertexMap = traversal.next();
for (Vertex vertex : vertexMap.keySet()) {
Collection coll = vertexMap.get(vertex);
System.out.println("key: " + vertex.value("name") + ",
value: " + coll.size());
}
}

TinkerGraph prints

post-strategy:[TinkerGraphStep(vertex,[])@[a],
RepeatStep([VertexStep(BOTH,vertex),
RepeatEndStep],until(loops(3)),emit(true))@[b],
GroupStep([SelectOneStep(a), NoOpBarrierStep(2500)],[SelectOneStep(b),
DedupGlobalStep, OrderGlobalStep([[id, incr]]), FoldStep])]

key: marko, value: 6
key: vadas, value: 6
key: lop, value: 6
key: josh, value: 6
key: ripple, value: 6
key: peter, value: 6

and Sqlg

post-strategy:[SqlgGraphStepCompiled(vertex,[])@[sqlgPathFakeLabel],
GroupStep([SelectOneStep(a)],[SelectOneStep(b), DedupGlobalStep,
OrderGlobalStep([[id, incr]]), FoldStep])]

key: marko, value: 0
key: ripple, value: 0
key: peter, value: 0
key: lop, value: 0
key: josh, value: 0
key: vadas, value: 0

The difference being the NoOpBarrierStep but I am not sure if that is
the culprit or not.

Thanks
Pieter






On 24/10/2016 21:31, Marko Rodriguez wrote:
> Hi Pieter,
>
> What are the two answers --- TinkerGraph and Sqlg for the respective test 
> traversal?
>
> (I suspect the test is bad because group() pushes traversers through with 
> bulks and all so the test might just add to a collection without adding 
> respecting bulks. Probably should change that test regardless to do like a 
> count or something instead).
>
> Marko.
>
> http://markorodriguez.com
>
>
>
>> On Oct 24, 2016, at 12:57 PM, pieter-gmail  wrote:
>>
>> Hi,
>>
>> This is on 3.2.3
>>
>> I have been investigating why
>> `DedupTest.g_V_asXaX_repeatXbothX_timesX3X_emit_asXbX_group_byXselectXaXX_byXselectXbX_dedup_order_byXidX_foldX_selectXvaluesX_unfold_dedup`
>> fails on Sqlg. It is a fairly recently added test.
>>
>> My investigation so far has narrowed the problem to the
>> `PathRetractionStrategy`
>>
>> On the modern graph,
>>
>>GraphTraversal>
>> traversal = g.traversal()
>>.V().as("a")
>>.out().as("b")
>>.group().by(select("a"))
>>.by(select("b"));
>>printTraversalForm(traversal);
>>
>> Outputs the following on TinkerGraph
>>
>> pre-strategy:[GraphStep(vertex,[])@[a], VertexStep(OUT,vertex)@[b],
>> GroupStep([SelectOneStep(a)],[SelectOneStep(b)])]
>>  post-strategy:[TinkerGraphStep(vertex,[])@[a],
>> VertexStep(OUT,vertex)@[b], GroupStep([SelectOneStep(a),
>> NoOpBarrierStep(2500)],[SelectOneStep(b), NoOpBarrierStep(2500)])]
>>
>> And on Sqlg
>>   pre-strategy:[GraphStep(vertex,[])@[a], VertexStep(OUT,vertex)@[b],
>> GroupStep([SelectOneStep(a)],[SelectOneStep(b)])]
>>  post-strategy:[SqlgGraphStepCompiled(vertex,[])@[b],
>> GroupStep([SelectOneStep(a)],[SelectOneStep(b)])]
>>
>> The difference being that Sqlg does not have the `NoOpBarrierStep` inserted.
>>
>> For TinkerGraph the `NoOpBarrierStep` is being inserted in the
>> `PathRetractionStrategy` 

Re: PathRetractionStrategy and TraverserRequirement.PATH

2016-10-24 Thread Marko Rodriguez
Hi Pieter,

What are the two answers --- TinkerGraph and Sqlg for the respective test 
traversal?

(I suspect the test is bad because group() pushes traversers through with bulks 
and all so the test might just add to a collection without adding respecting 
bulks. Probably should change that test regardless to do like a count or 
something instead).

Marko.

http://markorodriguez.com



> On Oct 24, 2016, at 12:57 PM, pieter-gmail  wrote:
> 
> Hi,
> 
> This is on 3.2.3
> 
> I have been investigating why
> `DedupTest.g_V_asXaX_repeatXbothX_timesX3X_emit_asXbX_group_byXselectXaXX_byXselectXbX_dedup_order_byXidX_foldX_selectXvaluesX_unfold_dedup`
> fails on Sqlg. It is a fairly recently added test.
> 
> My investigation so far has narrowed the problem to the
> `PathRetractionStrategy`
> 
> On the modern graph,
> 
>GraphTraversal>
> traversal = g.traversal()
>.V().as("a")
>.out().as("b")
>.group().by(select("a"))
>.by(select("b"));
>printTraversalForm(traversal);
> 
> Outputs the following on TinkerGraph
> 
> pre-strategy:[GraphStep(vertex,[])@[a], VertexStep(OUT,vertex)@[b],
> GroupStep([SelectOneStep(a)],[SelectOneStep(b)])]
>  post-strategy:[TinkerGraphStep(vertex,[])@[a],
> VertexStep(OUT,vertex)@[b], GroupStep([SelectOneStep(a),
> NoOpBarrierStep(2500)],[SelectOneStep(b), NoOpBarrierStep(2500)])]
> 
> And on Sqlg
>   pre-strategy:[GraphStep(vertex,[])@[a], VertexStep(OUT,vertex)@[b],
> GroupStep([SelectOneStep(a)],[SelectOneStep(b)])]
>  post-strategy:[SqlgGraphStepCompiled(vertex,[])@[b],
> GroupStep([SelectOneStep(a)],[SelectOneStep(b)])]
> 
> The difference being that Sqlg does not have the `NoOpBarrierStep` inserted.
> 
> For TinkerGraph the `NoOpBarrierStep` is being inserted in the
> `PathRetractionStrategy` on line 113
> However this does not happen for Sqlg as Sqlg's GraphStep has
> `TraverRequirement.PATH` as a requirement which prevents
> `PathRetractionStrategy` from doing what it does.
> 
> Is this a bug of sorts? Should Sqlg be adding in the `NoOpBarrierStep`?
> 
> Thanks
> Pieter



[jira] [Commented] (TINKERPOP-1474) API Alignment Between Java Gremlin Graph Structure and GLVs

2016-10-24 Thread Andy Tolbert (JIRA)

[ 
https://issues.apache.org/jira/browse/TINKERPOP-1474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15601813#comment-15601813
 ] 

Andy Tolbert commented on TINKERPOP-1474:
-

I think that is a key point that isn't really clear to me.  So is deserializing 
gremlin bytecode into objects (vertex, edge, vertexproperty, etc.) out of scope 
of gremlin-python then?

> API Alignment Between Java Gremlin Graph Structure and GLVs
> ---
>
> Key: TINKERPOP-1474
> URL: https://issues.apache.org/jira/browse/TINKERPOP-1474
> Project: TinkerPop
>  Issue Type: Bug
>  Components: io
>Affects Versions: 3.2.2
>Reporter: Adam Holmberg
>
> The current Java GraphSON implementation and that in the Python GLV leave 
> some question about what *should* be returned from a simple traversal like 
> `g.V()`.
> The java implementation presently assumes that properties could be present 
> and returns a DetachedVertex: 
> https://github.com/apache/tinkerpop/blob/master/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/structure/io/graphson/GraphSONSerializersV2d0.java#L420-L433
> The python implementation assumes no such thing and returns something more 
> reminiscent of a ReferenceVertex: 
> https://github.com/apache/tinkerpop/blob/master/gremlin-python/src/main/jython/gremlin_python/structure/io/graphson.py#L238-L242
> Is the java version overreaching, and should not expect properties unless a 
> step calls for them (e.g. ` g.V().valueMap()` or `g.V().values('name')`, or 
> should the Python version be expanded?
> Is there something we can do to establish guidelines for this, and align 
> these APIs?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)