[jira] [Commented] (TINKERPOP-1616) Strengthen semantics around lazy iteration and graph modifications
[ https://issues.apache.org/jira/browse/TINKERPOP-1616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16384549#comment-16384549 ]

pieter martin commented on TINKERPOP-1616:
------------------------------------------

I do not have any {{OptOut}} tests for this as such. The test that originally highlighted the issue is {{EdgeTest.shouldNotHaveAConcurrentModificationExceptionWhenIteratingAndRemovingAddingEdges}}.

However, the issue remains. TinkerPop encourages using inline {{addV}} and {{addE}}, but because {{gremlin}} specifies neither iteration order nor iteration depth/laziness, a user will never know exactly what the result will be. I do not think this can be left to provider documentation. The language needs to be exactly specified. The same query cannot have different outcomes on different providers.

{code:java}
@Test
public void testLazy1AddE() {
    final TinkerGraph g = TinkerGraph.open();
    final Vertex a1 = g.addVertex(T.label, "A");
    final Vertex b1 = g.addVertex(T.label, "B");
    final Vertex c1 = g.addVertex(T.label, "C");
    a1.addEdge("ab", b1);
    a1.addEdge("ac", c1);
    // Adds one "ab" edge per traverser coming out of both().
    GraphTraversal<Vertex, Edge> t = g.traversal().V(a1).both().addE("ab").from(a1).to(b1);
    List<Edge> edges = t.toList();
    System.out.println(edges.size());
}
{code}

Here is the same example again, but this time using inline {{addE}}.

For TinkerGraph the result is 2. This is because TinkerGraph's implementation behaves as though there is a {{barrier}} step before the {{addE}}.

For Sqlg the result is 3. Sqlg is lazier than TinkerGraph, and by the time the second {{both}} iteration happens it reads the previously added {{edge}}.

Perhaps the default should be to add a {{barrier}} step to all modification steps?
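To make the effect of a {{barrier}} concrete, here is a minimal plain-Java sketch. It is a toy model and not TinkerPop or Sqlg code: the out-edges of a1 are reduced to a list of labels, {{both}} is modelled as one lazily executed query per label, and the query order ("ac" before "ab") is an assumption chosen only for illustration. The names {{BarrierSketch}}, {{lazyBoth}}, {{barrier}} and {{addEdgePerTraverser}} are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Toy model only: the out-edges of a1 are a list of labels, and both() is
// modelled as one lazily executed query per label.
final class BarrierSketch {

    // Lazily runs one "query" per label. Each query snapshots the matching
    // edges only when the consumer first asks for them.
    static Iterator<String> lazyBoth(List<String> outLabels, List<String> queryOrder) {
        return new Iterator<String>() {
            private final Iterator<String> pending = queryOrder.iterator();
            private Iterator<String> current = Collections.emptyIterator();

            @Override
            public boolean hasNext() {
                while (!current.hasNext() && pending.hasNext()) {
                    String label = pending.next();
                    List<String> snapshot = new ArrayList<>();
                    for (String edgeLabel : outLabels) {
                        if (edgeLabel.equals(label)) {
                            snapshot.add(edgeLabel);
                        }
                    }
                    current = snapshot.iterator();
                }
                return current.hasNext();
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return current.next();
            }
        };
    }

    // Drains the upstream completely before anything downstream can run.
    static <T> Iterator<T> barrier(Iterator<T> upstream) {
        List<T> buffered = new ArrayList<>();
        upstream.forEachRemaining(buffered::add);
        return buffered.iterator();
    }

    // Stand-in for addE("ab").from(a1).to(b1): one new "ab" edge per traverser.
    static int addEdgePerTraverser(Iterator<String> traversers, List<String> outLabels) {
        int added = 0;
        while (traversers.hasNext()) {
            traversers.next();
            outLabels.add("ab");
            added++;
        }
        return added;
    }

    static int run(boolean withBarrier) {
        // a1 starts with one "ab" and one "ac" out-edge, as in the test above.
        List<String> outLabels = new ArrayList<>(Arrays.asList("ab", "ac"));
        // Assumed query order, chosen only for illustration.
        Iterator<String> both = lazyBoth(outLabels, Arrays.asList("ac", "ab"));
        return addEdgePerTraverser(withBarrier ? barrier(both) : both, outLabels);
    }

    public static void main(String[] args) {
        System.out.println("without barrier: " + run(false));
        System.out.println("with barrier:    " + run(true));
    }
}
```

In this model the run without the barrier adds 3 edges and the run with the barrier adds 2. Reversing the assumed query order to "ab" before "ac" gives 2 in both cases, which is the kind of order dependence the comment is worried about.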
> Strengthen semantics around lazy iteration and graph modifications
> ------------------------------------------------------------------
>
>                 Key: TINKERPOP-1616
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1616
>             Project: TinkerPop
>          Issue Type: Improvement
>          Components: structure
>    Affects Versions: 3.2.3
>            Reporter: pieter martin
>            Assignee: stephen mallette
>            Priority: Major
>
> The investigation started with a bothE query where Sqlg returned
> different results to TinkerGraph.
> {code}
> @Test
> public void testLazy1() {
>     final TinkerGraph graph = TinkerGraph.open();
>     final Vertex a1 = graph.addVertex(T.label, "A");
>     final Vertex b1 = graph.addVertex(T.label, "B");
>     final Vertex c1 = graph.addVertex(T.label, "C");
>     a1.addEdge("ab", b1);
>     a1.addEdge("ac", c1);
>     AtomicInteger count = new AtomicInteger(0);
>     graph.traversal().V(a1).bothE().forEachRemaining(edge -> {
>         a1.addEdge("ab", b1);
>         c1.addEdge("ac", a1);
>         count.getAndIncrement();
>     });
>     Assert.assertEquals(2, count.get());
> }
> {code}
> For this query TinkerGraph returns 2 and passes.
> Sqlg however returns 3. The reason is that it lazily iterates the out()
> first and then the in().
> The following gremlin is the same but uses a union(outE(), inE()) instead of
> bothE().
> {code}
> @Test
> public void testLazy2() {
>     final TinkerGraph graph = TinkerGraph.open();
>     final Vertex a1 = graph.addVertex(T.label, "A");
>     final Vertex b1 = graph.addVertex(T.label, "B");
>     final Vertex c1 = graph.addVertex(T.label, "C");
>     a1.addEdge("ab", b1);
>     a1.addEdge("ac", c1);
>     AtomicInteger count = new AtomicInteger(0);
>     graph.traversal().V(a1).union(__.outE(), __.inE()).forEachRemaining(edge -> {
>         a1.addEdge("ab", b1);
>         c1.addEdge("ac", a1);
>         count.getAndIncrement();
>     });
>     Assert.assertEquals(2, count.get());
> }
> {code}
> In this case TinkerGraph returns 4 and Sqlg 6.
> TinkerGraph returns 4 as it first walks the 2 out edges and adds 2 in edges,
> which it sees when traversing the in().
> Sqlg returns 6 as it is lazier than TinkerGraph.
> It first walks the "ac" out edge and adds in the 2 edges.
> Then it walks "ab" and gets 2 edges: the original and the one added previously.
> It then walks "ac" in and gets 3 edges, as 3 have been added so far.
> All in all, 6.
> I am not sure what the expected semantics are. Sqlg is lazier than TinkerGraph
> but not completely lazy either, as it depends more on the meta data and the number
> of queries it needs to execute to walk a particular gremlin query.
> I am somewhat of the opinion that without enforcing eager iteration when
> modifying a graph, the semantics will be different for different implementors.
> For Sqlg at least it will be hard for clients to predict the behavior.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
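The counts 4 and 6 described for testLazy2 can be reproduced with a small plain-Java model in which the only thing that varies is how many separate snapshot queries the traversal is split into. This is a sketch of the explanation given in the description, under the assumption that each query sees exactly the edges present when it starts; it does not reflect the actual TinkerGraph or Sqlg code, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Toy model only: a graph is a list of edges, and a "query" is a snapshot of
// the matching edges taken at the moment the query starts.
final class SnapshotGranularitySketch {

    static final class Edge {
        final String label;
        final String from;
        final String to;

        Edge(String label, String from, String to) {
            this.label = label;
            this.from = from;
            this.to = to;
        }
    }

    static Predicate<Edge> outgoing() { return e -> e.from.equals("a1"); }
    static Predicate<Edge> incoming() { return e -> e.to.equals("a1"); }
    static Predicate<Edge> labelled(String label) { return e -> e.label.equals(label); }

    // Runs the queries one after another on a fresh copy of the test graph.
    // A query sees edges added by earlier callbacks, but not edges added
    // while it is itself being iterated.
    static int traverse(List<Predicate<Edge>> queries) {
        List<Edge> graph = new ArrayList<>();
        graph.add(new Edge("ab", "a1", "b1"));
        graph.add(new Edge("ac", "a1", "c1"));
        int count = 0;
        for (Predicate<Edge> query : queries) {
            List<Edge> snapshot = graph.stream().filter(query).collect(Collectors.toList());
            for (Edge ignored : snapshot) {
                // Same side effect as the forEachRemaining callback in the tests.
                graph.add(new Edge("ab", "a1", "b1"));
                graph.add(new Edge("ac", "c1", "a1"));
                count++;
            }
        }
        return count;
    }

    // One snapshot for the whole traversal (fully eager).
    static int singleSnapshot() {
        return traverse(Collections.singletonList(outgoing().or(incoming())));
    }

    // One snapshot per direction: out edges first, then in edges.
    static int perDirection() {
        return traverse(Arrays.asList(outgoing(), incoming()));
    }

    // One snapshot per direction and label, in the order the description gives.
    static int perDirectionAndLabel() {
        return traverse(Arrays.asList(
                outgoing().and(labelled("ac")),
                outgoing().and(labelled("ab")),
                incoming().and(labelled("ab")),
                incoming().and(labelled("ac"))));
    }

    public static void main(String[] args) {
        System.out.println("single snapshot:         " + singleSnapshot());
        System.out.println("per direction:           " + perDirection());
        System.out.println("per direction and label: " + perDirectionAndLabel());
    }
}
```

In this model a single snapshot yields 2, one snapshot per direction yields 4, and one snapshot per direction and label yields 6. The finer the granularity at which queries are executed, the more of the callback's own additions the traversal observes.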
[jira] [Commented] (TINKERPOP-1616) Strengthen semantics around lazy iteration and graph modifications
[ https://issues.apache.org/jira/browse/TINKERPOP-1616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16052000#comment-16052000 ]

Daniel Kuppitz commented on TINKERPOP-1616:
-------------------------------------------

I think we had this discussion on the list before. Or Slack? Dunno.. Anyway, I don't think we can say that one result is right and the other is wrong. Maybe {{LazyEvaluation}} should be a graph feature..?
[jira] [Commented] (TINKERPOP-1616) Strengthen semantics around lazy iteration and graph modifications
[ https://issues.apache.org/jira/browse/TINKERPOP-1616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16051926#comment-16051926 ]

stephen mallette commented on TINKERPOP-1616:
---------------------------------------------

[~pietermartin] i just noticed this issue. i tend to feel like TinkerGraph has what I would expect as a user. What I add as side-effects probably shouldn't reflect in the output. cc/ [~dkuppitz]