[ 
https://issues.apache.org/jira/browse/TINKERPOP-2490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17715490#comment-17715490
 ] 

Oleksandr Porunov edited comment on TINKERPOP-2490 at 4/23/23 4:25 PM:
-----------------------------------------------------------------------

This bug still exists in TinkerPop (checked version 3.6.2). An example test 
which shows when this behavior is not expected is below:
{code:java}
@Test
public void testLimitedFilterNotChecksElementsOverTheLimit(){
    TinkerGraph tinkerGraph = TinkerGraph.open();
    List<Vertex> vertices = new ArrayList<>();
    for(int i=0; i<3; i++){
        Vertex v1 = tinkerGraph.addVertex();
        Vertex v2 = tinkerGraph.addVertex();
        v1.addEdge("connects", v2);
        vertices.add(v1);
    }
    tinkerGraph.traversal().V(vertices.get(0), vertices.get(1), vertices.get(2))
        .where(__.out("connects").count().is(P.gte(1))).limit(1).toList();

    TraversalMetrics traversalMetrics = tinkerGraph.traversal()
        .V(vertices.get(0), vertices.get(1), vertices.get(2))
        .where(__.out("connects").count().is(P.gte(1)))
        .limit(1)
        .profile().next();

    Long filterTraversalCount = 
traversalMetrics.getMetrics().stream().filter(metrics -> 
metrics.getName().startsWith(TraversalFilterStep.class.getSimpleName()))
        .findFirst().get().getCount(TraversalMetrics.TRAVERSER_COUNT_ID);

    Assertions.assertEquals(1, filterTraversalCount);
} {code}
In the above test the filter first checks the first provided vertex (which runs 
some potentially expensive check) then when limit is satisfied the check is 
executed again for the second provided vertex (even so it's unnecessary). The 
3rd vertex is not evaluated.

Thus, basically, we execute this computation always for 1 more unnecessary 
vertex.

The above assertEquals will throw:

 
{code:java}
org.opentest4j.AssertionFailedError: 
Expected :1
Actual   :2 {code}
 

 


was (Author: porunov):
This bug still exists in TinkerPop (checked version 3.6.2). An example test 
which shows when this behavior is not expected is below:
{code:java}
@Test
public void testLimitedFilterNotChecksElementsOverTheLimit(){
    TinkerGraph tinkerGraph = TinkerGraph.open();
    List<Vertex> vertices = new ArrayList<>();
    for(int i=0; i<3; i++){
        Vertex v1 = tinkerGraph.addVertex();
        Vertex v2 = tinkerGraph.addVertex();
        v1.addEdge("connects", v2);
        vertices.add(v1);
    }
    tinkerGraph.traversal().V(vertices.get(0), vertices.get(1), vertices.get(2))
        .where(__.out("connects").count().is(P.gte(1))).limit(1).toList();

    TraversalMetrics traversalMetrics = tinkerGraph.traversal()
        .V(vertices.get(0), vertices.get(1), vertices.get(2))
        .where(__.out("connects").count().is(P.gte(1)))
        .limit(1)
        .profile().next();

    Long filterTraversalCount = 
traversalMetrics.getMetrics().stream().filter(metrics -> 
metrics.getName().startsWith(TraversalFilterStep.class.getSimpleName()))
        .findFirst().get().getCount(TraversalMetrics.TRAVERSER_COUNT_ID);

    Assertions.assertEquals(1, filterTraversalCount);
} {code}
In the above test the filter first checks the first provided vertex (which runs 
some potentially expensive check) then when limit is satisfied the check is 
executed again for the second provided vertex (even so it's unnecessary). The 
3rd vertex is not evaluated.

Thus, basically, we execute this computation always for 1 more unnecessary 
vertex.

> RangeGlobalStep touches next traverser when high limit is already hit
> ---------------------------------------------------------------------
>
>                 Key: TINKERPOP-2490
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-2490
>             Project: TinkerPop
>          Issue Type: Bug
>          Components: process
>    Affects Versions: 3.4.8
>            Reporter: Guo Junshi
>            Priority: Major
>
> In FilterStep, the processNextStart() method will first retrieve next 
> traverser and then apply filtering logic. But for RangleGlobalStep, if high 
> limit is already hit, there will be no need to get next traverser.
> {code:java}
> @Override
> protected Traverser.Admin<S> processNextStart() {
>     while (true) {
>         final Traverser.Admin<S> traverser = this.starts.next();
>         if (this.filter(traverser))
>             return traverser;
>     }
> }
> {code}
> An example would be limit step: g.V().limit(1). This query will touch 2 
> vertices although only 1 vertex will be returned.
> This extra data loading will cause performance defects if DB data loading is 
> involved. It is not a functionality bug, but for better performance, we'd 
> better check high range limit first before touching next traversal.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to