[ 
https://issues.apache.org/jira/browse/TINKERPOP-2932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17715886#comment-17715886
 ] 

Martin Häusler commented on TINKERPOP-2932:
-------------------------------------------

Upon further inspection (and I'm leaning quite far out of the window here, 
please correct me if I'm wrong) this optimization step inserts a {{NoneStep}} 
under specific circumstances and then drops other steps. I believe that the 
original intention of the code was that a {{.limit(0)}} should *always* be 
replaced with a {{NoneStep}} (followed by step removal), but the implementation 
has a bug which causes these actions to be carried out if and only if multiple 
{{.limit(0)}} are contained in the query. Since that is rarely ever the case in 
practice, the actual optimization carried out by this strategy was never 
executed, which is why nobody encountered any issues with it. It was just a 
no-op in the vast majority of cases.

Instead of attempting to move a {{.limit(0)}} as far as possible to the 
beginning of the query and removing the subsequent steps as part of 
"housekeeping" (risking to change query semantics in the process, as 
demonstrated by this ticket) wouldn't it be smarter to move a {{.limit(0)}} as 
far as possible towards the *end* of the traversal steps? This way, the 
{{hasNext()}} of the {{NoneStep}} would trigger earlier, which means all the 
work for the prior steps is never carried out, so it doesn't matter if they 
remain in the query or not. The only restriction is that we must not move a 
{{limit(0)}} across any step which can introduce new data ({{cap(...)}} and 
{{inject(...)}} come to mind).

> EarlyLimitStrategy drops too many steps if two .limit(0) steps exist in the 
> query
> ---------------------------------------------------------------------------------
>
>                 Key: TINKERPOP-2932
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-2932
>             Project: TinkerPop
>          Issue Type: Bug
>          Components: language
>    Affects Versions: 3.6.2
>            Reporter: Martin Häusler
>            Priority: Major
>
> Please consider the following query (it will work on any graph, content 
> doesn't matter):
>  
> {code:java}
> int result = graph.traversal().inject(0).union(
>     __.V<Vertex>()
>         .has("kind", "entity")
>         .limit(0)
>         .limit(0)
>         .fold()
>         .sideEffect(
>             
> __.unfold<Vertex>().valueMap<String>("type").with(WithOptions.tokens, 
> WithOptions.ids).aggregate("groupByType")
>         )
>         .unfold<Vertex>()
>         .aggregate("result")
>         .cap<Map<String, Object>>("groupByType", "result")
>         .map( traverser -> {
>             val map = traverser.get()
>             return map.size()
>         })
> ).next() {code}
> Please note that the result type is {{int}}. This is perfectly fine and also 
> compiles without issues. However, at runtime, this query throws a 
> *{{ClassCastException}}*, claiming that {{java.util.HashMap}} cannot be cast 
> to {{int}}.
> Upon further investigation, I've discovered that the query looks fine in the 
> optimizer:
> {code}
> [InjectStep([<Dummy>]), UnionStep([[GraphStep(vertex,[]), 
> HasStep([kind.eq(entity)]), RangeGlobalStep(0,0), RangeGlobalStep(0,0), 
> FoldStep, TraversalSideEffectStep([UnfoldStep, PropertyMapStep([type],value), 
> AggregateGlobalStep(groupByType,null)]), UnfoldStep, 
> AggregateGlobalStep(result,null), SideEffectCapStep([result, groupByType]), 
> LambdaMapStep(lambda), EndStep]])]
> {code}
> ... *until* the {{EarlyLimitStrategy}} is executed. After this strategy has 
> been executed, the query looks like this:
> {code}
> [InjectStep([<Dummy>]), UnionStep([[GraphStep(vertex,[]), 
> HasStep([kind.eq(entity)]), NoneStep, SideEffectCapStep([result, 
> groupByType])]])]
> {code}
> It is perfectly fine that the side effect was dropped, but the problem is: 
> the trailing lambda {{map}} step was also dropped. *This changes the result 
> type of the query!*
> My *suggested fix* is to stop the {{for}} loop in {{EarlyLimitStrategy}} 
> which drops steps as soon as an {{inject}} or {{cap}} step is reached. All 
> steps afterwards must not be touched by this optimization.
> A *workaround* is to only have a single {{.limit(0)}} in the query. However, 
> I'd argue that it should still work even if there are multiple {{.limit(0)}} 
> steps as it doesn't change the semantics of the query at all. The query I've 
> posted above is auto-generated by our application, that's why it may appear 
> strange.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to