Taewoo Kim closed ASTERIXDB-1246.
    Resolution: Fixed

> Unnecessary decor variables of a group-by are not removed until 
> PushProjectDownRule is fired.
> ---------------------------------------------------------------------------------------------
>                 Key: ASTERIXDB-1246
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1246
>             Project: Apache AsterixDB
>          Issue Type: Bug
>            Reporter: Taewoo Kim
>            Assignee: Taewoo Kim
> Unnecessary decor variables of a group-by are not removed until 
> PushProjectDownRule is fired.
> Currently, group-by for a subplan is introduced when 
> IntroduceGroupByForSubplanRule is fired. At this time, decor variables for 
> the new group-by operator are also added based on the variable usage after 
> the new group-by operator.
> After this rule, other optimizations might make decor variables unnecessary. 
> One example is that an assign after group-by can be moved before the group-by 
> operator so that a record variable (e.g., $$0) that is required for the given 
> assign does not need to be passed through the group-by operator. These 
> unnecessary decor variables will be removed only when PushProjectDownRule is 
> fired. 
> As the rule name suggests, PushProjectDownRule is fired only when 
> there is a project operator in the plan. Currently in my branch (index-only 
> plan branch), this affects IntroduceSelectAccessMethodRule, which 
> transforms a plan into an index-utilizing plan. This rule checks 
> whether the given plan is an index-only plan by examining the variables used 
> after a SELECT operator. If only the secondary key and/or primary key are 
> used, then the given plan is an index-only plan and we can use a 
> secondary-index search to return SK and PK.
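
The index-only check described above can be sketched as a simple subset test. This is only an illustration in Python, not the AsterixDB implementation: the function name, its parameters, and the variable names are hypothetical, and the real rule works on the optimizer's operator tree rather than on plain sets.

```python
def is_index_only_plan(vars_used_after_select, secondary_keys, primary_keys):
    """Return True when every variable used after the SELECT operator
    is a secondary-key or primary-key variable (hypothetical helper)."""
    covered = set(secondary_keys) | set(primary_keys)
    return set(vars_used_after_select) <= covered


# Hypothetical variable names, for illustration only.
sk = {"$countA"}
pk = {"$tweetid"}

# Only SK and PK are used after the SELECT: index-only.
only_keys = is_index_only_plan({"$countA", "$tweetid"}, sk, pk)

# A leftover record variable ($$0) is still referenced, for example as an
# unnecessary decor variable, so the plan is not recognized as index-only.
with_record = is_index_only_plan({"$countA", "$tweetid", "$$0"}, sk, pk)
```

Under these assumptions, a single stale reference to the record variable is enough to make the check fail, which is the effect the rest of this description is about.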
> The issue is that IntroduceSelectAccessMethodRule is fired before 
> PushProjectDownRule, and generally no project operator has been introduced 
> into the plan before IntroduceSelectAccessMethodRule. So these unnecessary 
> decor variables are not used, yet they still sit in the plan, so the 
> optimizer wrongly classifies the given plan as a non-index-only plan. The 
> following is an example query. If we have a secondary index on countA 
> (PK: tweetid), then the outer branch should qualify as an index-only plan. 
> In fact, it does not, because of unnecessary decor variables that 
> still sit in the plan after some optimizations.
> for $t1 in dataset('TweetMessages')
> where $t1.countA > 0
> return {
> "tweetid1": $t1.tweetid,
> "count1":$t1.countA,
> "t2info": for $t2 in dataset('TweetMessages')
>                         where $t1.countA /* +indexnl */= $t2.tweetid
>                         return {"tweetid2": $t2.tweetid,
>                                 "count2": $t2.countB}
> }
> We can split PushProjectDownRule into two rules: one that pushes projects 
> down and one that cleans up decor variables.
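
A stand-alone decor-cleaning pass of the kind proposed above might look roughly like the following. This is a minimal Python sketch under stated assumptions, not the actual fix: the `GroupBy` class and `clean_decor_variables` function are hypothetical stand-ins for the optimizer's group-by operator and the new rule, and the set of variables used above the group-by is assumed to be computed elsewhere.

```python
from dataclasses import dataclass


@dataclass
class GroupBy:
    """Hypothetical, simplified stand-in for a group-by operator."""
    group_vars: list
    decor_vars: list


def clean_decor_variables(group_by, vars_used_above):
    """Drop decor variables that no operator above the group-by uses.

    Returns True if the operator changed, mirroring how a rewrite rule
    typically reports whether it fired.
    """
    kept = [v for v in group_by.decor_vars if v in vars_used_above]
    changed = len(kept) != len(group_by.decor_vars)
    group_by.decor_vars = kept
    return changed


# The record variable $$0 was added as a decor variable, but after an assign
# is moved below the group-by, nothing above the group-by reads it anymore.
gb = GroupBy(group_vars=["$tweetid"], decor_vars=["$countA", "$$0"])
fired = clean_decor_variables(gb, vars_used_above={"$tweetid", "$countA"})
```

Because this pass does not depend on a project operator being present, it could in principle run before IntroduceSelectAccessMethodRule, so that the index-only check no longer sees the stale variable.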

This message was sent by Atlassian JIRA