[ 
https://issues.apache.org/jira/browse/TRAFODION-3031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16444984#comment-16444984
 ] 

ASF GitHub Bot commented on TRAFODION-3031:
-------------------------------------------

Github user zellerh commented on a diff in the pull request:

    https://github.com/apache/trafodion/pull/1530#discussion_r182911028
  
    --- Diff: core/sql/optimizer/NormRelExpr.cpp ---
    @@ -2767,25 +2767,26 @@ Here t2.a is a unique key of table t2.
     The following transformation is made
              Semi Join {pred : t1.b = t2.a}          Join {pred : t1.b = t2.a} 
             /         \                   ------->  /    \
    -      /             \                         /        \
    -Scan t1     Scan t2                 Scan t1     Scan t2
    +       /           \                           /      \
    + Scan t1        Scan t2                   Scan t1     Scan t2
                                                     
     
                                                
     b) If the right child is not unique in the joining column then 
     we transform the semijoin into an inner join followed by a groupby
     as the join's right child. This transformation is enabled by default
    -only if the right side is an IN list, otherwise a CQD has to be used.
    +only if the right side is an IN list or if the groupby's reduction 
    +ratio is greater than 5.0, otherwise a CQD has to be used.
     
     select t1.a
     from t1
     where t1.b in (1,2,3,4,...,101) ;
     
     
    -  Semi Join {pred : t1.b = t2.a}          Join {pred : t1.b = InList.col} 
    +  Semi Join {pred : t1.b = InList.col}  Join {pred : t1.b = InList.col}
      /         \                   ------->  /    \
     /           \                           /      \
    -Scan t1     Scan t2                 Scan t1     GroupBy {group cols: 
InList.col}
    +Scan t1   TupleList                 Scan t1   GroupBy {group cols: 
InList.col}
                                                       |
    --- End diff --
    
    Nice to make the picture consistent, but from the code it looks like we do 
this for things other than TupleList, so maybe "Scan t2" or "Q2" would be a 
better name for the child?


> Query with nested subqueries chooses bad plan
> ---------------------------------------------
>
>                 Key: TRAFODION-3031
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-3031
>             Project: Apache Trafodion
>          Issue Type: Improvement
>          Components: sql-cmp
>    Affects Versions: 2.3
>            Reporter: David Wayne Birdsall
>            Assignee: David Wayne Birdsall
>            Priority: Major
>         Attachments: jira.log, jira.sql
>
>
> The attached files demonstrate the problem. The file jira.sql is a script 
> that reproduces the problem, while jira.log is a sqlci showing the results.
> The query in question does an IN-subquery from T1 to T2, then T2 has an 
> =-subquery back to T1. T2 contains two indexes, one each on the join columns.
> The default plan uses a hybrid hash join of T1 to T2 and is very slow. It 
> does a full scan of both T1 and T2.
> If we set CQD SEMIJOIN_TO_INNERJOIN_TRANSFORMATION 'ON', the plan is a little 
> bit better. We get a nested join of T1 to T2. But it is inefficient; we still 
> do a full scan of T2.
> If we rename the index T2A to T2Y, and we still have the CQD set, we get a 
> good nested join plan that uses the index T2Y and reads just one row at each 
> level. This is very fast.
> So, there are two issues here.
>  # We could do a better job of deciding when to do the semi-join to join 
> transformation. When the inner table is small, it is profitable to do this.
>  # The index elimination logic is mistakenly eliminating index T2A so the 
> Optimizer misses a chance to use it and so does not find the efficient nested 
> join plan.[^jira.log]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to