[
https://issues.apache.org/jira/browse/TRAFODION-3031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16444959#comment-16444959
]
ASF GitHub Bot commented on TRAFODION-3031:
-------------------------------------------
GitHub user DaveBirdsall opened a pull request:
https://github.com/apache/trafodion/pull/1530
[TRAFODION-3031] Fix two issues with nested subquery plans
See the JIRA for a discussion of the problem.
This set of changes contains two fixes:
1. The heuristics that decide whether to do the semi-join to join +
group-by have been modified. First, if the group by would reduce the number of
rows in the inner table by a factor of 5.0 or more, the transformation will be
enabled. The rationale is that this may reduce the size of the inner hash table
if hash join is chosen. Also it allows the join to be commuted, which may lead
to a better plan. Second, if the inner table has only a small number of rows
(less than 100.0), the transformation will be enabled. The rationale here is
that it allows the join to be commuted which may lead to an efficient nested
join plan. Both of the constants here are controlled by new CQDs. (Note: Thanks
go to @sureshsubbiah for the first of these ideas and the code to do it.)
2. There was a bug in Scan::addIndexInfo, where we would sometimes overlook
a useful index. We would mistakenly think that another index provided the same
or better information when in fact one index would satisfy a join predicate
that the other would not. This bug has been fixed.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/DaveBirdsall/trafodion NestedSubqueryProblem
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/trafodion/pull/1530.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #1530
----
commit ee91b337ae1d23ad2b15034602e06c4d3bf1103d
Author: Dave Birdsall <dbirdsall@...>
Date: 2018-04-19T22:59:16Z
[TRAFODION-3031] Fix two issues with nested subquery plans
----
> Query with nested subqueries chooses bad plan
> ---------------------------------------------
>
> Key: TRAFODION-3031
> URL: https://issues.apache.org/jira/browse/TRAFODION-3031
> Project: Apache Trafodion
> Issue Type: Improvement
> Components: sql-cmp
> Affects Versions: 2.3
> Reporter: David Wayne Birdsall
> Assignee: David Wayne Birdsall
> Priority: Major
> Attachments: jira.log, jira.sql
>
>
> The attached files demonstrate the problem. The file jira.sql is a script
> that reproduces the problem, while jira.log is a sqlci showing the results.
> The query in question does an IN-subquery from T1 to T2, then T2 has an
> =-subquery back to T1. T2 contains two indexes, one each on the join columns.
> The default plan uses a hybrid hash join of T1 to T2 and is very slow. It
> does a full scan of both T1 and T2.
> If we set CQD SEMIJOIN_TO_INNERJOIN_TRANSFORMATION 'ON', the plan is a little
> bit better. We get a nested join of T1 to T2. But it is inefficient; we still
> do a full scan of T2.
> If we rename the index T2A to T2Y, and we still have the CQD set, we get a
> good nested join plan that uses the index T2Y and reads just one row at each
> level. This is very fast.
> So, there are two issues here.
> # We could do a better job of deciding when to do the semi-join to join
> transformation. When the inner table is small, it is profitable to do this.
> # The index elimination logic is mistakenly eliminating index T2A so the
> Optimizer misses a chance to use it and so does not find the efficient nested
> join plan.[^jira.log]
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)