[
https://issues.apache.org/jira/browse/HIVE-29164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Work on HIVE-29164 started by Thomas Rebele.
--------------------------------------------
> AntiJoin rule fails to apply with an extra constant in the SELECT clause
> ------------------------------------------------------------------------
>
> Key: HIVE-29164
> URL: https://issues.apache.org/jira/browse/HIVE-29164
> Project: Hive
> Issue Type: Bug
> Affects Versions: 4.1.0
> Reporter: Thomas Rebele
> Assignee: Thomas Rebele
> Priority: Major
> Labels: pull-request-available
>
> Adding a constant to the SELECT clause prevents Hive from applying the
> HiveAntiSemiJoinRule.
> Example:
> {code:java}
> CREATE EXTERNAL TABLE tab(fieldA string, fieldB string);
> explain cbo select t1.fieldA
> from tab t1
> left join tab t2
> on t2.fieldA=t1.fieldA
> where t2.fieldA is null;
> {code}
> results in a plan
> {code:java}
> HiveAntiJoin(condition=[=($1, $0)], joinType=[anti])
> HiveProject(fielda=[$0])
> HiveTableScan(table=[[default, tab]], table:alias=[t1])
> HiveProject(fielda=[$0])
> HiveFilter(condition=[IS NOT NULL($0)])
> HiveTableScan(table=[[default, tab]], table:alias=[t2])
> {code}
> However, including a constant column "abc" in the result
> {code:java}
> CREATE EXTERNAL TABLE tab(fieldA string, fieldB string);
> explain cbo select t1.fieldA, "abc"
> from tab t1
> left join tab t2
> on t2.fieldA=t1.fieldA
> where t2.fieldA is null;
> {code}
> leads to a plan without an anti-join:
> {code:java}
> HiveProject(fielda=[$0], _o__c1=[_UTF-16LE'abc':VARCHAR(2147483647) CHARACTER
> SET "UTF-16LE"])
> HiveFilter(condition=[IS NULL($1)])
> HiveJoin(condition=[=($1, $0)], joinType=[left], algorithm=[none],
> cost=[not available])
> HiveProject(fielda=[$0])
> HiveTableScan(table=[[default, tab]], table:alias=[t1])
> HiveProject(fielda=[$0])
> HiveFilter(condition=[IS NOT NULL($0)])
> HiveTableScan(table=[[default, tab]], table:alias=[t2])
> {code}
> The cause is a bug in
> org.apache.hadoop.hive.ql.optimizer.calcite.HiveCalciteUtil#hasAnyExpressionFromRightSide.
> The condition {{rightBitmap.contains(inputBits)}} returns true if no field
> from the RHS of the join is used at all.
> Thanks to [~Dayakar] and [~krisztiankasa] for finding and discussing the
> issue.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)