[ https://issues.apache.org/jira/browse/HIVE-26722?focusedWorklogId=826494&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-826494 ]
ASF GitHub Bot logged work on HIVE-26722:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 16/Nov/22 11:56
            Start Date: 16/Nov/22 11:56
    Worklog Time Spent: 10m
      Work Description: sonarcloud[bot] commented on PR #3748:
URL: https://github.com/apache/hive/pull/3748#issuecomment-1316886318

   Kudos, SonarCloud Quality Gate passed!
   [Quality Gate passed](https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=3748)

   [0 Bugs](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3748&resolved=false&types=BUG)
   [0 Vulnerabilities](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3748&resolved=false&types=VULNERABILITY)
   [0 Security Hotspots](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=3748&resolved=false&types=SECURITY_HOTSPOT)
   [0 Code Smells](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=3748&resolved=false&types=CODE_SMELL)

   [No Coverage information](https://sonarcloud.io/component_measures?id=apache_hive&pullRequest=3748&metric=coverage&view=list)
   [No Duplication information](https://sonarcloud.io/component_measures?id=apache_hive&pullRequest=3748&metric=duplicated_lines_density&view=list)

Issue Time Tracking
-------------------

    Worklog Id:     (was: 826494)
    Time Spent: 1h  (was: 50m)

> HiveFilterSetOpTransposeRule incorrectly prunes UNION ALL operands
> ------------------------------------------------------------------
>
>                 Key: HIVE-26722
>                 URL: https://issues.apache.org/jira/browse/HIVE-26722
>             Project: Hive
>          Issue Type: Bug
>          Components: CBO
>    Affects Versions: 4.0.0-alpha-1
>            Reporter: Alessandro Solimando
>            Assignee: Alessandro Solimando
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> h1. Reproducer
> Consider the following query:
> {code:java}
> set hive.cbo.rule.exclusion.regex=ReduceExpressionsRule\(Project\);
> CREATE EXTERNAL TABLE t (a string, b string);
> INSERT INTO t VALUES ('1000', 'b1');
> INSERT INTO t VALUES ('2000', 'b2');
> SELECT * FROM (
>   SELECT
>    a,
>    b
>   FROM t
>   UNION ALL
>   SELECT
>    a,
>    CAST(NULL AS string)
>   FROM t) AS t2
> WHERE a = 1000;
> EXPLAIN CBO
> SELECT * FROM (
>   SELECT
>    a,
>    b
>   FROM t
>   UNION ALL
>   SELECT
>    a,
>    CAST(NULL AS string)
>   FROM t) AS t2
> WHERE a = 1000; {code}
>
> The expected result is:
> {code:java}
> 1000    b1
> 1000    NULL{code}
> An example of a correct plan is as follows:
> {noformat}
> CBO PLAN:
> HiveUnion(all=[true])
>   HiveProject(a=[$0], b=[$1])
>     HiveFilter(condition=[=(CAST($0):DOUBLE, 1000)])
>       HiveTableScan(table=[[default, t]], table:alias=[t])
>   HiveProject(a=[$0], _o__c1=[null:VARCHAR(2147483647) CHARACTER SET "UTF-16LE"])
>     HiveFilter(condition=[=(CAST($0):DOUBLE, 1000)])
>       HiveTableScan(table=[[default, t]], table:alias=[t]){noformat}
>
> Consider now a scenario where expression reduction in projections is disabled
> by setting the following property:
> {noformat}
> set hive.cbo.rule.exclusion.regex=ReduceExpressionsRule\(Project\);
> {noformat}
> In this case, the simplification of _CAST(NULL)_ into _NULL_ does not happen,
> and we get the following (invalid) result:
> {code:java}
> 1000    b1{code}
> produced by the following invalid plan:
> {code:java}
> CBO PLAN:
> HiveProject(a=[$0], b=[$1])
>   HiveFilter(condition=[=(CAST($0):DOUBLE, 1000)])
>     HiveTableScan(table=[[default, t]], table:alias=[t]) {code}
> h1. Problem Analysis
> At
> [HiveFilterSetOpTransposeRule.java#L112|https://github.com/apache/hive/blob/297f510d3b581c9d4079e42caa28aa84f8486012/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveFilterSetOpTransposeRule.java#L112]
> the _RelMetadataQuery::getPulledUpPredicates_ method infers the following
> predicate due to the CAST(NULL) in the projection:
> {code:java}
> (=($1, CAST(null:NULL):VARCHAR(2147483647) CHARACTER SET "UTF-16LE")){code}
> When the CAST is simplified to the NULL literal, the IS_NULL($1) predicate is
> inferred instead.
> In
> [HiveFilterSetOpTransposeRule.java#L114-L122|https://github.com/apache/hive/blob/297f510d3b581c9d4079e42caa28aa84f8486012/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveFilterSetOpTransposeRule.java#L114-L122],
> the rule checks whether the conjunction of the predicate coming from the
> filter (here =(CAST($0):DOUBLE, 1000)) and the inferred predicates is
> satisfiable, under the _UnknownAsFalse_ semantics.
> To summarize, the following expression is simplified under the
> _UnknownAsFalse_ semantics:
> {code:java}
> AND((=($1, CAST(null:NULL):VARCHAR(2147483647) CHARACTER SET "UTF-16LE")), =(CAST($0):DOUBLE, 1000))
> {code}
> Under such semantics, (=($1, CAST(null:NULL):...) evaluates to _FALSE_,
> because no value is equal to NULL (not even NULL itself). AND(FALSE,
> =(CAST($0):DOUBLE, 1000)) therefore necessarily evaluates to _FALSE_, and
> the UNION ALL operand is pruned.
> Only by chance, when _CAST(NULL)_ is simplified to _NULL_, do we avoid the
> issue, thanks to the inferred _IS_NULL($1)_ predicate; see
> [HiveRelMdPredicates.java#L153-L156|https://github.com/apache/hive/blob/297f510d3b581c9d4079e42caa28aa84f8486012/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdPredicates.java#L153-L156]
> to understand how the NULL literal is treated differently during
> predicate inference.
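
The three-valued evaluation described above can be illustrated with a small standalone Java sketch. It does not use Calcite or Hive: the ThreeValuedLogicSketch class, the Truth enum, and the sqlEquals, and, and passesFilter helpers are hypothetical names introduced only for illustration, and the sketch reasons about a single row rather than about plan-time simplification. It models the row produced by the second UNION ALL operand (a = 1000, b = NULL) and shows what happens when the inferred equality is folded into the filter condition and UNKNOWN is collapsed to FALSE.

```java
public class ThreeValuedLogicSketch {

    /** SQL truth values: a comparison involving NULL yields UNKNOWN. */
    public enum Truth { TRUE, FALSE, UNKNOWN }

    /** SQL equality; SQL NULL is modelled as Java null (illustrative helper). */
    public static Truth sqlEquals(Object left, Object right) {
        if (left == null || right == null) {
            return Truth.UNKNOWN;
        }
        return left.equals(right) ? Truth.TRUE : Truth.FALSE;
    }

    /** SQL AND: FALSE dominates, then UNKNOWN, otherwise TRUE. */
    public static Truth and(Truth x, Truth y) {
        if (x == Truth.FALSE || y == Truth.FALSE) {
            return Truth.FALSE;
        }
        if (x == Truth.UNKNOWN || y == Truth.UNKNOWN) {
            return Truth.UNKNOWN;
        }
        return Truth.TRUE;
    }

    /** Filter semantics (UnknownAsFalse): a row passes only when the condition is TRUE. */
    public static boolean passesFilter(Truth condition) {
        return condition == Truth.TRUE;
    }

    public static void main(String[] args) {
        // Row from the second operand: a = 1000, b = CAST(NULL AS string).
        Double a = 1000d;
        String b = null;

        Truth filter = sqlEquals(a, 1000d);   // the WHERE condition: TRUE
        Truth inferred = sqlEquals(b, null);  // the inferred "b = NULL": UNKNOWN

        // Treating the inferred predicate as part of the filter rejects the row.
        System.out.println(passesFilter(and(inferred, filter))); // prints "false"
        // Testing only the filter condition keeps the row, as expected.
        System.out.println(passesFilter(filter));                // prints "true"
    }
}
```

The row satisfies a = 1000, yet combining it with the inferred equality under UnknownAsFalse makes it look as if no row can pass, which mirrors the incorrect pruning of the operand.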
> The problem lies in the fact that, depending on the input _RelNode_ that we
> infer predicates from, the semantics are not necessarily _UnknownAsFalse_;
> they might be _UnknownAsUnknown_, as for _Project_, which is the case here.
> h1. Solution
> In order to correctly simplify a predicate and test whether it is always
> false, we should build RexSimplify with _predicates_ as the list of
> predicates known to hold in the context. In this way, the different semantics
> are correctly taken into account.
> The code at
> [HiveFilterSetOpTransposeRule.java#L114-L121|https://github.com/apache/hive/blob/297f510d3b581c9d4079e42caa28aa84f8486012/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveFilterSetOpTransposeRule.java#L114-L121]
> should be replaced by the following:
> {code:java}
> final RexExecutor executor =
>     Util.first(filterRel.getCluster().getPlanner().getExecutor(), RexUtil.EXECUTOR);
> final RexSimplify simplify = new RexSimplify(rexBuilder, predicates, executor);
> final RexNode x = simplify.simplifyUnknownAs(newCondition, RexUnknownAs.FALSE);{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)