[
https://issues.apache.org/jira/browse/HIVE-26779?focusedWorklogId=859122&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-859122
]
ASF GitHub Bot logged work on HIVE-26779:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 26/Apr/23 09:54
Start Date: 26/Apr/23 09:54
Worklog Time Spent: 10m
Work Description: kasakrisz opened a new pull request, #4272:
URL: https://github.com/apache/hive/pull/4272
### What changes were proposed in this pull request?
`GenTezUtils.removeUnionOperators` tries to remove union operators from the
union branches of the plan. I assume this is done because each branch has its
own Map vertex and the Union vertex takes the union of their results.
Consider the following query:
```
SELECT NULL AS first_login_did
FROM tez_test_t5
LATERAL VIEW explode(split('0,6', ',')) gaps AS ads_h5_gap
UNION ALL
SELECT null as first_login_did
FROM tez_test_t1
UNION ALL
SELECT did AS first_login_did
FROM tez_test_t2
;
```
The plan before `removeUnionOperators`:
```
TS[0]-LVF[1]-SEL[2]        -LVJ[5]-SEL[6]-UNION[9]-SEL[12]-UNION[13]-FS[15]
            -SEL[3]-UDTF[4]-LVJ[5]
TS[7]-SEL[8]                             -UNION[9]
TS[10]-SEL[11]                                            -UNION[13]
```
The branch with the lateral view has two sub-branches from LVF (Lateral View
Forward operator) to LVJ (Lateral View Join operator).
`removeUnionOperators` traverses the branches and, because UNION[9] is
reachable through both sub-branches, tries to remove it twice. The second
attempt fails with the exception mentioned in the Jira.
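The double removal can be reproduced on a toy model of the first branch. The sketch below is written in Python for brevity and is not Hive code: `Op`, `connect`, `remove_child_and_adopt`, `visit_order` and `remove_unions` are hypothetical names, only the lateral-view branch is modelled (so each union has a single parent), and the traversal is simplified to a walk that follows every edge. The `skip_seen` flag illustrates one possible way to avoid the second removal; it is an assumption and not necessarily what this PR does.

```python
class Op:
    """Toy stand-in for a plan operator (hypothetical, not Hive's Operator)."""

    def __init__(self, name):
        self.name = name
        self.parents = []
        self.children = []


def connect(parent, child):
    parent.children.append(child)
    child.parents.append(parent)


def build_plan():
    """Build only the lateral-view branch of the plan shown above."""
    names = ["TS0", "LVF1", "SEL2", "SEL3", "UDTF4", "LVJ5",
             "SEL6", "UNION9", "SEL12", "UNION13", "FS15"]
    ops = {name: Op(name) for name in names}
    edges = [("TS0", "LVF1"), ("LVF1", "SEL2"), ("LVF1", "SEL3"),
             ("SEL2", "LVJ5"), ("SEL3", "UDTF4"), ("UDTF4", "LVJ5"),
             ("LVJ5", "SEL6"), ("SEL6", "UNION9"), ("UNION9", "SEL12"),
             ("SEL12", "UNION13"), ("UNION13", "FS15")]
    for parent, child in edges:
        connect(ops[parent], ops[child])
    return ops


def remove_child_and_adopt(parent, child):
    """Splice `child` out of the plan; `parent` adopts the child's children."""
    if child not in parent.children:
        raise LookupError("fail to find child from parent")
    idx = parent.children.index(child)
    parent.children[idx:idx + 1] = child.children
    for grandchild in child.children:
        grandchild.parents[grandchild.parents.index(child)] = parent


def visit_order(root):
    """Follow every edge from `root`; an operator reachable through two
    paths (everything downstream of LVJ5) is listed twice."""
    order, stack = [], [root]
    while stack:
        op = stack.pop()
        order.append(op)
        stack.extend(reversed(op.children))
    return order


def remove_unions(root, skip_seen):
    """Remove every UNION operator found while walking from `root`.
    With skip_seen=True each operator is handled at most once."""
    seen, removed = set(), []
    for op in visit_order(root):
        if skip_seen:
            if op.name in seen:
                continue
            seen.add(op.name)
        if op.name.startswith("UNION"):
            remove_child_and_adopt(op.parents[0], op)
            removed.append(op.name)
    return removed


if __name__ == "__main__":
    try:
        remove_unions(build_plan()["TS0"], skip_seen=False)
    except LookupError as err:
        print("naive walk failed:", err)
    print("deduplicated walk removed:",
          remove_unions(build_plan()["TS0"], skip_seen=True))
```

In this model the naive walk reaches UNION9 once through SEL[2] and once through SEL[3]-UDTF[4]. The first visit splices it out of SEL6, so the second visit no longer finds it among SEL6's children and raises the error. With `skip_seen=True` each union is removed exactly once.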
### Why are the changes needed?
To fix the issue described above.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
```
mvn test -Dtest.output.overwrite -DskipSparkTests \
  -Dtest=TestMiniLlapLocalCliDriver -Dqfile=lateral_view_unionall.q \
  -pl itests/qtest -Pitests
```
Issue Time Tracking
-------------------
Worklog Id: (was: 859122)
Remaining Estimate: 0h
Time Spent: 10m
> UNION ALL throws SemanticException when trying to remove partition
> predicates: fail to find child from parent
> -------------------------------------------------------------------------------------------------------------
>
> Key: HIVE-26779
> URL: https://issues.apache.org/jira/browse/HIVE-26779
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 4.0.0-alpha-2
> Reporter: Zhizhen Hou
> Assignee: Zhizhen Hou
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> {quote}Reproduce sql:
> drop table if exists tez_test_t1;
> create table tez_test_t1(md_exper string);
> insert into tez_test_t1 values('tez_test_t1-md_expr');
> drop table if exists tez_test_t5;
> create table tez_test_t5(md_exper string, did string);
> insert into tez_test_t5 values('tez_test_t5-md_expr','tez_test_t5-did');
> drop table if exists tez_test_t2;
> create table tez_test_t2(did string);
> insert into tez_test_t2 values('tez_test_t2-did');
> SELECT md_exper,null as ads_h5_gap , null as first_login_did, null as
> inclick_did
> FROM tez_test_t1
> UNION ALL
> SELECT md_exper, ads_h5_gap ,
> NULL AS first_login_did,did AS inclick_did
> FROM tez_test_t5
> LATERAL VIEW explode(split('0,6', ',')) gaps AS ads_h5_gap
> UNION ALL
> SELECT '' AS md_exper,'0,6' as ads_h5_gap ,
> did AS first_login_did, NULL AS inclick_did
> FROM tez_test_t2
> GROUP BY did;
> {quote}
>
> StackTrace
> 2022-11-27T09:31:06,801 ERROR [21d35a7f-9625-46ae-9c3d-13ca925f55cb main]:
> ql.Driver (:()) - FAILED: SemanticException Exception when trying to remove
> partition predicates: fail to find child from parent
> org.apache.hadoop.hive.ql.parse.SemanticException: Exception when trying to
> remove partition predicates: fail to find child from parent
> at
> org.apache.hadoop.hive.ql.exec.Operator.removeChildAndAdoptItsChildren(Operator.java:859)
> at
> org.apache.hadoop.hive.ql.parse.GenTezUtils.removeUnionOperators(GenTezUtils.java:348)
> at
> org.apache.hadoop.hive.ql.parse.TezCompiler.generateTaskTree(TezCompiler.java:573)
> at
> org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:241)
> at
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12333)
> at
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:330)
> at
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:286)