[
https://issues.apache.org/jira/browse/HIVE-20331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576918#comment-16576918
]
Aihua Xu commented on HIVE-20331:
---------------------------------
GenMRRedSink3 won't get triggered in this case (it's triggered when a union
is followed by an RS operator).
Here are the plans without and with the patch. As you can see, without the
patch the union operator is incorrectly added to Stage-4.
Before:
{noformat}
STAGE DEPENDENCIES:
Stage-4 is a root stage
Stage-6 depends on stages: Stage-4
Stage-2 depends on stages: Stage-6
Stage-0 depends on stages: Stage-2
STAGE PLANS:
Stage: Stage-4
Map Reduce
Map Operator Tree:
TableScan
alias: t1
Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column
stats: NONE
Reduce Output Operator
key expressions: col1 (type: int)
sort order: +
Map-reduce partition columns: col1 (type: int)
Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column
stats: NONE
TableScan
Union
Statistics: Num rows: 10 Data size: 88 Basic stats: COMPLETE
Column stats: PARTIAL
File Output Operator
compressed: false
Statistics: Num rows: 10 Data size: 88 Basic stats: COMPLETE
Column stats: PARTIAL
table:
input format:
org.apache.hadoop.mapred.SequenceFileInputFormat
output format:
org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
Reduce Operator Tree:
Select Operator
expressions: KEY.reducesinkkey0 (type: int)
outputColumnNames: _col0
Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column
stats: NONE
PTF Operator
Function definitions:
Input definition
input alias: ptf_0
output shape: _col0: int
type: WINDOWING
Windowing table definition
input alias: ptf_1
name: windowingtablefunction
order by: _col0 ASC NULLS FIRST
partition by: _col0
raw input shape:
window functions:
window function definition
alias: Row_Number_window_0
name: Row_Number
window function: GenericUDAFRowNumberEvaluator
window frame: ROWS PRECEDING(MAX)~FOLLOWING(MAX)
isPivotResult: true
Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column
stats: NONE
Select Operator
Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column
stats: NONE
Lateral View Forward
Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE
Column stats: NONE
Select Operator
Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE
Column stats: NONE
Lateral View Join Operator
outputColumnNames: _col1, _col2
Statistics: Num rows: 4 Data size: 4 Basic stats: COMPLETE
Column stats: NONE
Select Operator
Statistics: Num rows: 4 Data size: 4 Basic stats:
COMPLETE Column stats: NONE
File Output Operator
compressed: false
table:
input format:
org.apache.hadoop.mapred.SequenceFileInputFormat
output format:
org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
serde:
org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
Select Operator
expressions: map(10:1) (type: map<int,int>)
outputColumnNames: _col0
Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE
Column stats: NONE
UDTF Operator
Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE
Column stats: NONE
function name: explode
Lateral View Join Operator
outputColumnNames: _col1, _col2
Statistics: Num rows: 4 Data size: 4 Basic stats:
COMPLETE Column stats: NONE
Select Operator
Statistics: Num rows: 4 Data size: 4 Basic stats:
COMPLETE Column stats: NONE
File Output Operator
compressed: false
table:
input format:
org.apache.hadoop.mapred.SequenceFileInputFormat
output format:
org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
serde:
org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
Stage: Stage-6
Map Reduce Local Work
Alias -> Map Local Tables:
_u1-subquery2:x1:t1
Fetch Operator
limit: -1
Alias -> Map Local Operator Tree:
_u1-subquery2:x1:t1
TableScan
alias: t1
Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column
stats: COMPLETE
Select Operator
Statistics: Num rows: 2 Data size: 16 Basic stats: COMPLETE
Column stats: COMPLETE
HashTable Sink Operator
keys:
0
1
Stage: Stage-2
Map Reduce
Map Operator Tree:
TableScan
alias: t1
Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column
stats: COMPLETE
Select Operator
expressions: 1 (type: int)
outputColumnNames: _col0
Statistics: Num rows: 2 Data size: 8 Basic stats: COMPLETE Column
stats: COMPLETE
Union
Statistics: Num rows: 10 Data size: 88 Basic stats: COMPLETE
Column stats: PARTIAL
File Output Operator
compressed: false
Statistics: Num rows: 10 Data size: 88 Basic stats: COMPLETE
Column stats: PARTIAL
table:
input format:
org.apache.hadoop.mapred.SequenceFileInputFormat
output format:
org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
TableScan
Map Join Operator
condition map:
Inner Join 0 to 1
keys:
0
1
Statistics: Num rows: 8 Data size: 80 Basic stats: COMPLETE
Column stats: NONE
Select Operator
expressions: 2 (type: int)
outputColumnNames: _col0
Statistics: Num rows: 8 Data size: 80 Basic stats: COMPLETE
Column stats: NONE
Union
Statistics: Num rows: 10 Data size: 88 Basic stats: COMPLETE
Column stats: PARTIAL
File Output Operator
compressed: false
Statistics: Num rows: 10 Data size: 88 Basic stats:
COMPLETE Column stats: PARTIAL
table:
input format:
org.apache.hadoop.mapred.SequenceFileInputFormat
output format:
org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
serde:
org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
Local Work:
Map Reduce Local Work
Stage: Stage-0
Fetch Operator
limit: -1
Processor Tree:
ListSink
{noformat}
After:
{noformat}
STAGE DEPENDENCIES:
Stage-4 is a root stage
Stage-6 depends on stages: Stage-4
Stage-2 depends on stages: Stage-6
Stage-0 depends on stages: Stage-2
STAGE PLANS:
Stage: Stage-4
Map Reduce
Map Operator Tree:
TableScan
alias: t1
Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column
stats: NONE
Reduce Output Operator
key expressions: col1 (type: int)
sort order: +
Map-reduce partition columns: col1 (type: int)
Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column
stats: NONE
Execution mode: vectorized
Reduce Operator Tree:
Select Operator
expressions: KEY.reducesinkkey0 (type: int)
outputColumnNames: _col0
Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column
stats: NONE
PTF Operator
Function definitions:
Input definition
input alias: ptf_0
output shape: _col0: int
type: WINDOWING
Windowing table definition
input alias: ptf_1
name: windowingtablefunction
order by: _col0 ASC NULLS FIRST
partition by: _col0
raw input shape:
window functions:
window function definition
alias: Row_Number_window_0
name: Row_Number
window function: GenericUDAFRowNumberEvaluator
window frame: ROWS PRECEDING(MAX)~FOLLOWING(MAX)
isPivotResult: true
Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column
stats: NONE
Select Operator
Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column
stats: NONE
Lateral View Forward
Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE
Column stats: NONE
Select Operator
Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE
Column stats: NONE
Lateral View Join Operator
outputColumnNames: _col1, _col2
Statistics: Num rows: 4 Data size: 4 Basic stats: COMPLETE
Column stats: NONE
Select Operator
Statistics: Num rows: 4 Data size: 4 Basic stats:
COMPLETE Column stats: NONE
File Output Operator
compressed: false
table:
input format:
org.apache.hadoop.mapred.SequenceFileInputFormat
output format:
org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
serde:
org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
Select Operator
expressions: map(10:1) (type: map<int,int>)
outputColumnNames: _col0
Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE
Column stats: NONE
UDTF Operator
Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE
Column stats: NONE
function name: explode
Lateral View Join Operator
outputColumnNames: _col1, _col2
Statistics: Num rows: 4 Data size: 4 Basic stats:
COMPLETE Column stats: NONE
Select Operator
Statistics: Num rows: 4 Data size: 4 Basic stats:
COMPLETE Column stats: NONE
File Output Operator
compressed: false
table:
input format:
org.apache.hadoop.mapred.SequenceFileInputFormat
output format:
org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
serde:
org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
Stage: Stage-6
Map Reduce Local Work
Alias -> Map Local Tables:
_u1-subquery2:x1:t1
Fetch Operator
limit: -1
Alias -> Map Local Operator Tree:
_u1-subquery2:x1:t1
TableScan
alias: t1
Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column
stats: COMPLETE
Select Operator
Statistics: Num rows: 2 Data size: 16 Basic stats: COMPLETE
Column stats: COMPLETE
HashTable Sink Operator
keys:
0
1
Stage: Stage-2
Map Reduce
Map Operator Tree:
TableScan
alias: t1
Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column
stats: COMPLETE
Select Operator
expressions: 1 (type: int)
outputColumnNames: _col0
Statistics: Num rows: 2 Data size: 8 Basic stats: COMPLETE Column
stats: COMPLETE
Union
Statistics: Num rows: 10 Data size: 88 Basic stats: COMPLETE
Column stats: PARTIAL
File Output Operator
compressed: false
Statistics: Num rows: 10 Data size: 88 Basic stats: COMPLETE
Column stats: PARTIAL
table:
input format:
org.apache.hadoop.mapred.SequenceFileInputFormat
output format:
org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
TableScan
Map Join Operator
condition map:
Inner Join 0 to 1
keys:
0
1
Statistics: Num rows: 8 Data size: 80 Basic stats: COMPLETE
Column stats: NONE
Select Operator
expressions: 2 (type: int)
outputColumnNames: _col0
Statistics: Num rows: 8 Data size: 80 Basic stats: COMPLETE
Column stats: NONE
Union
Statistics: Num rows: 10 Data size: 88 Basic stats: COMPLETE
Column stats: PARTIAL
File Output Operator
compressed: false
Statistics: Num rows: 10 Data size: 88 Basic stats:
COMPLETE Column stats: PARTIAL
table:
input format:
org.apache.hadoop.mapred.SequenceFileInputFormat
output format:
org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
serde:
org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
Local Work:
Map Reduce Local Work
Stage: Stage-0
Fetch Operator
limit: -1
Processor Tree:
ListSink
{noformat}
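To make the difference between the two plans easier to spot, here is a minimal sketch (not part of the patch) that lists which stages of an EXPLAIN dump contain a given operator. The helper name `stages_with_operator` and the abbreviated `BEFORE`/`AFTER` excerpts are illustrative assumptions; the excerpts keep only the stage headers and operator names from the plans above.

```python
import re

# Matches stage headers in the STAGE PLANS section, e.g. "Stage: Stage-4".
# Lines in STAGE DEPENDENCIES ("Stage-4 is a root stage") do not match.
STAGE_RE = re.compile(r"^\s*Stage:\s*(Stage-\d+)\s*$")


def stages_with_operator(plan_text, operator="Union"):
    """Return, in plan order, the stages whose plan lists `operator` on its own line."""
    found = []
    current = None
    for line in plan_text.splitlines():
        match = STAGE_RE.match(line)
        if match:
            current = match.group(1)
            continue
        if current is not None and line.strip() == operator and current not in found:
            found.append(current)
    return found


# Abbreviated excerpts of the plans above (assumption: trimmed by hand).
BEFORE = """\
STAGE PLANS:
Stage: Stage-4
Map Reduce
Map Operator Tree:
TableScan
alias: t1
Reduce Output Operator
TableScan
Union
File Output Operator
Stage: Stage-2
Map Reduce
Map Operator Tree:
TableScan
alias: t1
Select Operator
Union
File Output Operator
"""

AFTER = """\
STAGE PLANS:
Stage: Stage-4
Map Reduce
Map Operator Tree:
TableScan
alias: t1
Reduce Output Operator
Stage: Stage-2
Map Reduce
Map Operator Tree:
TableScan
alias: t1
Select Operator
Union
File Output Operator
"""

print("before:", stages_with_operator(BEFORE))
print("after: ", stages_with_operator(AFTER))
```

On these excerpts the Union shows up in both Stage-4 and Stage-2 before the patch, and only in Stage-2 after it.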
> Query with union all, lateral view and Join fails with "cannot find parent in
> the child operator"
> -------------------------------------------------------------------------------------------------
>
> Key: HIVE-20331
> URL: https://issues.apache.org/jira/browse/HIVE-20331
> Project: Hive
> Issue Type: Bug
> Components: Physical Optimizer
> Affects Versions: 2.1.1
> Reporter: Aihua Xu
> Assignee: Aihua Xu
> Priority: Major
> Attachments: HIVE-20331.1.patch
>
>
> The following query with Union, Lateral view and Join will fail during
> execution with the exception below.
> {noformat}
> create table t1(col1 int);
> SELECT 1 AS `col1`
> FROM t1
> UNION ALL
> SELECT 2 AS `col1`
> FROM
> (SELECT col1
> FROM t1
> ) x1
> JOIN
> (SELECT col1
> FROM
> (SELECT
> Row_Number() over (PARTITION BY col1 ORDER BY col1) AS `col1`
> FROM t1
> ) x2 lateral VIEW explode(map(10,1))`mapObj` AS `col2`, `col3`
> ) `expdObj`
> {noformat}
> {noformat}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive internal
> error: cannot find parent in the child operator!
> at
> org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362)
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at
> org.apache.hadoop.hive.ql.exec.MapOperator.initializeMapOperator(MapOperator.java:509)
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:116)
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> {noformat}
> After debugging, it seems the issue is in the GenMRFileSink1 class, where we
> set an incorrect aliasToWork on the MapWork.