[ https://issues.apache.org/jira/browse/HIVE-25170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Zoltan Haindrich updated HIVE-25170:
------------------------------------
Fix Version/s: 4.0.0
Resolution: Fixed
Status: Resolved (was: Patch Available)
Merged into master. Thank you, [~zhangweilst]!
> Data error in constant propagation caused by wrong colExprMap generated in
> SemanticAnalyzer
> -------------------------------------------------------------------------------------------
>
> Key: HIVE-25170
> URL: https://issues.apache.org/jira/browse/HIVE-25170
> Project: Hive
> Issue Type: Bug
> Components: Query Planning
> Affects Versions: 3.1.2
> Reporter: Wei Zhang
> Assignee: Wei Zhang
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
>
> {code:java}
> SET hive.remove.orderby.in.subquery=false;
> EXPLAIN
> SELECT constant_col, key, max(value)
> FROM
> (
> SELECT 'constant' as constant_col, key, value
> FROM src
> DISTRIBUTE BY constant_col, key
> SORT BY constant_col, key, value
> ) a
> GROUP BY constant_col, key
> LIMIT 10;
> OK
> Vertex dependency in root stage
> Reducer 2 <- Map 1 (SIMPLE_EDGE)
> Reducer 3 <- Reducer 2 (SIMPLE_EDGE)
> Stage-0
> Fetch Operator
> limit:10
> Stage-1
> Reducer 3
> File Output Operator [FS_10]
> Limit [LIM_9] (rows=1 width=368)
> Number of rows:10
> Select Operator [SEL_8] (rows=1 width=368)
> Output:["_col0","_col1","_col2"]
> Group By Operator [GBY_7] (rows=1 width=368)
> Output:["_col0","_col1","_col2"],aggregations:["max(VALUE._col0)"],keys:'constant', 'constant'
> <-Reducer 2 [SIMPLE_EDGE]
> SHUFFLE [RS_6]
> PartitionCols:'constant', 'constant'
> Group By Operator [GBY_5] (rows=1 width=368)
> Output:["_col0","_col1","_col2"],aggregations:["max(_col2)"],keys:'constant', 'constant'
> Select Operator [SEL_3] (rows=500 width=178)
> Output:["_col2"]
> <-Map 1 [SIMPLE_EDGE]
> SHUFFLE [RS_2]
> PartitionCols:'constant', _col1
> Select Operator [SEL_1] (rows=500 width=178)
> Output:["_col1","_col2"]
> TableScan [TS_0] (rows=500 width=10)
> src,src,Tbl:COMPLETE,Col:COMPLETE,Output:["key","value"]
> {code}
> The PartitionCols in Reducer 2 (RS_6) are wrong: instead of 'constant',
> 'constant', they should be 'constant', _col1.
>
> This happens because, since HIVE-13808, SemanticAnalyzer builds the key part
> of colExprMap from sortCols, while the key columns themselves are generated
> from newSortCols. The two lists do not line up whenever the constant column
> is not the trailing column in the key columns, so key column names end up
> paired with the wrong expressions.
> The constant propagation optimizer reads this colExprMap, finds an extra
> constant expression in the mismatched entries, and produces this error.
>
> In fact, colExprMap is used by multiple optimizers, which makes this quite a
> serious problem.
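
For readers who want to see the mismatch in isolation, here is a minimal sketch in plain Java. It is not the Hive source: the class and method names are hypothetical, expressions are modelled as strings, and it assumes that newSortCols is sortCols with the constant expressions dropped and that key names follow a KEY.reducesinkkeyN pattern. It only illustrates how naming the keys from one list while taking the expressions from the other shifts every entry after a leading constant.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ColExprMapSketch {

    // Hypothetical constant check: a quoted literal counts as a constant.
    static boolean isConstant(String expr) {
        return expr.startsWith("'");
    }

    // Assumption: newSortCols is sortCols minus the constant expressions.
    static List<String> dropConstants(List<String> sortCols) {
        List<String> result = new ArrayList<>();
        for (String expr : sortCols) {
            if (!isConstant(expr)) {
                result.add(expr);
            }
        }
        return result;
    }

    // Key names are numbered by position in keyCols; the mapped
    // expression is taken from exprSource at the same index.
    static Map<String, String> buildKeyMap(List<String> keyCols, List<String> exprSource) {
        Map<String, String> colExprMap = new LinkedHashMap<>();
        for (int i = 0; i < keyCols.size(); i++) {
            colExprMap.put("KEY.reducesinkkey" + i, exprSource.get(i));
        }
        return colExprMap;
    }

    // Mismatched pairing: names come from newSortCols, expressions from sortCols.
    static Map<String, String> mismatched(List<String> sortCols) {
        return buildKeyMap(dropConstants(sortCols), sortCols);
    }

    // Consistent pairing: names and expressions both come from newSortCols.
    static Map<String, String> consistent(List<String> sortCols) {
        List<String> newSortCols = dropConstants(sortCols);
        return buildKeyMap(newSortCols, newSortCols);
    }

    public static void main(String[] args) {
        // SORT BY constant_col, key, value from the query in the description.
        List<String> sortCols = List.of("'constant'", "key", "value");
        System.out.println("mismatched: " + mismatched(sortCols));
        System.out.println("consistent: " + consistent(sortCols));
    }
}
```

In the mismatched map, the first key column is paired with the constant even though it actually carries key, which is the kind of entry a constant-folding pass would then wrongly treat as a constant. When the constant is the last sort column, both maps agree, which matches the observation that the problem only shows up for a non-trailing constant.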
--
This message was sent by Atlassian Jira
(v8.3.4#803005)