Wei Zhang created HIVE-25170:
--------------------------------

             Summary: Data error in constant propagation caused by wrong colExprMap generated in SemanticAnalyzer
                 Key: HIVE-25170
                 URL: https://issues.apache.org/jira/browse/HIVE-25170
             Project: Hive
          Issue Type: Bug
          Components: Query Planning
    Affects Versions: 3.1.2
            Reporter: Wei Zhang
            Assignee: Wei Zhang

{code:java}
EXPLAIN
SELECT constant_col, key, max(value)
FROM
(
  SELECT 'constant' as constant_col, key, value
  FROM src
  DISTRIBUTE BY constant_col, key
  SORT BY constant_col, key, value
) a
GROUP BY constant_col, key
LIMIT 10;

OK
Vertex dependency in root stage
Reducer 2 <- Map 1 (SIMPLE_EDGE)
Reducer 3 <- Reducer 2 (SIMPLE_EDGE)

Stage-0
  Fetch Operator
    limit:10
    Stage-1
      Reducer 3
      File Output Operator [FS_10]
        Limit [LIM_9] (rows=1 width=368)
          Number of rows:10
          Select Operator [SEL_8] (rows=1 width=368)
            Output:["_col0","_col1","_col2"]
            Group By Operator [GBY_7] (rows=1 width=368)
              Output:["_col0","_col1","_col2"],aggregations:["max(VALUE._col0)"],keys:'constant', 'constant'
            <-Reducer 2 [SIMPLE_EDGE]
              SHUFFLE [RS_6]
                PartitionCols:'constant', 'constant'
                Group By Operator [GBY_5] (rows=1 width=368)
                  Output:["_col0","_col1","_col2"],aggregations:["max(_col2)"],keys:'constant', 'constant'
                  Select Operator [SEL_3] (rows=500 width=178)
                    Output:["_col2"]
                  <-Map 1 [SIMPLE_EDGE]
                    SHUFFLE [RS_2]
                      PartitionCols:'constant', _col1
                      Select Operator [SEL_1] (rows=500 width=178)
                        Output:["_col1","_col2"]
                        TableScan [TS_0] (rows=500 width=10)
                          src,src,Tbl:COMPLETE,Col:COMPLETE,Output:["key","value"]
{code}

The `PartitionCols` in Reducer 2 (RS_6) is wrong: instead of `'constant', 'constant'`, it should be `'constant', _col1`.

The cause is that, since HIVE-13808, `SemanticAnalyzer` builds the key part of the `colExprMap` structure from `sortCols`, while the key columns themselves are generated from `newSortCols`. Columns and expressions are therefore mismatched whenever the constant column is not the trailing column among the key columns.
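
The mismatch described above can be illustrated with a small standalone sketch. This is not Hive's code: the class `ColExprMapSketch`, its methods, the use of plain strings for expressions, and the `reducesinkkey<i>` naming are all hypothetical. It also assumes that `newSortCols` is `sortCols` with constant expressions dropped, which this report does not state. Under those assumptions, naming the keys by position in one list while looking up expressions by the same position in the other list pairs each key with the wrong expression unless the constant comes last.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ColExprMapSketch {
    // Hypothetical constant check: in this sketch a quoted string is a constant.
    static boolean isConstant(String expr) {
        return expr.startsWith("'");
    }

    // Assumption: the emitted key list is the sort list with constants dropped.
    static List<String> dropConstants(List<String> sortCols) {
        List<String> kept = new ArrayList<>();
        for (String expr : sortCols) {
            if (!isConstant(expr)) {
                kept.add(expr);
            }
        }
        return kept;
    }

    // Names the i-th emitted key "reducesinkkey<i>" and maps it to exprSource.get(i).
    static Map<String, String> buildKeyMap(int keyCount, List<String> exprSource) {
        Map<String, String> map = new LinkedHashMap<>();
        for (int i = 0; i < keyCount; i++) {
            map.put("reducesinkkey" + i, exprSource.get(i));
        }
        return map;
    }

    // Keys counted from the filtered list, expressions read from the unfiltered one.
    static Map<String, String> mismatched(List<String> sortCols) {
        List<String> newSortCols = dropConstants(sortCols);
        return buildKeyMap(newSortCols.size(), sortCols);
    }

    // Keys and expressions both taken from the filtered list.
    static Map<String, String> consistent(List<String> sortCols) {
        List<String> newSortCols = dropConstants(sortCols);
        return buildKeyMap(newSortCols.size(), newSortCols);
    }

    public static void main(String[] args) {
        List<String> sortCols = List.of("'constant'", "key", "value");
        System.out.println("mismatched: " + mismatched(sortCols));
        System.out.println("consistent: " + consistent(sortCols));
    }
}
```

With the constant first, the mismatched map binds the first key to `'constant'` rather than `key`. With the constant last (`key, value, 'constant'`), both variants agree, which is consistent with the report that only a non-trailing constant triggers the problem.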