Re: [PR] HIVE-29267: Fix NPE on Grouping Sets Optimizer for UNION ALL Queries [hive]

via GitHub Fri, 17 Oct 2025 14:43:02 -0700


okumin commented on code in PR #6128:
URL: https://github.com/apache/hive/pull/6128#discussion_r2431314962



##########
ql/src/test/queries/clientpositive/groupingset_optimize_hive_28489.q:
##########
@@ -1,6 +1,22 @@
 -- SORT_QUERY_RESULTS
 
 create table grp_set_test (key string, value string, col0 int, col1 int, col2 int, col3 int);
+
+-- UNION case, can't be optimized
+set hive.optimize.grouping.set.threshold=1;
+with sub_qr as (select col2 from grp_set_test)
+select grpBy_col, sum(col2)
+from
+( select 'abc' as grpBy_col, col2 from sub_qr union all select 'def' as grpBy_col, col2 from sub_qr) x
+group by grpBy_col with rollup;

Review Comment:
   I confirmed that the master branch throws an NPE on this query.



##########
ql/src/java/org/apache/hadoop/hive/ql/optimizer/GroupingSetOptimizer.java:
##########
@@ -227,7 +227,8 @@ private String selectPartitionColumn(GroupByOperator gby, Operator<?> parentOp)
       String partitionCol = null;
       for (ColStatistics col: columnStatistics) {
         String colName = col.getColumnName();
-        if (parentOp.getColumnExprMap().containsKey(colName) && candidates.contains(colName)) {
+        if (null != parentOp.getColumnExprMap() && parentOp.getColumnExprMap().containsKey(colName) &&
+                candidates.contains(colName)) {

Review Comment:
   Confidence = 20%. My rough sense is that the safest approach is to reject
   UNION + GROUPING SETS here. I'm still recalling how this optimizer behaves:
   
https://github.com/apache/hive/blob/5050529286c7ae131af0602f22e75c2ab72319fe/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GroupingSetOptimizer.java#L143-L182
   
   cc: @ngsg 
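
   To make the two options concrete, here is a minimal, self-contained sketch of the guard, not the actual Hive code. It assumes, as the patch implies, that the parent operator's column expression map can be null in the UNION case. `PartitionColumnGuard` and its plain `Map`/`List`/`Set` parameters are hypothetical stand-ins for `Operator#getColumnExprMap()`, the column statistics, and the candidate set, and it returns the first match rather than doing any statistics-based selection.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

public class PartitionColumnGuard {

    /**
     * Returns the first column that appears in both the parent's column
     * expression map and the candidate set, or null when no column qualifies.
     * A null map (assumed here to be what a UNION parent exposes) is treated
     * as "cannot optimize" instead of being dereferenced.
     */
    static String selectPartitionColumn(Map<String, String> columnExprMap,
                                        List<String> statColumns,
                                        Set<String> candidates) {
        if (columnExprMap == null) {
            // Bail out early: nothing to map the columns against.
            return null;
        }
        for (String colName : statColumns) {
            if (columnExprMap.containsKey(colName) && candidates.contains(colName)) {
                return colName;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Parent without a column expression map: no partition column, no NPE.
        System.out.println(selectPartitionColumn(null, List.of("col2"), Set.of("col2")));
        // Regular parent: the matching candidate column is selected.
        System.out.println(selectPartitionColumn(
                Map.of("col2", "expr"), List.of("col0", "col2"), Set.of("col2")));
    }
}
```

   Hoisting the null check out of the loop makes "no map" an explicit early exit, which is closer in spirit to rejecting the UNION case outright than re-checking the map on every iteration.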


