[GitHub] [hive] rbalamohan commented on a diff in pull request #3174: HIVE-26110

GitBox Tue, 05 Apr 2022 01:42:01 -0700


rbalamohan commented on code in PR #3174:
URL: https://github.com/apache/hive/pull/3174#discussion_r842526145



##########
ql/src/java/org/apache/hadoop/hive/ql/optimizer/SortedDynPartitionOptimizer.java:
##########
@@ -648,7 +648,12 @@ public ReduceSinkOperator getReduceSinkOp(List<Integer> 
partitionPositions, List
       ArrayList<ExprNodeDesc> partCols = Lists.newArrayList();
 
       for (Function<List<ExprNodeDesc>, ExprNodeDesc> customSortExpr : 
customSortExprs) {
-        keyCols.add(customSortExpr.apply(allCols));
+        ExprNodeDesc colExpr = customSortExpr.apply(allCols);
+        // Custom sort expressions are marked as KEYs, which is required for 
sorting the rows that are going for
+        // a particular reducer instance. They also need to be marked as 
'partition' columns for MapReduce shuffle
+        // phase, in order to gather the same keys to the same reducer 
instances.
+        keyCols.add(colExpr);
+        partCols.add(colExpr);

Review Comment:
   I didn't realise that entire "customSortExprs" copying was part of 
HIVE-25975. Should be fine in this case. 
   
   +1 pending tests.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[GitHub] [hive] rbalamohan commented on a diff in pull request #3174: HIVE-26110

Reply via email to