ngsg commented on code in PR #5707:
URL: https://github.com/apache/hive/pull/5707#discussion_r2017799177
##########
ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java:
##########
@@ -182,6 +187,30 @@ public static ReduceWork createReduceWork(
return reduceWork;
}
+ private static boolean hasBucketMapJoin(Operator<? extends OperatorDesc> operator) {
+ if (operator == null) {
+ return false;
+ }
+
+ // Iterate over child operators
+ for (Operator<? extends OperatorDesc> childOp : operator.getChildOperators()) {
+ // Check if this is a MapJoinOperator and is a Bucket Map Join
+ if (childOp instanceof MapJoinOperator) {
+ MapJoinOperator mjOp = (MapJoinOperator) childOp;
+ if (mjOp.getConf().isBucketMapJoin()) {
+ return true; // Found BMJ, no need to check further
+ }
+ }
+
+ // Recursively check children
+ if (hasBucketMapJoin(childOp)) {
+ return true;
+ }
Review Comment:
`createReduceWork()` is called whenever a new root operator that has a
preceding work (= vertex) is encountered, so the number of RS operators does
not necessarily equal the number of `createReduceWork()` calls.
In the operator graph `RS[7] -> GBY[8] -> RS[11] -> MAPJOIN[49]`, `GBY[8]`
triggers `createReduceWork()` because it is a root operator and has a
preceding work, `TS?-...-RS[7]`. `MAPJOIN[49]`, however, does not trigger
`createReduceWork()` because it is not a root operator; it belongs to another
Map work, something like `TS??-...-MAPJOIN[49]`.
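To make the counting argument concrete, here is a toy sketch of the example graph. It is not Hive code: `ToyWorkSplit`, `Op`, `Kind`, and `startsReduceWork` are hypothetical names, the operators elided by `...` are left out, and the "root of a reduce work" rule is a simplifying assumption chosen only to mirror the description above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: none of these types exist in Hive.
public class ToyWorkSplit {
    enum Kind { TS, RS, GBY, MAPJOIN }

    static final class Op {
        final String name;
        final Kind kind;
        final List<Op> parents = new ArrayList<>();

        Op(String name, Kind kind, Op... parents) {
            this.name = name;
            this.kind = kind;
            this.parents.addAll(Arrays.asList(parents));
        }
    }

    // Assumed rule: an operator is the root of a new reduce work when every
    // parent is an RS and it is not a map join. A map join fed by an RS is
    // treated as part of the Map work of its other parent.
    static boolean startsReduceWork(Op op) {
        if (op.kind == Kind.MAPJOIN || op.parents.isEmpty()) {
            return false;
        }
        for (Op parent : op.parents) {
            if (parent.kind != Kind.RS) {
                return false;
            }
        }
        return true;
    }

    // RS[7] -> GBY[8] -> RS[11] -> MAPJOIN[49], plus the two table scans.
    static List<Op> examplePlan() {
        Op ts1 = new Op("TS?", Kind.TS);
        Op rs7 = new Op("RS[7]", Kind.RS, ts1);
        Op gby8 = new Op("GBY[8]", Kind.GBY, rs7);
        Op rs11 = new Op("RS[11]", Kind.RS, gby8);
        Op ts2 = new Op("TS??", Kind.TS);
        Op mj49 = new Op("MAPJOIN[49]", Kind.MAPJOIN, ts2, rs11);
        return Arrays.asList(ts1, rs7, gby8, rs11, ts2, mj49);
    }

    // Returns {number of RS operators, number of reduce-work roots}.
    static int[] countRsAndReduceWorks(List<Op> plan) {
        int rsCount = 0;
        int reduceWorkCount = 0;
        for (Op op : plan) {
            if (op.kind == Kind.RS) {
                rsCount++;
            }
            if (startsReduceWork(op)) {
                reduceWorkCount++;
            }
        }
        return new int[] {rsCount, reduceWorkCount};
    }

    public static void main(String[] args) {
        int[] counts = countRsAndReduceWorks(examplePlan());
        // Prints "RS operators: 2, reduce works: 1"
        System.out.println("RS operators: " + counts[0] + ", reduce works: " + counts[1]);
    }
}
```

Under these assumptions the plan has two RS operators but only one reduce-work root (`GBY[8]`), which is the mismatch described above.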
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]