himanshu-mishra commented on code in PR #5406:
URL: https://github.com/apache/hive/pull/5406#discussion_r1738034248
##########
ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java:
##########
@@ -870,6 +871,35 @@ private boolean checkConvertJoinSMBJoin(JoinOperator joinOp, OptimizeTezProcCont
}
}
+ /* As SMB replaces last RS op from the joining branches and the JOIN op with MERGEJOIN, we need to ensure
+ * the RS before these RS, in both branches, are partitioning using same hash generator. It
+ * differs depending on ReducerTraits.UNIFORM i.e. ReduceSinkOperator#computeMurmurHash or
+ * ReduceSinkOperator#computeHashCode, leading to different code for same value. Skip SMB join in such cases.
+ */
+ Boolean prevRsHasUniformTrait = null;
+ for (Operator<? extends OperatorDesc> parentOp : joinOp.getParentOperators()) {
+ // Assertion of mandatory single parent is already being done in bucket version check earlier
+ Operator<?> op = parentOp.getParentOperators().get(0);
+ while (op != null && !(op instanceof TableScanOperator || op instanceof ReduceSinkOperator
+ || op instanceof CommonJoinOperator)) {
+ // If op has parents it is guaranteed to be 1.
+ List<Operator<?>> parents = op.getParentOperators();
+ Preconditions.checkState(parents.size() == 0 || parents.size() == 1);
+ op = parents.size() == 1 ? parents.get(0) : null;
+ }
Review Comment:
There is no existing utility method that returns the first ancestor operator
matching one of a given set of classes. Here, we are only interested in an
ancestor RS when it is the source of data for the current RS. If a TS or JOIN
is encountered first, that operator is the source, and we do not need to
verify it.
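
For illustration only, a helper of the kind discussed here might look roughly like the sketch below. Everything in it is hypothetical: `AncestorSearch`, `Op`, `TableScan`, `ReduceSink`, `Select`, `Join` and `firstAncestorOfType` are simplified stand-ins written for this example, not Hive's operator classes or any existing Hive utility. It assumes, as the loop in the diff does, that every operator on the walked path has at most one parent.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified stand-ins for an operator tree; not Hive's real classes.
final class AncestorSearch {
  static class Op {
    final List<Op> parents = new ArrayList<>();
    Op(Op... ps) { parents.addAll(Arrays.asList(ps)); }
  }
  static class TableScan extends Op { TableScan() { super(); } }
  static class ReduceSink extends Op { ReduceSink(Op... ps) { super(ps); } }
  static class Select extends Op { Select(Op... ps) { super(ps); } }
  static class Join extends Op { Join(Op... ps) { super(ps); } }

  /**
   * Walks up the single-parent chain from start (inclusive) and returns the
   * first operator that is an instance of one of stopTypes, or null if the
   * chain ends without a match.
   */
  @SafeVarargs
  static Op firstAncestorOfType(Op start, Class<? extends Op>... stopTypes) {
    Op op = start;
    while (op != null) {
      for (Class<? extends Op> type : stopTypes) {
        if (type.isInstance(op)) {
          return op;
        }
      }
      // Mirrors the precondition in the diff: zero or one parent on this path.
      if (op.parents.size() > 1) {
        throw new IllegalStateException("expected at most one parent");
      }
      op = op.parents.isEmpty() ? null : op.parents.get(0);
    }
    return null;
  }
}
```

A caller would start from the parent of the last RS in each branch, pass the RS, TS and JOIN stand-in classes as stop types, and then inspect the result only when it is an RS.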
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]