okumin commented on code in PR #5406:
URL: https://github.com/apache/hive/pull/5406#discussion_r1731112018


##########
ql/src/test/queries/clientpositive/auto_sortmerge_join_18.q:
##########
@@ -0,0 +1,39 @@
+CREATE TABLE t_asj_18 (k STRING, v INT);

Review Comment:
   I confirmed that this test case reproduces the issue described in the JIRA ticket.



##########
ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java:
##########
@@ -870,6 +871,35 @@ private boolean checkConvertJoinSMBJoin(JoinOperator joinOp, OptimizeTezProcCont
       }
     }
 
+    /* As SMB replaces last RS op from the joining branches and the JOIN op with MERGEJOIN, we need to ensure
+     * the RS before these RS, in both branches, are partitioning using same hash generator. It
+     * differs depending on ReducerTraits.UNIFORM i.e. ReduceSinkOperator#computeMurmurHash or
+     * ReduceSinkOperator#computeHashCode, leading to different code for same value. Skip SMB join in such cases.
+     */
+    Boolean prevRsHasUniformTrait = null;
+    for (Operator<? extends OperatorDesc> parentOp : joinOp.getParentOperators()) {
+      // Assertion of mandatory single parent is already being done in bucket version check earlier
+      Operator<?> op = parentOp.getParentOperators().get(0);

Review Comment:
   Can I assume we want to find ReduceSinkOperator-1 in the chain
   `Operators -> ReduceSinkOperator-1 -> Operators -> ReduceSinkOperator-2 (parentOp) -> JoinOperator`?
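
To make the chain in the question concrete, here is a minimal sketch of the traversal it describes. It is not the PR's code and does not use Hive's operator API: `Op`, `findUpstreamReduceSink`, and `upstreamHashersAgree` are hypothetical stand-ins, the `uniform` flag stands in for `ReducerTraits.UNIFORM`, and skipping branches that have no earlier ReduceSink is an assumption made only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class UpstreamReduceSinkSketch {

  /** Hypothetical stand-in for an operator node; NOT Hive's Operator class. */
  static class Op {
    final String name;
    final boolean isReduceSink;
    final boolean uniform; // stand-in for ReducerTraits.UNIFORM on a ReduceSink
    final List<Op> parents = new ArrayList<>();

    Op(String name, boolean isReduceSink, boolean uniform, Op... parents) {
      this.name = name;
      this.isReduceSink = isReduceSink;
      this.uniform = uniform;
      for (Op p : parents) {
        this.parents.add(p);
      }
    }
  }

  /**
   * Starting from the ReduceSink that feeds the join (ReduceSinkOperator-2),
   * walk up single-parent links until an earlier ReduceSink
   * (ReduceSinkOperator-1) is found. Returns null if there is none.
   */
  static Op findUpstreamReduceSink(Op joinParentRs) {
    Op op = joinParentRs.parents.isEmpty() ? null : joinParentRs.parents.get(0);
    while (op != null && !op.isReduceSink) {
      // Stop at sources and at operators with more than one parent.
      op = op.parents.size() == 1 ? op.parents.get(0) : null;
    }
    return op;
  }

  /**
   * Returns true when the upstream ReduceSinks of all join branches agree on
   * the uniform flag. Branches without an upstream ReduceSink are skipped,
   * which is an assumption of this sketch.
   */
  static boolean upstreamHashersAgree(Op join) {
    Boolean prevUniform = null;
    for (Op parentRs : join.parents) {
      Op upstream = findUpstreamReduceSink(parentRs);
      if (upstream == null) {
        continue;
      }
      if (prevUniform != null && prevUniform != upstream.uniform) {
        return false;
      }
      prevUniform = upstream.uniform;
    }
    return true;
  }

  public static void main(String[] args) {
    // Branch A: TS_A -> RS1_A (uniform) -> GBY_A -> RS2_A -> JOIN
    Op tsA = new Op("TS_A", false, false);
    Op rs1A = new Op("RS1_A", true, true, tsA);
    Op gbyA = new Op("GBY_A", false, false, rs1A);
    Op rs2A = new Op("RS2_A", true, false, gbyA);

    // Branch B: TS_B -> RS1_B (not uniform) -> GBY_B -> RS2_B -> JOIN
    Op tsB = new Op("TS_B", false, false);
    Op rs1B = new Op("RS1_B", true, false, tsB);
    Op gbyB = new Op("GBY_B", false, false, rs1B);
    Op rs2B = new Op("RS2_B", true, false, gbyB);

    Op join = new Op("JOIN", false, false, rs2A, rs2B);

    System.out.println(findUpstreamReduceSink(rs2A).name); // RS1_A
    System.out.println(upstreamHashersAgree(join));        // false
  }
}
```

In this toy plan the two branches disagree on the flag, which corresponds to the case where the quoted comment says the SMB conversion should be skipped.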



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

