amansinha100 commented on code in PR #4190:
URL: https://github.com/apache/hive/pull/4190#discussion_r1170887837


##########
ql/src/test/queries/clientpositive/antijoin2.q:
##########
@@ -0,0 +1,75 @@
+set hive.merge.nway.joins=false;
+set hive.vectorized.execution.enabled=false;
+set hive.auto.convert.join=true;
+set hive.auto.convert.anti.join=true;
+
+drop table if exists tt1;
+drop table if exists tt2;
+drop table if exists tt3;
+
+create table tt1 (ws_order_number bigint, ws_ext_ship_cost decimal(7, 2));
+create table tt2 (ws_order_number bigint);
+create table tt3 (wr_order_number bigint);
+
+insert into tt1 values (42, 3093.96), (1041, 299.28), (1378, 85.56), (1378, 719.44), (1395, 145.68);
+insert into tt2 values (1378), (1395);
+insert into tt3 values (42), (1041);
+
+-- The result should be the same regardless of vectorization.
+
+explain

Review Comment:
   The plans for these queries do not show the MergeJoin --> MapJoin pattern.
   As I noted in the Jira, the wrong-results issue is not reproducible without
   that pattern. To force this plan, I had to set the following stats:
   alter table tt1 update statistics set ('numRows'='10000000');
   alter table tt2 update statistics set ('numRows'='10000000');
   alter table tt3 update statistics set ('numRows'='2');
   Could you add to or modify the existing tests so that they set these stats,
   generate the MergeJoin --> MapJoin plan, and verify that the fix works for
   that plan?
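
   For illustration, a minimal sketch of how the test file could apply these stats before the explain. The three `alter table` statements are the ones quoted above. The query itself is an assumption: the diff is cut off at `explain`, so the select below is a hypothetical anti-join over tt1, tt2 and tt3, not necessarily the query in the PR. Whether the resulting plan actually shows MergeJoin --> MapJoin would still need to be confirmed in the generated q.out.

```sql
-- Skew the row counts so the optimizer treats tt1 and tt2 as large and tt3 as tiny.
-- (Statements taken from the review comment.)
alter table tt1 update statistics set ('numRows'='10000000');
alter table tt2 update statistics set ('numRows'='10000000');
alter table tt3 update statistics set ('numRows'='2');

-- Hypothetical query: the real one is not visible in the truncated diff.
-- An inner join between the two "large" tables plus a NOT EXISTS against
-- the "small" table, which Hive can convert to an anti join when
-- hive.auto.convert.anti.join=true.
explain
select sum(t1.ws_ext_ship_cost)
from tt1 t1
join tt2 t2 on t1.ws_order_number = t2.ws_order_number
where not exists (
  select 1 from tt3 t3 where t1.ws_order_number = t3.wr_order_number
);
```

   Running the same select without `explain`, once with hive.vectorized.execution.enabled=false and once with it set to true, would then check the "same regardless of vectorization" expectation against the forced plan.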



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

