ngsg commented on code in PR #5717:
URL: https://github.com/apache/hive/pull/5717#discussion_r2071492206


##########
ql/src/test/queries/clientpositive/sharedwork_mapjoin_datasize_check.q:
##########
@@ -0,0 +1,66 @@
+--! qt:dataset:src
+--! qt:dataset:src1
+
+set hive.auto.convert.join=true;
+set hive.llap.mapjoin.memory.oversubscribe.factor=0;
+set hive.auto.convert.join.noconditionaltask.size=500;
+
+-- The InMemoryDataSize of MapJoin is 280. Therefore, SWO should not merge 2 TSs reading src
+-- as the sum of InMemoryDataSize of 2 unmerged MapJoin exceeds 500.

Review Comment:
   Before the patch, `SharedWorkOptimizer` merged the two TableScan operators, which resulted in fewer Map vertices in the EXPLAIN plan.
   
   For reference, here is the Tez vertex dependency section from the original qfile output:
   ```
         Edges:
           Map 1 <- Map 4 (BROADCAST_EDGE), Reducer 5 (BROADCAST_EDGE)
           Reducer 2 <- Map 1 (SIMPLE_EDGE), Reducer 3 (BROADCAST_EDGE)
           Reducer 3 <- Map 1 (SIMPLE_EDGE)
           Reducer 5 <- Map 4 (SIMPLE_EDGE)
   #### A masked pattern was here ####
   ```
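
The arithmetic behind the qfile comment can be sketched as follows. This is only an illustration under the assumption that the check compares the summed InMemoryDataSize of the MapJoins that would end up together against `hive.auto.convert.join.noconditionaltask.size`; the function and constant names are hypothetical and are not Hive APIs.

```python
# Hypothetical illustration of the size check described in the qfile comment.
# None of these names come from the Hive code base.

MAPJOIN_IN_MEMORY_DATA_SIZE = 280  # value stated in the qfile comment
NO_CONDITIONAL_TASK_SIZE = 500     # hive.auto.convert.join.noconditionaltask.size in the test


def fits_in_budget(map_join_sizes, budget):
    """Return True when the summed in-memory sizes do not exceed the budget."""
    return sum(map_join_sizes) <= budget


# A single MapJoin fits: 280 <= 500.
single_fits = fits_in_budget([MAPJOIN_IN_MEMORY_DATA_SIZE], NO_CONDITIONAL_TASK_SIZE)

# Two MapJoins together do not fit: 280 + 280 = 560 > 500,
# so the two TableScans should stay unmerged.
merged_fits = fits_in_budget(
    [MAPJOIN_IN_MEMORY_DATA_SIZE, MAPJOIN_IN_MEMORY_DATA_SIZE],
    NO_CONDITIONAL_TASK_SIZE,
)
```

With the values from the test, `single_fits` is true and `merged_fits` is false, which matches the expectation that the plan keeps separate Map vertices after the patch.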
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

