c21 commented on a change in pull request #30280:
URL: https://github.com/apache/spark/pull/30280#discussion_r518928747



##########
File path: sql/core/src/test/resources/sql-tests/inputs/subquery/in-subquery/in-joins.sql
##########
@@ -6,8 +6,8 @@
 --  2. run with whole-stage-codegen, operator codegen or no codegen.
 
 --CONFIG_DIM1 spark.sql.autoBroadcastJoinThreshold=10485760
---CONFIG_DIM1 spark.sql.autoBroadcastJoinThreshold=-1,spark.sql.join.preferSortMergeJoin=true
---CONFIG_DIM1 spark.sql.autoBroadcastJoinThreshold=-1,spark.sql.join.preferSortMergeJoin=false
+--CONFIG_DIM1 spark.sql.autoBroadcastJoinThreshold=10485760,spark.sql.join.preferSortMergeJoin=true
+--CONFIG_DIM1 spark.sql.autoBroadcastJoinThreshold=10485760,spark.sql.join.preferSortMergeJoin=false

Review comment:
@warrenzhu25 - Shuffled hash join is only chosen when `spark.sql.autoBroadcastJoinThreshold` and `spark.sql.shuffle.partitions` are set to suitable values, and one side is at least 3x smaller than the other ([code](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala#L368-L381)). I don't think the test cases here satisfy the second condition (one side 3x smaller than the other). Can you double check the query plan? Thanks.
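
For reference, the two size conditions can be paraphrased roughly as below. This is a sketch from my reading of the linked code, not the Spark source itself: the function names are made up for illustration, the sizes stand in for the planner's estimated statistics, and it ignores join type, join hints, and the fact that a broadcast join is preferred whenever one side fits under the broadcast threshold.

```python
# Illustrative paraphrase only; these names are hypothetical, not Spark APIs.

def can_build_local_hash_map(size_in_bytes: int,
                             broadcast_threshold: int,
                             shuffle_partitions: int) -> bool:
    # Condition 1: the build side must be small enough that a single
    # shuffle partition of it should fit in an in-memory hash map.
    return size_in_bytes < broadcast_threshold * shuffle_partitions


def much_smaller(build_size: int, other_size: int) -> bool:
    # Condition 2: the build side must be at least 3x smaller than the other side.
    return build_size * 3 <= other_size


def may_pick_shuffled_hash_join(left_size: int,
                                right_size: int,
                                broadcast_threshold: int,
                                shuffle_partitions: int,
                                prefer_sort_merge_join: bool) -> bool:
    # With preferSortMergeJoin=true the size checks are never consulted.
    if prefer_sort_merge_join:
        return False
    build_left = (can_build_local_hash_map(left_size, broadcast_threshold, shuffle_partitions)
                  and much_smaller(left_size, right_size))
    build_right = (can_build_local_hash_map(right_size, broadcast_threshold, shuffle_partitions)
                   and much_smaller(right_size, left_size))
    return build_left or build_right


if __name__ == "__main__":
    threshold = 10485760   # value used in the changed CONFIG_DIM1 lines
    partitions = 200       # assumed value for spark.sql.shuffle.partitions

    # Two sides of equal estimated size: the 3x condition fails.
    print(may_pick_shuffled_hash_join(1000, 1000, threshold, partitions, False))
    # One side exactly 3x smaller: both conditions hold.
    print(may_pick_shuffled_hash_join(1000, 3000, threshold, partitions, False))
    # Threshold of -1: the first condition can never hold.
    print(may_pick_shuffled_hash_join(1000, 3000, -1, partitions, False))
```

Under these assumptions, two inputs of similar estimated size fail the 3x check regardless of the config values, which is why the query plan is worth checking.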



