c21 commented on a change in pull request #32328:
URL: https://github.com/apache/spark/pull/32328#discussion_r621850427
##########
File path:
sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/OptimizeSkewedJoin.scala
##########
@@ -148,7 +148,7 @@ object OptimizeSkewedJoin extends CustomShuffleReaderRule {
/*
* This method aim to optimize the skewed join with the following steps:
Review comment:
My hunch is that whenever the build side is at risk of OOM, it should
already be classified as skewed. So AQE skew handling can avoid some of
those potentially OOM-ed build sides (inner join only).
However, for queries with other join types, queries with no shuffle
before the join, and queries where the run-time hash map is significantly larger
than the partition size, we should still have a run-time fallback mechanism in
shuffled hash join itself. This PR and #32210 are both good to have and are
orthogonal to each other.
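
To make the "run-time fallback" idea concrete, here is a rough sketch in plain Python. It is not Spark code and not what either PR implements. The names `hash_join_with_fallback`, `sort_merge_join`, and `max_build_rows` are hypothetical, and the row-count limit is only a stand-in for a real memory budget. It assumes an inner join over in-memory lists of `(key, payload)` rows.

```python
from collections import defaultdict


def sort_merge_join(left, right):
    """Inner join: sort both sides on the key, then merge runs of equal keys."""
    left, right = sorted(left), sorted(right)
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Locate the end of the run of matching keys on the right side.
            j_end = j
            while j_end < len(right) and right[j_end][0] == lk:
                j_end += 1
            # Pair every matching left row with every row in that run.
            while i < len(left) and left[i][0] == lk:
                out.extend((lk, left[i][1], right[m][1]) for m in range(j, j_end))
                i += 1
            j = j_end
    return out


def hash_join_with_fallback(stream, build, max_build_rows):
    """Try a hash join; switch to sort-merge if the build side is too large.

    max_build_rows is a hypothetical stand-in for a memory budget.
    Returns (joined_rows, strategy_used).
    """
    table = defaultdict(list)
    for count, (key, payload) in enumerate(build, start=1):
        if count > max_build_rows:
            # Build side exceeded the budget: drop the hash map and fall back.
            return sort_merge_join(stream, build), "sort_merge"
        table[key].append(payload)
    joined = [(k, s, b) for k, s in stream for b in table.get(k, [])]
    return joined, "hash"


stream_side = [(1, "a"), (2, "b"), (2, "c"), (3, "d")]
build_side = [(2, "x"), (2, "y"), (3, "z"), (4, "w")]

rows_hash, used_hash = hash_join_with_fallback(stream_side, build_side, max_build_rows=10)
rows_smj, used_smj = hash_join_with_fallback(stream_side, build_side, max_build_rows=2)
```

The point of the sketch is that the fallback decision is made while building the hash map, so it does not depend on join type or on a shuffle having produced partition statistics beforehand. Both paths return the same rows, only in a different order.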