ulysses-you commented on pull request #32391:
URL: https://github.com/apache/spark/pull/32391#issuecomment-845603097


   @Gabriel39 I guess you misunderstand the logic of AQE.
   
   > AQE should not optimize it to other join type since static stats (e.g sizeInBytes) is always larger or equal the actual value
   
   That's wrong. AQE never changes a BHJ that the normal planner has already
   chosen into another join strategy. It's not about the stats; you can see the
   key code in `LogicalQueryStageStrategy`.
   
   This new config assumes the join is not already a BHJ before AQE runs, so
   that AQE can use the new config together with runtime stats to convert a join
   (usually an SMJ) into a BHJ.
   
   So the usual way to use this new config is to 1) disable the normal auto
   broadcast or lower its threshold, and 2) tune the new config value.
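
   A minimal sketch of that setup, for illustration only. It assumes the new
   config is named `spark.sql.adaptive.autoBroadcastJoinThreshold` (the name is
   not stated in this comment), and the app name and the 64MB value are
   arbitrary placeholders:

   ```scala
   import org.apache.spark.sql.SparkSession

   val spark = SparkSession.builder()
     .appName("aqe-broadcast-threshold-sketch") // hypothetical app name
     .master("local[*]")
     // AQE must be enabled for the runtime-stats-based decision to apply.
     .config("spark.sql.adaptive.enabled", "true")
     // 1) Disable the normal planner-side auto broadcast (-1 turns it off),
     //    so the join is not already planned as a BHJ before AQE runs.
     .config("spark.sql.autoBroadcastJoinThreshold", "-1")
     // 2) Tune the AQE-side threshold (assumed config name, illustrative value).
     .config("spark.sql.adaptive.autoBroadcastJoinThreshold", "64MB")
     .getOrCreate()
   ```

   With settings like these, a join would typically be planned as an SMJ first,
   and AQE could later switch it to a BHJ if the runtime size of one side falls
   under the AQE-side threshold.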

