ulysses-you commented on pull request #32391: URL: https://github.com/apache/spark/pull/32391#issuecomment-845603097
@Gabriel39 I think you misunderstand the logic of AQE.

> AQE should not optimize it to other join type since static stats (e.g sizeInBytes) is always larger or equal the actual value

That's wrong. AQE can never change a BHJ that was decided on the normal planner side into another join strategy. It's not about the stats; you can see the key code in `LogicalQueryStageStrategy`. This new config assumes that a join is not a BHJ before AQE, so that AQE can use the new config and the runtime stats to turn that join (mostly an SMJ) into a BHJ.

So the usual way to use this new config is:

1) disable the normal auto broadcast, or reduce its threshold
2) tune the new config value
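
A rough sketch of those two steps as session settings. This comment does not name the new config, so the key `spark.sql.adaptive.autoBroadcastJoinThreshold` is an assumption, and the `64MB` value is only illustrative:

```scala
import org.apache.spark.sql.SparkSession

// Local session for illustration only; in spark-shell, `spark` already exists.
val spark = SparkSession.builder()
  .appName("aqe-broadcast-threshold-sketch")
  .master("local[*]")
  .getOrCreate()

// AQE must be enabled for any runtime re-planning to happen.
spark.conf.set("spark.sql.adaptive.enabled", "true")

// Step 1: disable the normal (planner-side) auto broadcast, so the join
// is not already a BHJ before AQE runs. -1 turns it off; a smaller
// positive value would only reduce it.
spark.conf.set("spark.sql.autoBroadcastJoinThreshold", "-1")

// Step 2: tune the AQE-side threshold.
// ASSUMPTION: the config key below is not given in the comment, and
// 64MB is an arbitrary example value, not a recommendation.
spark.conf.set("spark.sql.adaptive.autoBroadcastJoinThreshold", "64MB")
```

With settings like these, a join would be planned as an SMJ up front and could only become a BHJ once AQE sees runtime stats under the adaptive threshold.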
