cloud-fan commented on pull request #32550:
URL: https://github.com/apache/spark/pull/32550#issuecomment-848424672


   That's a good point. I think this is a general issue: if the planner creates 
shuffle hash join directly, we may also coalesce partitions and make the local 
hash map bigger.
   
   I don't really have a good idea here. If we do have many small partitions, 
coalesce partitions should be beneficial even for shuffle hash join. One idea 
is to use the existing config `advisoryPartitionSizeInBytes` in the new rule, 
then we don't need to worry about the partition size after coalescing becomes 
much larger than we expect.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to