c21 commented on pull request #32210: URL: https://github.com/apache/spark/pull/32210#issuecomment-826250121
> Apparently falling back to SMJ wastes the partially-built hash map.

Note the fallback only happens when there is no memory available to add a new entry to the hash map. So even without enabling shuffled hash join by default for all queries, from my point of view this is still a reliability improvement right now for people who selectively enable shuffled hash join for certain queries (one way to do that is sketched below): they no longer need to worry about OOM from shuffled hash join. If we think this is not the right direction to go, then the same argument applies to the hash aggregate fallback: we enable hash aggregate by default, and if a query keeps falling back from hash aggregate to sort-based aggregate, that is less efficient than just using sort aggregate for the query in the first place.

> If one partition is a bit larger to build the in-memory hash map, I feel spilling the hash map might be a better choice?

To my knowledge there is no obviously efficient way to do the random lookups of a join against a spilled hash map. How would we minimize random disk reads on the spilled map? Happy to brainstorm more if you have some rough ideas.

> If one partition is much larger to build the in-memory hash map, seems we can use the same technique of skew join handling, to split the partition into multiple smaller ones so that they can fit in memory.

Again, I feel this goes back to the AQE hybrid join discussion (the relevant AQE skew-join settings are sketched below). The same disadvantage applies here: it only works for joins with a shuffle, so we still could not reliably enable shuffled hash join in general.
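For context, a minimal sketch of how one might selectively enable shuffled hash join for a specific query today. The table names here are hypothetical, and the `SHUFFLE_HASH` join hint requires Spark 3.0+:

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical session and table names, for illustration only.
val spark = SparkSession.builder().appName("shj-example").getOrCreate()

// Option A: hint shuffled hash join for one query (Spark 3.0+ join hint),
// building the hash map on t2.
val hinted = spark.sql("""
  SELECT /*+ SHUFFLE_HASH(t2) */ t1.id, t2.value
  FROM t1 JOIN t2 ON t1.id = t2.id
""")

// Option B: bias the planner away from sort merge join session-wide.
// This only allows the planner to pick shuffled hash join when its other
// conditions (e.g. build-side size) are met; it is not a guarantee.
spark.conf.set("spark.sql.join.preferSortMergeJoin", "false")
```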
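And a sketch of the AQE skew-join splitting mentioned above. These configs exist in Spark 3.0+; the values shown are illustrative rather than recommendations:

```scala
// AQE skew-join handling: split skewed partitions into smaller ones at
// shuffle time (which is exactly why it cannot help shuffle-free joins).
spark.conf.set("spark.sql.adaptive.enabled", "true")
spark.conf.set("spark.sql.adaptive.skewJoin.enabled", "true")
// A partition is treated as skewed when it is both this many times larger
// than the median partition size and above the byte threshold below.
spark.conf.set("spark.sql.adaptive.skewJoin.skewedPartitionFactor", "5")
spark.conf.set("spark.sql.adaptive.skewJoin.skewedPartitionThresholdInBytes", "256MB")
```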
