c21 commented on pull request #32328: URL: https://github.com/apache/spark/pull/32328#issuecomment-826601325
btw just some random thought here. E.g. for `LEFT OUTER JOIN` (build right side, stream left side), suppose we find the left side is skewed and split the skewed partition `L1` into 3 smaller partitions (just an example here). Would the hash map for the corresponding right-side partition `R1` need to be built 3 times, once for each of `L1`'s 3 smaller partitions? If that's true, it sounds to me like it may introduce more OOMs on the build side, since tasks share the executor's off-heap memory to build hash maps?
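To make the question concrete, here is a toy model in plain Python. It is not Spark code and does not reflect how the PR actually schedules tasks; `build_hash_map`, `split`, and `run_join` are hypothetical helpers, and the sketch assumes each split of `L1` runs as its own task that builds its own hash map from the full `R1`.

```python
from collections import defaultdict


def build_hash_map(build_rows):
    """Build key -> list of values for the build (right) side."""
    table = defaultdict(list)
    for key, value in build_rows:
        table[key].append(value)
    return table


def split(rows, n):
    """Split one skewed stream partition into n smaller partitions."""
    return [rows[i::n] for i in range(n)]


def run_join(stream_splits, build_rows):
    """LEFT OUTER JOIN with one task per stream split.

    Assumption: every task builds its own hash map from the whole
    build-side partition, so the number of builds equals the number of splits.
    """
    output, builds = [], 0
    for chunk in stream_splits:
        table = build_hash_map(build_rows)  # rebuilt for every split
        builds += 1
        for key, left_value in chunk:
            matches = table.get(key)
            if matches:
                output.extend((key, left_value, r) for r in matches)
            else:
                output.append((key, left_value, None))  # unmatched left row
    return output, builds


# Skewed left partition L1 and its matching right partition R1.
L1 = [("a", i) for i in range(6)] + [("b", 100)]
R1 = [("a", "x"), ("a", "y")]

rows_split, builds_split = run_join(split(L1, 3), R1)
rows_whole, builds_whole = run_join([L1], R1)
print(builds_whole, builds_split)
```

Under that assumption the join result is unchanged, but `R1` is hashed once per split. If the split tasks run concurrently on one executor, several copies of the same build-side map could be resident at the same time, which is the memory concern raised above.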
