[ https://issues.apache.org/jira/browse/HIVE-8943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Szehon Ho updated HIVE-8943:
----------------------------
    Attachment: HIVE-8943-4.spark.patch

Fix algorithm and cleanup after discussion with Xuefu. The original code was too aggressively incorporating connected mapjoins into its size calculation; the new code only looks at the mapjoins connected to the big table.

> Fix memory limit check for combine nested mapjoins [Spark Branch]
> -----------------------------------------------------------------
>
>                 Key: HIVE-8943
>                 URL: https://issues.apache.org/jira/browse/HIVE-8943
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>    Affects Versions: spark-branch
>            Reporter: Szehon Ho
>            Assignee: Szehon Ho
>         Attachments: HIVE-8943-4.spark.patch, HIVE-8943.1-spark.patch, HIVE-8943.1-spark.patch, HIVE-8943.2-spark.patch, HIVE-8943.3-spark.patch
>
>
> It's the opposite problem of what we thought in HIVE-8701.
> SparkMapJoinOptimizer combines nested mapjoins into one work because the RS
> for the big table is removed. So we need to enhance the check to calculate
> whether all the mapjoins in that work (Spark stage) will fit into memory;
> otherwise it might overwhelm memory for that particular Spark executor.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
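The check described above can be sketched as follows. This is an illustrative simplification, not the actual Hive patch: the class and method names are hypothetical, and the real SparkMapJoinOptimizer works on operator trees rather than a plain list of sizes. The idea is only that, once nested mapjoins are combined into one work, their small-table hash tables are all loaded by the same Spark executor, so their combined size must stay under the mapjoin memory budget.

```java
import java.util.Arrays;
import java.util.List;

/**
 * Hypothetical sketch of the memory-limit check for combined mapjoins.
 * Not actual Hive code; names and structure are illustrative only.
 */
public class MapJoinMemoryCheckSketch {

    /**
     * Returns true if the small-table hash tables of all mapjoins combined
     * into one work (Spark stage) fit under the given memory limit.
     *
     * @param smallTableSizes in-memory sizes (bytes) of each mapjoin's
     *                        small table on the big table's branch
     * @param memoryLimitBytes the mapjoin memory budget for one executor
     */
    static boolean fitsInMemory(List<Long> smallTableSizes, long memoryLimitBytes) {
        long total = 0;
        for (long size : smallTableSizes) {
            total += size;
            if (total > memoryLimitBytes) {
                return false; // combining these mapjoins would overwhelm the executor
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long limit = 100L * 1024 * 1024; // assumed 100 MB hash-table budget

        // Two nested mapjoins combined into one work: 60 MB + 30 MB fits.
        System.out.println(fitsInMemory(Arrays.asList(60L << 20, 30L << 20), limit));

        // 60 MB + 50 MB exceeds the limit, so the combine should be rejected.
        System.out.println(fitsInMemory(Arrays.asList(60L << 20, 50L << 20), limit));
    }
}
```

The key change the comment describes is scoping: only the mapjoins on the big table's branch of the combined work contribute to `smallTableSizes`, rather than every transitively connected mapjoin, which previously inflated the total and rejected plans that would in fact fit.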