[ https://issues.apache.org/jira/browse/HIVE-8943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Szehon Ho updated HIVE-8943:
----------------------------
    Attachment: HIVE-8943-4.spark.patch

Fix algorithm and cleanup after discussion with Xuefu. The original code was too aggressively incorporating connected mapjoins into its size calculation; the new code only looks at the mapjoins connected to the big table.

> Fix memory limit check for combine nested mapjoins [Spark Branch]
> -----------------------------------------------------------------
>
>                 Key: HIVE-8943
>                 URL: https://issues.apache.org/jira/browse/HIVE-8943
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>    Affects Versions: spark-branch
>            Reporter: Szehon Ho
>            Assignee: Szehon Ho
>         Attachments: HIVE-8943-4.spark.patch, HIVE-8943.1-spark.patch, HIVE-8943.1-spark.patch, HIVE-8943.2-spark.patch, HIVE-8943.3-spark.patch
>
>
> It's the opposite problem of what we thought in HIVE-8701.
> SparkMapJoinOptimizer combines nested mapjoins into one work because the RS
> for the big table is removed. So we need to enhance the check to calculate
> whether all the mapjoins in that work (Spark stage) will fit into memory;
> otherwise it might overwhelm memory for that particular Spark executor.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
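The check described above can be sketched as follows. This is an illustrative simplification, not the actual Hive patch: the class and method names are hypothetical, and the real SparkMapJoinOptimizer works on operator trees rather than a plain list of sizes. The idea is only that, once nested mapjoins are combined into one work, their small-table hash tables are all loaded by the same Spark executor, so their combined size must stay under the mapjoin memory budget.

```java
import java.util.Arrays;
import java.util.List;

/**
 * Hypothetical sketch of the memory-limit check for combined mapjoins.
 * Not actual Hive code; names and structure are illustrative only.
 */
public class MapJoinMemoryCheckSketch {

    /**
     * Returns true if the small-table hash tables of all mapjoins combined
     * into one work (Spark stage) fit under the given memory limit.
     *
     * @param smallTableSizes in-memory sizes (bytes) of each mapjoin's
     *                        small table on the big table's branch
     * @param memoryLimitBytes the mapjoin memory budget for one executor
     */
    static boolean fitsInMemory(List<Long> smallTableSizes, long memoryLimitBytes) {
        long total = 0;
        for (long size : smallTableSizes) {
            total += size;
            if (total > memoryLimitBytes) {
                return false; // combining these mapjoins would overwhelm the executor
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long limit = 100L * 1024 * 1024; // assumed 100 MB hash-table budget

        // Two nested mapjoins combined into one work: 60 MB + 30 MB fits.
        System.out.println(fitsInMemory(Arrays.asList(60L << 20, 30L << 20), limit));

        // 60 MB + 50 MB exceeds the limit, so the combine should be rejected.
        System.out.println(fitsInMemory(Arrays.asList(60L << 20, 50L << 20), limit));
    }
}
```

The key change the comment describes is scoping: only the mapjoins on the big table's branch of the combined work contribute to `smallTableSizes`, rather than every transitively connected mapjoin, which previously inflated the total and rejected plans that would in fact fit.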