[jira] [Created] (HIVE-19480) Implement and Incorporate MAPREDUCE-207

BELUGA BEHR (JIRA) Wed, 09 May 2018 13:56:17 -0700

BELUGA BEHR created HIVE-19480:
----------------------------------

             Summary: Implement and Incorporate MAPREDUCE-207
                 Key: HIVE-19480
                 URL: https://issues.apache.org/jira/browse/HIVE-19480
             Project: Hive
          Issue Type: New Feature
          Components: HiveServer2
    Affects Versions: 3.0.0
            Reporter: BELUGA BEHR



 * HiveServer2 can run many MapReduce jobs in parallel.
 * Each MapReduce application calculates its job's file splits at the client 
level.
 * As a result, HiveServer2 loads many file splits at the same time, which puts 
pressure on its memory.

{quote}"The client running the job calculates the splits for the job by calling 
getSplits(), then sends them to the application master, which uses their 
storage locations to schedule map tasks that will process them on the cluster."
 - "Hadoop: The Definitive Guide"{quote}
MAPREDUCE-207 should address this memory pressure by moving split calculations 
into the ApplicationMaster. Spark and Tez already take this approach.
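
To make the memory concern concrete, here is a rough sketch of client-side split 
calculation in plain Java, with no Hadoop dependency. It is loosely modeled on 
the slicing loop in FileInputFormat (including its 1.1 "slop" factor), but the 
class SplitSketch, the Split type, and the job, file, and size figures in main() 
are all hypothetical and chosen only for illustration. It is not the HiveServer2 
code path.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: models how many split objects a single client JVM
// ends up holding when it computes splits for many jobs at once.
class SplitSketch {
    static final long MB = 1024L * 1024L;

    // FileInputFormat allows the final split to exceed the split size by
    // up to 10% rather than creating a tiny trailing split.
    static final double SPLIT_SLOP = 1.1;

    /** Stand-in for a file split: byte offset and length only. */
    static final class Split {
        final long offset;
        final long length;

        Split(long offset, long length) {
            this.offset = offset;
            this.length = length;
        }
    }

    /**
     * Slices one file into splits. Simplified: ignores block locations,
     * unsplittable codecs, and the empty split Hadoop emits for
     * zero-length files.
     */
    static List<Split> computeSplits(long fileLength, long splitSize) {
        List<Split> splits = new ArrayList<>();
        long remaining = fileLength;
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            splits.add(new Split(fileLength - remaining, splitSize));
            remaining -= splitSize;
        }
        if (remaining > 0) {
            splits.add(new Split(fileLength - remaining, remaining));
        }
        return splits;
    }

    public static void main(String[] args) {
        // All figures below are assumptions for illustration.
        int concurrentJobs = 50;
        int filesPerJob = 1000;
        long fileLength = 1024 * MB;
        long splitSize = 128 * MB;

        long splitsPerFile = computeSplits(fileLength, splitSize).size();
        long heldAtOnce = (long) concurrentJobs * filesPerJob * splitsPerFile;

        System.out.println("splits per file: " + splitsPerFile);
        System.out.println("split objects held by the client at once: " + heldAtOnce);
    }
}
```

In this model every split object lives in the one process that submits the 
jobs. If the calculation ran in each job's ApplicationMaster instead, the same 
objects would be spread across one process per job.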

Once MAPREDUCE-207 is completed, leverage the capability in HiveServer2.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
