[ https://issues.apache.org/jira/browse/HIVE-19480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gopal V updated HIVE-19480:
---------------------------
Affects Version/s: 1.2.3 (was: 3.0.0)
> Implement and Incorporate MAPREDUCE-207
> ---------------------------------------
>
> Key: HIVE-19480
> URL: https://issues.apache.org/jira/browse/HIVE-19480
> Project: Hive
> Issue Type: New Feature
> Components: HiveServer2
> Affects Versions: 1.2.3
> Reporter: BELUGA BEHR
> Priority: Major
>
> * HiveServer2 has the ability to run many MapReduce jobs in parallel.
> * Each MapReduce application calculates the job's file splits at the client
> level.
> * Result: HiveServer2 loads many file splits at the same time, putting pressure
> on memory.
> {quote}"The client running the job calculates the splits for the job by
> calling getSplits(), then sends them to the application master, which uses
> their storage locations to schedule map tasks that will process them on the
> cluster."
> - "Hadoop: The Definitive Guide"{quote}
> MAPREDUCE-207 should address this memory pressure by moving split
> calculations into the ApplicationMaster. Spark and Tez already take this approach.
> Once MAPREDUCE-207 is completed, leverage the capability in HiveServer2.
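
To illustrate the memory pressure described above, here is a minimal, self-contained sketch of client-side split calculation. It does not use the Hadoop API: the class SplitSketch, its nested Split type, and the computeSplits method are hypothetical names introduced only for this example, and the fixed-size chunking is a simplification rather than the logic of any real InputFormat. The file sizes and job count in main are likewise assumed values.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Hive or Hadoop.
class SplitSketch {

    // A byte range of one input file (simplified stand-in for a file split).
    static final class Split {
        final long offset;
        final long length;

        Split(long offset, long length) {
            this.offset = offset;
            this.length = length;
        }
    }

    // Cuts a file of fileLength bytes into consecutive chunks of at most
    // splitSize bytes. The last chunk holds whatever remains.
    static List<Split> computeSplits(long fileLength, long splitSize) {
        if (splitSize <= 0) {
            throw new IllegalArgumentException("splitSize must be positive");
        }
        List<Split> splits = new ArrayList<>();
        long offset = 0;
        while (offset < fileLength) {
            long length = Math.min(splitSize, fileLength - offset);
            splits.add(new Split(offset, length));
            offset += length;
        }
        return splits;
    }

    public static void main(String[] args) {
        long fileLength = 1L << 30;   // assumed: 1 GiB input per job
        long splitSize = 128L << 20;  // assumed: 128 MiB per split
        int concurrentJobs = 3;       // assumed number of parallel jobs

        // When every job computes its splits in the same client process,
        // all of the resulting split lists are resident there at once.
        List<List<Split>> heldByClient = new ArrayList<>();
        for (int job = 0; job < concurrentJobs; job++) {
            heldByClient.add(computeSplits(fileLength, splitSize));
        }

        int total = 0;
        for (List<Split> jobSplits : heldByClient) {
            total += jobSplits.size();
        }
        System.out.println("splits held by client: " + total);
    }
}
```

In this sketch the number of split objects in the client grows with both the input size and the number of concurrent jobs. Moving the computeSplits step into a per-job process, as the description proposes for the ApplicationMaster, would keep each job's list out of the shared client.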