[ https://issues.apache.org/jira/browse/MAPREDUCE-7173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mikayla Konst updated MAPREDUCE-7173:
-------------------------------------
    Attachment: MAPREDUCE-7173.patch
        Status: Patch Available  (was: Open)

> Add ability to shuffle intermediate map task output to a distributed 
> filesystem
> -------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-7173
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7173
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2
>    Affects Versions: 2.9.2
>            Reporter: Mikayla Konst
>            Priority: Major
>         Attachments: MAPREDUCE-7173.patch
>
>
> If nodes are lost during the course of a MapReduce job, the map tasks that 
> ran on those nodes must be re-run, because their intermediate output is no 
> longer available for the shuffle. Writing intermediate map task output to a 
> distributed filesystem removes the need to re-run completed map tasks in 
> environments where nodes are frequently lost, for example in clusters that 
> make heavy use of Google's Preemptible VMs or AWS's Spot Instances.
>
> *Example Usage:*
> *Job-scoped properties:*
> 1. Do not re-run an already-finished map task when the node it ran on 
> becomes unusable:
> mapreduce.map.rerun-if-node-unusable=false (see MAPREDUCE-7168)
> 2. On the map side, use a new implementation of MapOutputFile that provides 
> paths relative to the staging dir for the job (which is cleaned up when the 
> job is done):
> mapreduce.task.general.output.class=org.apache.hadoop.mapred.HCFSOutputFiles
> 3. On the reduce side, use a new implementation of ShuffleConsumerPlugin that 
> fetches map task output directly from a distributed filesystem:
> mapreduce.job.reduce.shuffle.consumer.plugin.class=org.apache.hadoop.mapreduce.task.reduce.HCFSShuffle
> 4. (Optional) Set the buffer size for the output stream used when writing 
> map task output (see the configuration sketch after this list):
> mapreduce.map.shuffle.output.buffer.size=8192
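> A minimal sketch of how these job-scoped properties might be supplied 
> together as a job configuration file (for example, passed to the job with 
> the generic -conf option). The keys and class names are taken from the list 
> above and the attached patch; they are not part of a released Hadoop version:
> {code:xml}
> <?xml version="1.0"?>
> <configuration>
>   <!-- 1. Do not re-run finished map tasks when their node becomes unusable -->
>   <property>
>     <name>mapreduce.map.rerun-if-node-unusable</name>
>     <value>false</value>
>   </property>
>   <!-- 2. Write map output through the HCFS-backed MapOutputFile implementation -->
>   <property>
>     <name>mapreduce.task.general.output.class</name>
>     <value>org.apache.hadoop.mapred.HCFSOutputFiles</value>
>   </property>
>   <!-- 3. Have reducers fetch map output directly from the distributed filesystem -->
>   <property>
>     <name>mapreduce.job.reduce.shuffle.consumer.plugin.class</name>
>     <value>org.apache.hadoop.mapreduce.task.reduce.HCFSShuffle</value>
>   </property>
>   <!-- 4. Optional: buffer size for the output stream used when writing map output -->
>   <property>
>     <name>mapreduce.map.shuffle.output.buffer.size</name>
>     <value>8192</value>
>   </property>
> </configuration>
> {code}
>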
> *Cluster-scoped properties* (see YARN-9106):
> 1. When gracefully decommissioning a node, wait only for the containers on 
> that node to finish, not for the applications associated with those 
> containers (there is no need to wait for the applications to finish, since 
> the node is not serving shuffle data):
> yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-applications=false
> 2. When gracefully decommissioning a node, do not wait for app masters 
> running on the node to finish, so that the node can be decommissioned as 
> soon as possible (failover to an app master on another node that is not 
> being decommissioned is relatively quick; see the yarn-site.xml sketch 
> after this list):
> yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-app-masters=false
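> A minimal sketch of how these cluster-scoped properties might look in 
> yarn-site.xml on the ResourceManager. The keys are taken from the list 
> above and are introduced by YARN-9106; they are not part of a released 
> Hadoop version:
> {code:xml}
> <?xml version="1.0"?>
> <configuration>
>   <!-- 1. Wait only for containers, not whole applications, before decommissioning -->
>   <property>
>     <name>yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-applications</name>
>     <value>false</value>
>   </property>
>   <!-- 2. Do not wait for app masters on the node; rely on app master failover -->
>   <property>
>     <name>yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-app-masters</name>
>     <value>false</value>
>   </property>
> </configuration>
> {code}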



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
