Mikayla Konst created MAPREDUCE-7173:
----------------------------------------

             Summary: Add ability to shuffle intermediate map task output to a 
distributed filesystem
                 Key: MAPREDUCE-7173
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7173
             Project: Hadoop Map/Reduce
          Issue Type: New Feature
          Components: mrv2
    Affects Versions: 2.9.2
            Reporter: Mikayla Konst


If nodes are lost during the course of a MapReduce job, the map tasks that ran on 
those nodes must be re-run, because their intermediate output is stored on the 
nodes' local disks and is lost with them. Writing intermediate map task output to a 
distributed filesystem instead eliminates this problem in environments in which 
nodes are frequently lost, for example, clusters that make heavy use of Google's 
Preemptible VMs or AWS's Spot Instances.

*Example Usage:*

*Job-scoped properties* (a combined configuration sketch follows this list):

1. Do not re-run an already-finished map task when the node it ran on is found to 
be unusable:

mapreduce.map.rerun-if-node-unusable=false (see MAPREDUCE-7168)

2. On the map side, use a new implementation of MapOutputFile that provides paths 
relative to the job's staging directory (which is cleaned up when the job 
finishes):

mapreduce.task.general.output.class=org.apache.hadoop.mapred.HCFSOutputFiles

3. On the reduce side, use a new implementation of ShuffleConsumerPlugin that 
fetches map task output directly from a distributed filesystem:

mapreduce.job.reduce.shuffle.consumer.plugin.class=org.apache.hadoop.mapreduce.task.reduce.HCFSShuffle

4. (Optional) Set the buffer size for the output stream used when writing map 
task output:

mapreduce.map.shuffle.output.buffer.size=8192
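
Taken together, these job-scoped properties could be supplied per job, for example 
through a configuration override file passed with the generic -conf option to a 
Tool-based driver. The sketch below is illustrative only: the property names and 
plugin classes are the ones proposed in this issue and in MAPREDUCE-7168, and are 
not part of any released Hadoop version.

{code:xml}
<!-- hcfs-shuffle-job.xml: per-job overrides enabling shuffle via a distributed filesystem. -->
<!-- Property names and classes are those proposed in this issue and in MAPREDUCE-7168. -->
<configuration>
  <property>
    <name>mapreduce.map.rerun-if-node-unusable</name>
    <value>false</value>
  </property>
  <property>
    <name>mapreduce.task.general.output.class</name>
    <value>org.apache.hadoop.mapred.HCFSOutputFiles</value>
  </property>
  <property>
    <name>mapreduce.job.reduce.shuffle.consumer.plugin.class</name>
    <value>org.apache.hadoop.mapreduce.task.reduce.HCFSShuffle</value>
  </property>
  <property>
    <name>mapreduce.map.shuffle.output.buffer.size</name>
    <value>8192</value>
  </property>
</configuration>
{code}

A Tool-based job could then be launched as, e.g., hadoop jar <job.jar> <DriverClass> 
-conf hcfs-shuffle-job.xml <input> <output>, leaving cluster-wide defaults untouched.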

*Cluster-scoped properties* (see YARN-9106; a yarn-site.xml sketch follows this list):

1. When gracefully decommissioning a node, only wait for the containers on that 
node to finish, not the applications associated with those containers (there is no 
need to wait for the applications to finish, since the node is not serving shuffle 
data):

yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-applications=false

2. When gracefully decommissioning a node, do not wait for application masters 
running on that node to finish, so that the node can be decommissioned as soon as 
possible (failing over to an application master on a node that is not being 
decommissioned is quick):

yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-app-masters=false
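
These cluster-scoped properties are ResourceManager settings, so they would be 
configured in yarn-site.xml on the ResourceManager host rather than per job. A 
minimal sketch, assuming the property names proposed in YARN-9106:

{code:xml}
<!-- yarn-site.xml (ResourceManager): graceful-decommissioning behaviour proposed in YARN-9106. -->
<!-- With map output shuffled to a distributed filesystem, a node can be released as soon as
     its containers finish, without waiting for the owning applications or app masters. -->
<configuration>
  <property>
    <name>yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-applications</name>
    <value>false</value>
  </property>
  <property>
    <name>yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-app-masters</name>
    <value>false</value>
  </property>
</configuration>
{code}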


