[jira] [Updated] (MAPREDUCE-7168) Add option to not kill already-done map tasks when node becomes unusable

2018-11-30 Thread Mikayla Konst (JIRA)


     [ https://issues.apache.org/jira/browse/MAPREDUCE-7168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mikayla Konst updated MAPREDUCE-7168:
-------------------------------------
Attachment: MAPREDUCE-7168.patch

> Add option to not kill already-done map tasks when node becomes unusable
> ------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-7168
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7168
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2
>    Affects Versions: 2.9.2
>         Environment: Google Compute Engine (Dataproc), Java 8
>            Reporter: Mikayla Konst
>            Priority: Minor
>         Attachments: MAPREDUCE-7168.patch
>
>
> When a node becomes unusable, if there are still reduce tasks running, all 
> completed map tasks that were run on that node are killed so that they can be 
> re-run on a different node. This is because the node can no longer serve 
> shuffle data, so the map task output cannot be fetched by the reducers.
> If map tasks do not write their shuffle data locally, killing already-done 
> map tasks will make the job lose map progress unnecessarily. This change 
> prevents map progress from being lost when shuffle data is not written 
> locally by providing a property mapreduce.map.rerun-if-node-unusable that can 
> be set to false to prevent killing already-done map tasks.
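
The decision described above can be sketched in plain Java. This is only an illustration, not the code from the attached patch: java.util.Properties stands in for the Hadoop job configuration, the class and method names are hypothetical, and the default of true is an assumption inferred from the description (setting the property to false is what turns the kill off).

```java
import java.util.Properties;

public class RerunPolicySketch {
    // Property name taken from the issue description.
    static final String RERUN_IF_NODE_UNUSABLE = "mapreduce.map.rerun-if-node-unusable";

    /**
     * Decides whether completed map tasks on an unusable node should be re-run.
     * Assumption: the property defaults to true, which keeps the existing behavior.
     */
    static boolean shouldRerunCompletedMaps(Properties conf, boolean reducesStillRunning) {
        boolean rerun = Boolean.parseBoolean(conf.getProperty(RERUN_IF_NODE_UNUSABLE, "true"));
        // Completed map output only matters while reducers may still need to fetch it.
        return reducesStillRunning && rerun;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // Default: completed maps on the lost node are re-run while reducers are active.
        System.out.println(shouldRerunCompletedMaps(conf, true));

        // Shuffle data is not written locally, so keep the completed maps.
        conf.setProperty(RERUN_IF_NODE_UNUSABLE, "false");
        System.out.println(shouldRerunCompletedMaps(conf, true));
    }
}
```

In a real job the property would presumably be set through the job configuration or on the command line with -D, like other mapreduce.* properties.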



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-7168) Add option to not kill already-done map tasks when node becomes unusable

2018-11-30 Thread Mikayla Konst (JIRA)


     [ https://issues.apache.org/jira/browse/MAPREDUCE-7168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mikayla Konst updated MAPREDUCE-7168:
-------------------------------------
Attachment: (was: MAPREDUCE-7168.patch)




[jira] [Updated] (MAPREDUCE-7168) Add option to not kill already-done map tasks when node becomes unusable

2018-11-30 Thread Mikayla Konst (JIRA)


     [ https://issues.apache.org/jira/browse/MAPREDUCE-7168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mikayla Konst updated MAPREDUCE-7168:
-------------------------------------
Attachment: MAPREDUCE-7168.patch
Status: Patch Available  (was: Open)
