[jira] [Commented] (MAPREDUCE-1060) JT should kill running maps when all the reducers have completed

Jonathan Eagles (JIRA) Wed, 27 Jul 2011 14:55:38 -0700

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13072045#comment-13072045 ]


Jonathan Eagles commented on MAPREDUCE-1060:
--------------------------------------------

Setup 1 - Run a modified word count program with failure induction built into 
the JobTracker. Restart the cluster every time, because the induced failure 
only triggers for the first MapReduce job run.

// See the failure
1. Apply MAPREDUCE-1060-branch-0.20-security.manualtest.patch 
(build/install/start cluster).
2. Run:
   hadoop jar hadoop-examples-0.20.206.0-SNAPSHOT.jar wordcount -Dmapred.reduce.tasks=1 /pride.txt /wc4 180000
   Here pride.txt is the text of Pride and Prejudice from Project Gutenberg 
   (http://www.gutenberg.org/ebooks/1342.txt.utf8). The last parameter 
   instructs the restarted map task to sleep for 3 minutes so that it keeps 
   running long after the reduces finish.
3. Verify that the restarted map task keeps running long after the reduces 
have finished and that the original map task is marked as killed.

Setup 2 - Rerun the failure setup above with the supplied fix to show that the 
patch fixes the issue.
1. Apply the fix for the issue, MAPREDUCE-1060-branch-0.20-security.patch 
(build/install/restart cluster).
2. Run:
   hadoop jar hadoop-examples-0.20.206.0-SNAPSHOT.jar wordcount -Dmapred.reduce.tasks=1 /pride.txt /wc4 180000
   Here pride.txt is the text of Pride and Prejudice from Project Gutenberg 
   (http://www.gutenberg.org/ebooks/1342.txt.utf8). The last parameter 
   instructs the restarted map task to sleep for 3 minutes so that it keeps 
   running long after the reduces finish.
3. Verify that the restarted map task is killed shortly after the reduces have 
finished and that the original map task is marked as succeeded.
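
Both setups run the same wordcount command, so it may help to keep it in a 
small helper. The following is only a sketch: the function and variable names 
are made up for illustration, it prints the command instead of running it, and 
it assumes the trailing argument is a sleep time in milliseconds (inferred from 
180000 corresponding to the 3 minutes mentioned above). That argument is only 
understood by the patched wordcount example, not by stock Hadoop.

```shell
#!/bin/sh
# Illustrative helper (names are hypothetical, not part of any patch).
# Prints the wordcount invocation used in both setups so it can be
# reviewed or copied; it does not start a job itself.

# Jar name as given in the test steps.
EXAMPLES_JAR="hadoop-examples-0.20.206.0-SNAPSHOT.jar"

# 3 minutes expressed in milliseconds (assumed unit of the last argument).
SLEEP_MS=$((3 * 60 * 1000))

wordcount_cmd() {
    # $1 = HDFS input file, $2 = HDFS output directory,
    # $3 = sleep time for the restarted map task (patched example only)
    printf 'hadoop jar %s wordcount -Dmapred.reduce.tasks=1 %s %s %s\n' \
        "$EXAMPLES_JAR" "$1" "$2" "$3"
}

# Assumes pride.txt was already copied into HDFS, for example with:
#   hadoop fs -put pride.txt /pride.txt
wordcount_cmd /pride.txt /wc4 "$SLEEP_MS"
# prints: hadoop jar hadoop-examples-0.20.206.0-SNAPSHOT.jar wordcount -Dmapred.reduce.tasks=1 /pride.txt /wc4 180000
```

Because the induced failure only triggers for the first job after a cluster 
start, the printed command should be run once per restart, and the output 
directory (/wc4 here) must not already exist when the job is submitted.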




> JT should kill running maps when all the reducers have completed
> ----------------------------------------------------------------
>
>                 Key: MAPREDUCE-1060
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1060
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>            Reporter: Jothi Padmanabhan
>            Assignee: Jonathan Eagles
>             Fix For: 0.20.205.0
>
>         Attachments: MAPREDUCE-1060-branch-0.20-security.manualtest.patch, 
> MAPREDUCE-1060-branch-0.20-security.patch
>
>
> We have seen some situations where maps are still running when all the 
> reducers have completed. This could happen because of lost TT's, interplay of 
> speculative tasks with bad TT's etc. If the maps take a long time to run, it 
> unnecessarily delays the job completion time, as this map output is not 
> required anyways. The JT should possibly kill running maps when all the 
> reducers have completed.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
