[ https://issues.apache.org/jira/browse/HADOOP-2327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12547625 ]

Owen O'Malley commented on HADOOP-2327:
---------------------------------------

I think this should be more general than streaming and be in the general 
framework. I'd propose a solution that allows the job to control which maps and 
reduces are run by number using two configuration parameters. 

-Dmapred.map.only-run=1,20-100,103
-Dmapred.reduce.only-run=4

would run maps 1, 20, 21, ..., 100, and 103, and reduce 4.
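
To make the intended syntax concrete, here is a rough sketch of how such a 
value could be expanded into task numbers. It is only an illustration: the 
two parameters are a proposal, not existing options, and parse_task_spec is 
a hypothetical helper, not part of Hadoop. It assumes comma-separated items 
that are either a single number or an inclusive low-high range.

```python
def parse_task_spec(spec):
    """Expand a spec such as "1,20-100,103" into a sorted list of task numbers.

    Hypothetical helper for illustration only; not a Hadoop API.
    """
    tasks = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate empty items, e.g. a trailing comma
        if "-" in part:
            # "low-high" is treated as an inclusive range (assumption)
            low_text, high_text = part.split("-", 1)
            low, high = int(low_text), int(high_text)
            if low > high:
                raise ValueError("bad range: " + part)
            tasks.update(range(low, high + 1))
        else:
            tasks.add(int(part))
    return sorted(tasks)


map_tasks = parse_task_spec("1,20-100,103")
reduce_tasks = parse_task_spec("4")
```

With the example values above, map_tasks holds 1, 20 through 100, and 103, 
and reduce_tasks holds only 4.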

Would that meet your requirements, Arkady?


> Streaming: need to be able to re-run specific map tasks (when -reducer NONE)
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-2327
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2327
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: contrib/streaming
>            Reporter: arkady borkovsky
>
> Sometimes a few map tasks fail when the job is run with -reducer NONE.
> It should be possible to rerun just the failed map tasks.
> There are several failure modes:
>    * a task hangs, so the job is killed
>    * from the infrastructure perspective, the task completed successfully,
> but it failed to produce the correct result
>    * the task failed in the proper Hadoop sense
> It is often too expensive to rerun the whole job.  And for larger jobs, 
> chances are each run will have a few failed tasks.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
