[jira] [Updated] (MAPREDUCE-4030) If the nodemanager on which the maptask is executed is going down before the mapoutput is consumed by the reducer,then the job is failing with shuffle error

Robert Joseph Evans (Updated) (JIRA) Wed, 18 Apr 2012 13:27:08 -0700

     [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Robert Joseph Evans updated MAPREDUCE-4030:
-------------------------------------------

    Target Version/s: 0.23.3, 2.0.0, 3.0.0  (was: 0.23.2)
    
> If the nodemanager on which the maptask is executed is going down before the 
> mapoutput is consumed by the reducer,then the job is failing with shuffle 
> error
> ------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4030
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4030
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>            Reporter: Nishan Shetty
>            Assignee: Devaraj K
>
> My cluster has 2 NMs.
> The value of "mapreduce.job.reduce.slowstart.completedmaps" is set to 1.
> While the job was running and the mappers were about 99% complete, one of 
> the NMs went down.
> The job failed with the following trace:
> "Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#1
>     at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:123)
>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:371)
>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:148)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1177)
>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:143)
> Caused by: java.io.IOException: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
>     at org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler.checkReducerHealth(ShuffleScheduler.java:253)
>     at org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler.copyFailed(ShuffleScheduler.java:187)
>     at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:240)
>     at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:152)"
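
The trace shows the reducer giving up in checkReducerHealth once too many
distinct map outputs could not be fetched. The sketch below is a simplified
illustration of that kind of bail-out rule, not Hadoop's ShuffleScheduler
code. The class name FetchFailureTracker, its methods, the map ids, and the
"reaches the limit" comparison are all assumptions made for the example.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical, simplified model of a "too many unique failed fetches" rule.
// It is NOT Hadoop's implementation; names and threshold semantics are assumed.
class FetchFailureTracker {
    // Assumed limit on distinct map outputs that may fail before giving up.
    private final int maxFailedUniqueFetches;
    // Ids of map outputs for which at least one fetch has failed.
    private final Set<String> failedMaps = new HashSet<>();

    FetchFailureTracker(int maxFailedUniqueFetches) {
        this.maxFailedUniqueFetches = maxFailedUniqueFetches;
    }

    // Records a failed fetch and reports whether the reducer should bail out.
    // Repeated failures for the same map output are counted only once.
    boolean copyFailed(String mapId) {
        failedMaps.add(mapId);
        return failedMaps.size() >= maxFailedUniqueFetches;
    }

    // Number of distinct map outputs with at least one failed fetch.
    int uniqueFailures() {
        return failedMaps.size();
    }

    public static void main(String[] args) {
        FetchFailureTracker tracker = new FetchFailureTracker(3);
        System.out.println(tracker.copyFailed("map_0")); // false
        System.out.println(tracker.copyFailed("map_0")); // false, same map again
        System.out.println(tracker.copyFailed("map_1")); // false
        System.out.println(tracker.copyFailed("map_2")); // true, limit reached
    }
}
```

Under a rule of this shape, losing a node that holds several not-yet-fetched
map outputs can push the reducer past the limit before those maps are re-run
elsewhere, which matches the symptom reported here.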

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
