[ 
https://issues.apache.org/jira/browse/MAPREDUCE-969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756351#action_12756351
 ] 

Jothi Padmanabhan commented on MAPREDUCE-969:
---------------------------------------------

bq. I'll apply the second patch from HADOOP-4744 on this cluster and report back 
whether it solves the problem

Thanks. Also, if we are able to reproduce the TT returning a port of -1 more 
often, I think we should get the Jetty folks involved again. The patch in 
HADOOP-4744 is more of a workaround than a solution.

> NullPointerException during reduce freezes job
> ----------------------------------------------
>
>                 Key: MAPREDUCE-969
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-969
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobtracker, task, tasktracker
>    Affects Versions: 0.20.2
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: bad_job_events, bad_job_jt_logs, reduce_task_logs
>
>
> We experienced several jobs stuck in Reduce on a cluster. All of the stuck 
> reduce tasks were stuck at "Need another 2 map output(s) where 
> 0 is already in progress" despite all of the mappers having completed, and 0 
> scheduled. The stuck reducers had experienced the following exception early 
> in the shuffle:
> java.lang.NullPointerException
>       at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)
>       at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.getMapCompletionEvents(ReduceTask.java:2747)
>       at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.run(ReduceTask.java:2670)
> Will attach more information and logs momentarily.
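
For readers unfamiliar with the top frame of the trace: ConcurrentHashMap does not permit null keys, so get(null) throws a NullPointerException instead of returning null. The sketch below only demonstrates that JDK behavior. The class name, map contents, and host name are hypothetical, and nothing in this thread confirms which key was null in ReduceTask or whether the -1 port is what produced it.

```java
import java.util.concurrent.ConcurrentHashMap;

public class NullKeyLookupDemo {
    // Hypothetical stand-in for a map keyed by host name; not the actual
    // data structure used in ReduceTask.
    static final ConcurrentHashMap<String, String> LOCATIONS_BY_HOST =
            new ConcurrentHashMap<>();

    // Returns true if looking up the given key throws NullPointerException.
    static boolean lookupThrowsNpe(String host) {
        try {
            LOCATIONS_BY_HOST.get(host);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        LOCATIONS_BY_HOST.put("tt1.example.com", "some-map-output");
        // A present or absent non-null key is fine; only a null key throws.
        System.out.println("non-null key throws: " + lookupThrowsNpe("tt1.example.com"));
        System.out.println("null key throws: " + lookupThrowsNpe(null));
    }
}
```

If a null key turns out to be the trigger, a null check before the lookup would avoid the exception, but as the comment above notes, that would be a workaround and would not explain where the null came from.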

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
