[
https://issues.apache.org/jira/browse/MAPREDUCE-969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753968#action_12753968
]
Todd Lipcon commented on MAPREDUCE-969:
---------------------------------------
Yeah, I looked at HADOOP-4744 as well as a couple of other JIRAs but wasn't able to
figure it out. If it keeps popping up, we will instrument the code with some
debug logging and see if we can track it down.
> NullPointerException during reduce freezes job
> ----------------------------------------------
>
> Key: MAPREDUCE-969
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-969
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: jobtracker, task, tasktracker
> Affects Versions: 0.20.2
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
> Attachments: bad_job_events, bad_job_jt_logs, reduce_task_logs
>
>
> We experienced several jobs stuck in Reduce on a cluster. All of the stuck
> reduce tasks were stuck at "Need another 2 map output(s) where
> 0 is already in progress" despite all of the mappers having completed, and 0
> scheduled. The stuck reducers had experienced the following exception early
> in the shuffle:
> java.lang.NullPointerException
>     at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)
>     at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.getMapCompletionEvents(ReduceTask.java:2747)
>     at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.run(ReduceTask.java:2670)
> Will attach more information and logs momentarily.
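For context on the top frame of the trace: ConcurrentHashMap does not permit null keys, and get(null) throws a NullPointerException instead of returning null. The sketch below only demonstrates that JDK behavior. The class, method, map, and key names are hypothetical and are not taken from ReduceTask, and the report does not establish which value was null at ReduceTask.java:2747.

```java
import java.util.concurrent.ConcurrentHashMap;

public class NullKeyLookup {
    // Hypothetical helper: returns true if looking up `key` in a
    // ConcurrentHashMap throws NullPointerException, false otherwise.
    static boolean lookupThrowsNpe(String key) {
        // Illustrative map only; this is not the map used by ReduceTask.
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        counts.put("host-a", 1);
        try {
            counts.get(key); // ConcurrentHashMap rejects null keys here
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(lookupThrowsNpe("host-a")); // prints "false"
        System.out.println(lookupThrowsNpe(null));     // prints "true"
    }
}
```

If the failure is of this kind, debug logging placed just before the lookup could record the key being passed, which would show whether a null key is the trigger.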
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.