[ https://issues.apache.org/jira/browse/HADOOP-5367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689809#action_12689809 ]
schubert zhang commented on HADOOP-5367:
----------------------------------------
I am using branch-0.19 and it seems fine.
> After some jobs have finished, Reducer will run new job's reduce tasks
> sequentially and not in parallel (mapred.JobTracker: Serious problem. While
> updating status, cannot find taskid...)
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-5367
> URL: https://issues.apache.org/jira/browse/HADOOP-5367
> Project: Hadoop Core
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.19.1
> Environment: State: RUNNING
> Started: Fri Feb 27 17:00:07 CET 2009
> Version: 0.19.1, r745977
> Compiled: Fri Feb 20 00:16:34 UTC 2009 by ndaley
> Reporter: Thibaut
> Priority: Critical
>
> Hi,
> After a while, my cluster will only run the reduce tasks sequentially (each
> reducer running on the same node), the other nodes stay empty. The map phase
> however will run the jobs on all the nodes, also after such a "long" reduce
> phase has completed. But the reduce phase will then be again executed
> sequentially. This happens in my cluster after about 160 successfully
> completed jobs. (Some jobs have the number of reducers set to 0!)
> As a workaround, I have to restart the mapreduce service.
> I didn't notice this behaviour in version 0.19.0. I can't use version 0.19.0
> because of the multipleoutput bug when setting reducers to 0.
> Another side note which might be related. I also tried running the jobs with
> speculative execution set to on. My cluster would always hold back one
> reducer and only run it (in multiple instances) after the first of the other
> 6 reducers had finished, instead of launching all of them at the same time.
> Below is a short extract from the related logfile. It's full of these kinds of
> entries.
> 09/02/28 12:48:07 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0051_r_000006_1
> 09/02/28 12:48:08 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0041_r_000002_1
> 09/02/28 12:48:08 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0083_r_000006_1
> 09/02/28 12:48:08 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0041_r_000005_1
> 09/02/28 12:48:10 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0105_r_000006_1
> 09/02/28 12:48:10 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0102_r_000006_1
> 09/02/28 12:48:12 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0051_r_000006_1
> 09/02/28 12:48:13 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0041_r_000002_1
> 09/02/28 12:48:13 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0083_r_000006_1
> 09/02/28 12:48:13 INFO mapred.JobTracker: Serious problem. While updating status, cannot find taskid attempt_200902271700_0041_r_000005_1
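
Since the report says the logfile is full of such entries, it may help to tally them per task attempt before digging further. The following is only a rough sketch: the function name `summarize` and the sample text are made up for illustration, and the regular expression assumes the attempt ID layout visible in the quoted lines (a timestamp-like identifier, a job number, `m` or `r`, a task number, and an attempt number). It tolerates both wrapped log entries and `>` quote prefixes.

```python
import re
from collections import Counter

# Matches the JobTracker message quoted in the report. The field layout of the
# attempt ID is an assumption read off the quoted log lines.
ATTEMPT_RE = re.compile(
    r"cannot find taskid attempt_(\d+)_(\d+)_([mr])_(\d+)_(\d+)"
)

def summarize(log_text):
    """Count 'cannot find taskid' messages per (job, type, task, attempt)."""
    # Entries may be wrapped across lines and carry "> " quote prefixes;
    # strip the prefixes and join everything into one string before matching.
    lines = [line.lstrip("> ").strip() for line in log_text.splitlines()]
    flat = " ".join(lines)
    counts = Counter()
    for _tracker, job, kind, task, attempt in ATTEMPT_RE.findall(flat):
        counts[(job, kind, int(task), int(attempt))] += 1
    return counts

# A few entries copied from the extract in the report.
sample = """\
> 09/02/28 12:48:07 INFO mapred.JobTracker: Serious problem. While updating
> status, cannot find taskid attempt_200902271700_0051_r_000006_1
> 09/02/28 12:48:08 INFO mapred.JobTracker: Serious problem. While updating
> status, cannot find taskid attempt_200902271700_0041_r_000002_1
> 09/02/28 12:48:12 INFO mapred.JobTracker: Serious problem. While updating
> status, cannot find taskid attempt_200902271700_0051_r_000006_1
"""

if __name__ == "__main__":
    for key, n in sorted(summarize(sample).items()):
        print(key, n)
```

Run over the full extract, a tally like this makes it easy to see that every quoted entry refers to a reduce attempt (`_r_`) with attempt number 1, and that the same few attempts repeat every few seconds.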