[ 
https://issues.apache.org/jira/browse/HADOOP-5367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689809#action_12689809
 ] 

schubert zhang commented on HADOOP-5367:
----------------------------------------

I am using branch-0.19, and it seems fine.

> After some jobs have finished, Reducer will run new job's reduce tasks 
> sequentially and not in parallel (mapred.JobTracker: Serious problem.  While 
> updating status, cannot find taskid...)
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5367
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5367
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.19.1
>         Environment: State: RUNNING
> Started: Fri Feb 27 17:00:07 CET 2009
> Version: 0.19.1, r745977
> Compiled: Fri Feb 20 00:16:34 UTC 2009 by ndaley
>            Reporter: Thibaut
>            Priority: Critical
>
> Hi,
> After a while, my cluster only runs the reduce tasks sequentially (each 
> reducer running on the same node), while the other nodes stay idle. The map 
> phase, however, still runs on all the nodes, even after such a "long" reduce 
> phase has completed. But the next reduce phase is again executed 
> sequentially. This happens in my cluster after about 160 successfully 
> completed jobs. (Some jobs have the number of reducers set to 0!) 
> As a workaround, I have to restart the mapreduce service.
> I didn't notice this behaviour in version 0.19.0. I can't use version 0.19.0 
> because of the multipleoutput bug when setting reducers to 0.
> Another side note, which might be related: I also tried running the jobs 
> with speculative execution turned on. My cluster would always hold back one 
> reducer and only run it (in multiple instances) after the first of the other 
> 6 reducers had finished, instead of launching all of them at the same time.
> Below is a short extract from the related logfile. It's full of these kinds 
> of entries.
> 09/02/28 12:48:07 INFO mapred.JobTracker: Serious problem.  While updating 
> status, cannot find taskid attempt_200902271700_0051_r_000006_1
> 09/02/28 12:48:08 INFO mapred.JobTracker: Serious problem.  While updating 
> status, cannot find taskid attempt_200902271700_0041_r_000002_1
> 09/02/28 12:48:08 INFO mapred.JobTracker: Serious problem.  While updating 
> status, cannot find taskid attempt_200902271700_0083_r_000006_1
> 09/02/28 12:48:08 INFO mapred.JobTracker: Serious problem.  While updating 
> status, cannot find taskid attempt_200902271700_0041_r_000005_1
> 09/02/28 12:48:10 INFO mapred.JobTracker: Serious problem.  While updating 
> status, cannot find taskid attempt_200902271700_0105_r_000006_1
> 09/02/28 12:48:10 INFO mapred.JobTracker: Serious problem.  While updating 
> status, cannot find taskid attempt_200902271700_0102_r_000006_1
> 09/02/28 12:48:12 INFO mapred.JobTracker: Serious problem.  While updating 
> status, cannot find taskid attempt_200902271700_0051_r_000006_1
> 09/02/28 12:48:13 INFO mapred.JobTracker: Serious problem.  While updating 
> status, cannot find taskid attempt_200902271700_0041_r_000002_1
> 09/02/28 12:48:13 INFO mapred.JobTracker: Serious problem.  While updating 
> status, cannot find taskid attempt_200902271700_0083_r_000006_1
> 09/02/28 12:48:13 INFO mapred.JobTracker: Serious problem.  While updating 
> status, cannot find taskid attempt_200902271700_0041_r_000005_1

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
