[ https://issues.apache.org/jira/browse/HADOOP-4445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12644645#action_12644645 ]

Vivek Ratan commented on HADOOP-4445:
-------------------------------------

Hemanth and I looked at what's going on here. Essentially, there are two 
sources of truth regarding the number of running tasks in the system. Each 
JobInProgress object maintains counts of running map and reduce tasks. These 
counts are incremented when a task is assigned to a TT (in obtainNewMapTask() 
or obtainNewReduceTask()), and they are the counts used by the 
CapacityScheduler. The cluster summary, represented by the ClusterStatus 
object, also contains counts of the total number of running map and reduce 
tasks. These are incremented by the JT from the TT status reports. The counts 
maintained by the JobInProgress objects and the ClusterStatus object are off 
by a heartbeat: the former increments its counts when a task is assigned, but 
the task's running status reaches the JT only in the TT's next heartbeat. 
During startup, a lot of TTs approach the JT for tasks to run. As a result, 
the counts of running tasks across all JobInProgress objects are much higher 
than the cluster count, since the cluster count is updated only when the TTs 
report their status in their next heartbeat. That explains the discrepancy 
reported in this Jira. In steady state, the two counts are mostly identical, 
or off by a little, as TTs finish their tasks at different times. 
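The timing gap can be sketched with simplified stand-ins for the two counters. This is an illustrative model only, not the actual Hadoop internals; the class and method names here (CountDrift, assignTask, deliverHeartbeats) are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

public class CountDrift {
    // Scheduler-side count: bumped at assignment time, as in obtainNewMapTask().
    static int jobInProgressRunning = 0;
    // Cluster-side count: bumped only when the TT's next heartbeat reports the
    // task as running.
    static int clusterStatusRunning = 0;
    // Status updates waiting to be delivered in the next heartbeat round.
    static List<Runnable> pendingHeartbeats = new ArrayList<>();

    static void assignTask() {
        jobInProgressRunning++;                               // counted immediately
        pendingHeartbeats.add(() -> clusterStatusRunning++);  // counted a heartbeat later
    }

    static void deliverHeartbeats() {
        pendingHeartbeats.forEach(Runnable::run);
        pendingHeartbeats.clear();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 5; i++) assignTask();  // startup burst of assignments
        // Before the heartbeats arrive, the scheduler sees 5 running tasks
        // while the cluster summary still sees 0.
        System.out.println(jobInProgressRunning + " vs " + clusterStatusRunning);
        deliverHeartbeats();
        // After the heartbeats, the two counts agree again.
        System.out.println(jobInProgressRunning + " vs " + clusterStatusRunning);
    }
}
```

In steady state the pending-heartbeat window is small, so the two numbers mostly agree; a burst of assignments (startup) is exactly when they diverge.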

This is not really a bug, as it's not clear which count is 'correct'. We're 
reporting from two different sources: the cluster summary and the Scheduler 
(which gets its info from the JobInProgress objects). But different numbers do 
get reflected in the UI. So the best fix is probably to indicate in the 
Scheduler part of the UI that its computation is off from the cluster summary 
by a heartbeat. Maybe a little explanation at the bottom that says something 
like: "This info varies from that of the cluster summary by a heartbeat". 

I don't think we should change anything in the scheduler or the cluster 
summary. They're both doing the right thing in their own way. An alternate 
solution is to have the cluster summary use the counts from the JobInProgress 
objects, but this is performance-intensive, which was presumably the reason 
the cluster summary maintains its own count in the first place. 

You do want to leave the rest of the UI as is. The cluster summary is 
useful, as is the per-queue information of running tasks (reported by the 
Scheduler), since it lets users know whether the queue is running above/at/below 
its guaranteed capacity. 

bq. Hence, the waiting counts should be removed from the scheduler information.
The scheduler maintains a partial waiting count of map/reduce tasks. It 
doesn't need to know the total number of pending tasks if that total is larger 
than the cluster capacity, so, for performance reasons, it only counts up to 
the cluster capacity. HADOOP-4576 has been opened for this purpose and suggests 
that we display pending jobs instead of pending tasks, as the former seems more 
useful to users. 
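The capped counting could look something like the sketch below. This is a hypothetical illustration of the idea, not the scheduler's actual code; countPendingUpTo and its parameters are made up for this example:

```java
public class CappedPendingCount {
    // Count pending tasks across jobs, but only up to the cluster capacity.
    // Once we know at least 'capacity' tasks are waiting, the exact total no
    // longer affects scheduling decisions, so we stop early and save the cost
    // of walking every job's full pending list.
    static int countPendingUpTo(int[] pendingPerJob, int capacity) {
        int count = 0;
        for (int pending : pendingPerJob) {
            count += pending;
            if (count >= capacity) return capacity;  // cap: anything beyond is irrelevant
        }
        return count;
    }

    public static void main(String[] args) {
        int capacity = 212;  // e.g. the MapCapacity reported for this cluster
        // Below capacity: the exact total (90) is reported.
        System.out.println(countPendingUpTo(new int[]{50, 40}, capacity));
        // Above capacity: the count saturates at 212 instead of the true 800.
        System.out.println(countPendingUpTo(new int[]{300, 500}, capacity));
    }
}
```

This is also why the displayed waiting count can look "wrong" to users once the backlog exceeds capacity, which motivates showing pending jobs instead.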


> Wrong number of running map/reduce tasks are displayed in queue information.
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-4445
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4445
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>         Environment: Hadoop r705159, Queue=default, GC=100% 
> MapCapacity=ReduceCapacity=212
>            Reporter: Karam Singh
>            Assignee: Sreekanth Ramakrishnan
>
> Wrong number of running map/reduce tasks are displayed in queue information.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
