[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13213567#comment-13213567
 ] 

Liyin Liang commented on MAPREDUCE-3892:
----------------------------------------

If both a non-speculative and a speculative attempt are launched for a map 
task, runningMapTasks is incremented twice. 
{code:}
  synchronized void addRunningTaskToTIP(TaskInProgress tip, TaskAttemptID id, 
                                        TaskTrackerStatus tts, 
                                        boolean isScheduled) {
...
    } else if (tip.isMapTask()) {
      ++runningMapTasks;
...
}
{code}
When the non-speculative attempt completes, runningMapTasks is decremented by one.
{code:}
  public synchronized boolean completedTask(TaskInProgress tip, 
                                            TaskStatus status)
{
...
    } else if (tip.isMapTask()) {
      runningMapTasks -= 1;
...
}
{code}
The speculative attempt is then killed and marked as KILLED_UNCLEAN, and a 
task-cleanup task is launched for it.
When the task-cleanup task completes, runningMapTasks is decremented by one again.
{code:}
  private void failedTask(TaskInProgress tip, TaskAttemptID taskid, 
                          TaskStatus status, 
                          TaskTracker taskTracker, boolean wasRunning,
                          boolean wasComplete, boolean wasAttemptRunning) {
...
        if (tip.isMapTask() && !metricsDone) {
          runningMapTasks -= 1;
...
}
{code}

However, if the TaskTracker on which the task-cleanup task is running is lost, 
runningMapTasks is not decremented. This is because of the following check:
{code:}
  void lostTaskTracker(TaskTracker taskTracker) {
...
        if (!tip.isComplete() || 
            (tip.isMapTask() && !tip.isJobSetupTask() && 
             job.desiredReduces() != 0)) {
...
}
{code}
In this case, *tip.isComplete()* returns true and *job.desiredReduces()* equals 
zero (a map-only job), so the condition is false and the lost attempt is 
skipped. This means that for this map task we decremented runningMapTasks only 
once, whereas we incremented it twice. As a result, the runningMapTasks counter 
is left one too high.
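
To make the accounting concrete, here is a minimal toy model of the counter. It is only a sketch, not the real JobInProgress code: the class RunningMapCounterSketch and its methods are hypothetical names, and it assumes that the guarded branch in lostTaskTracker is the only path that would lead to the second decrement when the tracker is lost.

```java
/** Toy model of the runningMapTasks bookkeeping described above (hypothetical, not Hadoop code). */
class RunningMapCounterSketch {
    int runningMapTasks = 0;
    boolean tipComplete = false;
    final int desiredReduces;

    RunningMapCounterSketch(int desiredReduces) {
        this.desiredReduces = desiredReduces;
    }

    /** Stands in for addRunningTaskToTIP: every launched attempt bumps the counter. */
    void launchAttempt() {
        ++runningMapTasks;
    }

    /** Stands in for completedTask: the winning attempt completes the TIP. */
    void completeAttempt() {
        tipComplete = true;
        runningMapTasks -= 1;
    }

    /** Stands in for failedTask being reached once the task-cleanup task finishes. */
    void cleanupFinished() {
        runningMapTasks -= 1;
    }

    /**
     * Stands in for lostTaskTracker. The condition copies the guard quoted above;
     * the decrement inside it is an assumption of this model.
     */
    boolean lostTrackerDuringCleanup() {
        boolean isMapTask = true;
        boolean isJobSetupTask = false;
        if (!tipComplete || (isMapTask && !isJobSetupTask && desiredReduces != 0)) {
            runningMapTasks -= 1;
            return true;
        }
        return false; // attempt skipped, counter untouched
    }

    /** Runs the scenario from this issue and returns the final counter value. */
    static int simulate(int desiredReduces, boolean trackerLost) {
        RunningMapCounterSketch job = new RunningMapCounterSketch(desiredReduces);
        job.launchAttempt();   // attempt_0
        job.launchAttempt();   // attempt_1 (speculative)
        job.completeAttempt(); // attempt_0 wins, attempt_1 becomes KILLED_UNCLEAN
        if (trackerLost) {
            job.lostTrackerDuringCleanup();
        } else {
            job.cleanupFinished();
        }
        return job.runningMapTasks;
    }

    public static void main(String[] args) {
        System.out.println("map-only, cleanup finishes: " + simulate(0, false));
        System.out.println("map-only, tracker lost:     " + simulate(0, true));
    }
}
```

In this model the map-only job ends with a counter of 0 when the task-cleanup task finishes normally, and with 1 when the tracker is lost during cleanup, which matches the behaviour reported in this issue.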

                
> runningMapTasks counter is not decremented in case of failed task-cleanup 
> tasks.
> --------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3892
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3892
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv1
>    Affects Versions: 1.0.0
>         Environment: Hadoop 1.0.0
>            Reporter: Liyin Liang
>
> For a map-only job, a map task has two running attempts: attempt_0 and 
> attempt_1 (the speculative attempt). If attempt_0 completes first, the map 
> task is complete and attempt_1 is killed. A task-cleanup task is then 
> launched to clean up attempt_1's output. If the TaskTracker is lost while the 
> task-cleanup task is running, attempt_1 remains in the KILLED_UNCLEAN 
> state. What's more, runningMapTasks equals one instead of zero after the job 
> has finished.
> The incorrect runningMapTasks value can lead to bad scheduling decisions with 
> our scheduler. 


        
