[
https://issues.apache.org/jira/browse/HADOOP-5548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12688271#action_12688271
]
Amareshwari Sriramadasu commented on HADOOP-5548:
-------------------------------------------------
As Devaraj pointed out, the problem is not with the JobTracker restart.
In JobTracker, each TaskTrackerStatus is cached in {{taskTrackers}} and is supposed
to be read-only. However, it is passed to the updateTaskStatuses() method, which
hands its task reports (TaskStatus objects) to JobInProgress. In
JobInProgress.updateTaskStatus() and tip.updateStatus(), the TaskStatus object
is modified.
The code in TaskInProgress that modifies the TaskStatus reference:
{code}
if (!isCleanupAttempt(taskid)) {
  taskStatuses.put(taskid, status);
} else {
  taskStatuses.get(taskid).statusUpdate(status.getRunState(),
    status.getProgress(), status.getStateString(), status.getPhase(),
    status.getFinishTime());
}
{code}
This could make the total count negative in the following scenario:
Tracker1 reports that task *t_0* is KILLED_UNCLEAN.
Tracker2 is given the cleanup attempt for t_0.
Tracker2 reports that it is running the cleanup attempt for t_0. This updates the
object in taskStatuses, which is the same TaskStatus object held in tracker1's
cached status.
The JT then calculates the total count as if the task were running on both
trackers, which leads to negative totals.
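The scenario above can be reproduced with a toy model. This is only a sketch, not the JobTracker code: {{ToyStatus}}, {{ToyJobTracker}}, {{heartbeat()}} and {{simulate()}} are hypothetical names, and the accounting rule (subtract the running count of the tracker's previously cached report, then add the count of the new one) is an assumption about how the totals are maintained.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, heavily simplified stand-in for TaskStatus.
class ToyStatus {
    boolean running;
    ToyStatus(boolean running) { this.running = running; }
}

// Hypothetical toy tracker; NOT the real JobTracker.
class ToyJobTracker {
    // tracker name -> last report, meant to be read-only once cached
    final Map<String, List<ToyStatus>> cachedReports = new HashMap<>();
    // task id -> status kept by the job (plays the role of taskStatuses)
    final Map<String, ToyStatus> taskStatuses = new HashMap<>();
    int totalRunning = 0;

    static int countRunning(List<ToyStatus> report) {
        int n = 0;
        for (ToyStatus s : report) {
            if (s.running) n++;
        }
        return n;
    }

    // A null status models a heartbeat with an empty task report.
    void heartbeat(String tracker, String taskId, ToyStatus status, boolean isCleanup) {
        // Assumed accounting: remove the old contribution, add the new one.
        List<ToyStatus> old = cachedReports.get(tracker);
        if (old != null) {
            totalRunning -= countRunning(old);
        }
        List<ToyStatus> report = new ArrayList<>();
        if (status != null) {
            report.add(status);
        }
        totalRunning += countRunning(report);
        cachedReports.put(tracker, report);

        if (status == null) {
            return;
        }
        if (!isCleanup) {
            // Stores the very object that is also in the cache (alias).
            taskStatuses.put(taskId, status);
        } else {
            // Mutates the stored object in place, and thereby the cache too.
            taskStatuses.get(taskId).running = status.running;
        }
    }

    static int simulate() {
        ToyJobTracker jt = new ToyJobTracker();
        jt.heartbeat("tracker1", "t_0", new ToyStatus(false), false); // KILLED_UNCLEAN: total 0
        jt.heartbeat("tracker2", "t_0", new ToyStatus(true), true);   // cleanup running: total 1
        jt.heartbeat("tracker1", null, null, false);                  // subtracts mutated cache: total 0
        jt.heartbeat("tracker2", "t_0", new ToyStatus(false), true);  // cleanup done: total -1
        return jt.totalRunning;
    }

    public static void main(String[] args) {
        System.out.println(simulate());
    }
}
```

In this model tracker1 contributed 0 when its report was cached, but 1 is subtracted on its next heartbeat because the cached object was changed in between.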
Cloning the TaskStatus object and passing the clone to JobInProgress (JIP) looks
like the correct solution.
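A minimal sketch of the defensive-copy idea, assuming a shallow clone is sufficient. {{StatusSketch}} and {{CloneDemo}} are hypothetical names used for illustration only; this is not the actual patch.

```java
// Hypothetical, simplified stand-in for TaskStatus; only one field is modelled.
class StatusSketch implements Cloneable {
    String runState;

    StatusSketch(String runState) { this.runState = runState; }

    @Override
    public StatusSketch clone() {
        try {
            // A shallow copy is enough here because String is immutable.
            return (StatusSketch) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // cannot happen: the class is Cloneable
        }
    }
}

class CloneDemo {
    // Returns the run state of the cached object after the job-side copy is updated.
    static String cachedStateAfterUpdate() {
        StatusSketch cached = new StatusSketch("KILLED_UNCLEAN"); // held in the tracker cache
        StatusSketch forJob = cached.clone();                     // handed to the job instead
        forJob.runState = "RUNNING";                              // later in-place update
        return cached.runState;
    }

    public static void main(String[] args) {
        System.out.println(cachedStateAfterUpdate());
    }
}
```

Because the job only ever sees the copy, a later in-place update cannot change what the tracker cache holds.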
Thoughts?
> Observed negative running maps on the job tracker
> -------------------------------------------------
>
> Key: HADOOP-5548
> URL: https://issues.apache.org/jira/browse/HADOOP-5548
> Project: Hadoop Core
> Issue Type: Bug
> Affects Versions: 0.20.0
> Reporter: Owen O'Malley
> Assignee: Amareshwari Sriramadasu
> Priority: Blocker
>
> We saw the following in both the web UI and the CLI tools:
> {noformat}
> Cluster Summary (Heap Size is 11.7 GB/13.37 GB)
> Maps  Reduces  Total        Nodes  Map Task  Reduce Task  Avg.        Blacklisted
>                Submissions         Capacity  Capacity     Tasks/Node  Nodes
> -971  0        133          434    1736      1736         8.00        0
> {noformat}