[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12800120#action_12800120
 ] 

Amar Kamat commented on MAPREDUCE-1316:
---------------------------------------

Arun, the logging changes will help in debugging memory leaks caused by stale 
references to TaskInProgress objects. With these changes, exactly one log line 
indicating task removal is printed per task attempt. This mirrors the 
task-addition log line, so any mismatch between the addition and removal log 
lines should point to a memory leak. This is not true today, because the 
task-removal log line is printed in removeMarkedTasks() (a caller of 
removeTaskEntry(), the API responsible for removing a task), which is not 
called for every task that gets added to the JobTracker. The new log lines are 
not inside a loop and will be printed only once per task attempt. 
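
As a rough illustration of how paired log lines could be used to spot a leak, 
here is a minimal sketch that scans a log and reports attempts that were added 
but never removed. The class name, the "Adding task " / "Removing task " 
prefixes, and the attempt-id format are assumptions made for the example; the 
actual JobTracker messages may be worded differently.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class TaskLogAudit {
    // Hypothetical log-line markers; the real JobTracker wording may differ.
    static final String ADD = "Adding task ";
    static final String REMOVE = "Removing task ";

    /** Returns the attempt ids that have an addition line but no removal line. */
    static Set<String> findLeaked(List<String> lines) {
        Set<String> live = new TreeSet<>();
        for (String line : lines) {
            int a = line.indexOf(ADD);
            int r = line.indexOf(REMOVE);
            if (a >= 0) {
                live.add(firstToken(line.substring(a + ADD.length())));
            } else if (r >= 0) {
                live.remove(firstToken(line.substring(r + REMOVE.length())));
            }
        }
        return live;
    }

    // The attempt id is assumed to be the first whitespace-delimited token.
    private static String firstToken(String s) {
        String t = s.trim();
        int sp = t.indexOf(' ');
        return sp < 0 ? t : t.substring(0, sp);
    }
}
```

Since each attempt produces exactly one addition line and one removal line, any 
id left in the returned set after a job has retired would be a candidate stale 
reference.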

bq. The bug you point to is irrelevant in the current context i.e. 
JobInProgress.getTasks(TaskType) - '==' or equals is the right implementation.
It looks like hadoop.io serializes enums as strings, hence the JVM bug I 
pointed out does not apply here.
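
To illustrate why '==' stays safe when an enum travels by name, here is a small 
sketch using only java.io. It assumes the name-based encoding described above 
and is not Hadoop's own serialization code; the TaskType enum here is a local 
stand-in.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class EnumRoundTrip {
    // Local stand-in for the task-type enum; not the Hadoop class.
    enum TaskType { MAP, REDUCE }

    /** Writes the enum constant by name, as a string. */
    static byte[] write(TaskType t) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(t.name());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    /** Reads the name back and resolves it to the existing constant. */
    static TaskType read(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return Enum.valueOf(TaskType.class, in.readUTF());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because Enum.valueOf() returns the constant already loaded in the JVM, the 
value read back is the same instance, so '==' and equals() agree.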
----
MAPREDUCE-1316 was raised because there was a mismatch between task-attempt 
additions and task-attempt removals in the JobTracker. Once a job retires, its 
tasks are removed based on the task statuses available. However, a task status 
is added for a task attempt only when the tasktracker reports back with its 
next heartbeat after the task is assigned. This leaves a corner case in the 
removal logic: if the tasktracker is assigned a task and the job finishes 
before that heartbeat, the newly scheduled attempt is added to the JobTracker 
but never removed, because its status is not yet available. This patch changes 
the task-removal logic to iterate over all the scheduled/launched attempt ids 
instead of the statuses, which takes care of this corner case. 
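
The corner case can be modelled with a toy sketch. This is not the JobTracker 
code; the class, field, and method names below are hypothetical and only mimic 
the roles of the attempt-to-TIP map, the scheduled attempts, and the reported 
statuses.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class RetireSketch {
    // attempt id -> TIP id; stands in for the JobTracker's attempt-to-TIP map.
    final Map<String, String> taskToTip = new HashMap<>();
    // Every attempt handed to a tasktracker.
    final Set<String> scheduled = new LinkedHashSet<>();
    // Attempts for which a status came back in a heartbeat.
    final Set<String> reported = new HashSet<>();

    void schedule(String attemptId, String tipId) {
        taskToTip.put(attemptId, tipId);
        scheduled.add(attemptId);
    }

    void heartbeat(String attemptId) {
        reported.add(attemptId);
    }

    // Old behaviour: only attempts with a known status are cleaned up.
    void retireByStatuses() {
        for (String id : reported) {
            taskToTip.remove(id);
        }
    }

    // Patched behaviour: every scheduled attempt is cleaned up.
    void retireByScheduledAttempts() {
        for (String id : scheduled) {
            taskToTip.remove(id);
        }
    }
}
```

If an attempt is scheduled but the job retires before its first heartbeat, 
retireByStatuses() leaves that entry in the map, while 
retireByScheduledAttempts() clears it.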

> JobTracker holds stale references to retired jobs via unreported tasks 
> -----------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1316
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1316
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobtracker
>            Reporter: Amar Kamat
>            Assignee: Amar Kamat
>            Priority: Blocker
>         Attachments: mapreduce-1316-v1.11.patch, 
> mapreduce-1316-v1.13-branch20-yahoo.patch, 
> mapreduce-1316-v1.14-branch20-yahoo.patch, 
> mapreduce-1316-v1.14.1-branch20-yahoo.patch, 
> mapreduce-1316-v1.15-branch20-yahoo.patch, mapreduce-1316-v1.7.patch
>
>
> JobTracker fails to remove _unreported_ tasks' mapping from _taskToTIPMap_ if 
> the job finishes and retires. _Unreported tasks_ refers to tasks that were 
> scheduled but the tasktracker did not report back with the task status. In 
> such cases a stale reference is held to TaskInProgress (and thus 
> JobInProgress) long after the job is gone, leading to a memory leak.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
