[
https://issues.apache.org/jira/browse/MAPREDUCE-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12759058#action_12759058
]
Devaraj Das commented on MAPREDUCE-1028:
----------------------------------------
After some thought, decrementing the slot count by the number of slots each
task actually uses seems harmless. So, for now, let's just ensure that
all special tasks (job-setup, task-cleanup and job-cleanup) take exactly one
slot. I couldn't come up with a counter-example where this would lead to
inconsistencies in the slot counts on the TT, or to fewer or more tasks
being launched than the slot count and the number of slots required by the
tasks scheduled on that TT allow.
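
A rough sketch of the accounting proposed in the comment above, for illustration only. The class, enum and method names are hypothetical and are not taken from the Hadoop code base; the sketch only assumes that regular tasks use the per-task slot count of their job while job-setup, task-cleanup and job-cleanup tasks always use one slot.

```java
import java.util.List;

public class SpecialTaskSlots {
    /** Hypothetical task categories; the three special kinds are the ones named in the comment. */
    enum Kind { REGULAR, JOB_SETUP, TASK_CLEANUP, JOB_CLEANUP }

    /** Hypothetical record of a task running on a tasktracker. */
    static final class RunningTask {
        final Kind kind;
        final int slotsPerTaskOfJob; // e.g. 2 for a high RAM job

        RunningTask(Kind kind, int slotsPerTaskOfJob) {
            this.kind = kind;
            this.slotsPerTaskOfJob = slotsPerTaskOfJob;
        }
    }

    /** Special tasks take exactly one slot; regular tasks take what their job asks for. */
    static int slotsRequired(Kind kind, int slotsPerTaskOfJob) {
        return kind == Kind.REGULAR ? slotsPerTaskOfJob : 1;
    }

    /** Free slots = total slots minus the slots each running task actually uses. */
    static int freeSlots(int totalSlots, List<RunningTask> running) {
        int free = totalSlots;
        for (RunningTask t : running) {
            free -= slotsRequired(t.kind, t.slotsPerTaskOfJob);
        }
        return free;
    }
}
```

With these assumptions, a tracker with 4 slots running one regular task and one task-cleanup task of a 2-slot job has 4 - 2 - 1 = 1 slot free, and the same per-task count is used both when assigning and when launching.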
> Cleanup tasks are scheduled using high memory configuration, leaving tasks in
> unassigned state.
> -----------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-1028
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1028
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: jobtracker
> Affects Versions: 0.21.0
> Reporter: Hemanth Yamijala
> Assignee: Ravi Gummadi
> Priority: Blocker
> Fix For: 0.21.0
>
>
> A cleanup task is launched for a failed task of a job. This task is created
> based on the TIP of the failed task, and so is marked as requiring as many
> slots to run as the original task itself. For instance, if a high RAM job
> requires 2 slots per task, a cleanup task of that job requires 2 slots as
> well.
> Further, a cleanup task is scheduled to a tasktracker by the jobtracker
> itself and not the scheduler. While doing so, the JT doesn't check whether the
> TT has enough free slots to run a high RAM cleanup task; it always assumes 1
> slot is enough. Thus, the TT is oversubscribed.
> However, on the TT, before launch, we check that the task can actually run,
> and wait for that many slots to become available. If the slots don't get freed
> quickly, we will have tasks stuck in an unassigned state.
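
The mismatch described in the issue can be illustrated with a minimal sketch. All names here are hypothetical stand-ins, not the actual JobTracker or TaskTracker code; the sketch only assumes that the JT-side check treats one free slot as sufficient while the TT-side check waits for the full number of slots the task declares.

```java
public class CleanupSlotSketch {
    /** Hypothetical stand-in for a tasktracker's slot bookkeeping. */
    static final class Tracker {
        final int totalSlots;
        final int usedSlots;

        Tracker(int totalSlots, int usedSlots) {
            this.totalSlots = totalSlots;
            this.usedSlots = usedSlots;
        }

        int freeSlots() {
            return totalSlots - usedSlots;
        }
    }

    /** JT side, as described in the issue: one free slot is assumed to be enough. */
    static boolean jtWouldAssignCleanup(Tracker tt) {
        return tt.freeSlots() >= 1;
    }

    /** TT side: the task launches only once as many slots as it declares are free. */
    static boolean ttCanLaunch(Tracker tt, int slotsRequired) {
        return tt.freeSlots() >= slotsRequired;
    }

    /** Assigned by the JT but not launchable on the TT: the task sits unassigned. */
    static boolean stuckUnassigned(Tracker tt, int slotsRequired) {
        return jtWouldAssignCleanup(tt) && !ttCanLaunch(tt, slotsRequired);
    }

    public static void main(String[] args) {
        Tracker tt = new Tracker(2, 1); // 2 slots in total, 1 already in use
        System.out.println(stuckUnassigned(tt, 2)); // cleanup task of a 2-slot job
        System.out.println(stuckUnassigned(tt, 1)); // cleanup task limited to 1 slot
    }
}
```

In this model the first call reports a stuck task, because the tracker has one free slot and the cleanup task waits for two, while the second does not, because the two checks agree once the cleanup task needs a single slot.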