[
https://issues.apache.org/jira/browse/MAPREDUCE-1204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776149#action_12776149
]
Arun C Murthy commented on MAPREDUCE-1204:
------------------------------------------
I'd rather have JobInProgress.shouldRunOnTaskTracker public for now.
Long term the right direction is for schedulers to maintain all scheduling
information themselves and not rely on JobInProgress at all.
> Fair Scheduler preemption may preempt tasks running in slots unusable by the
> preempting job
> -------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-1204
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1204
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: contrib/fair-share
> Affects Versions: 0.21.0
> Reporter: Todd Lipcon
>
> The current preemption code works by first calculating how many tasks need to
> be preempted to satisfy the min share constraints, and then killing an equal
> number of tasks from other jobs, sorted to favor killing of young tasks. This
> works fine for the general case, but there are some edge cases where this can
> cause problems.
> For example, if the preempting job has blacklisted ("marked flaky") a
> particular task tracker, and that tracker is running the youngest task,
> preemption can still kill that task. The preempting job will then refuse that
> slot, since the tracker has been blacklisted. The same task that just got
> killed then gets rescheduled in that slot. This repeats ad infinitum until a
> new slot opens in the cluster.
> I don't have a good test case for this, yet, but logically it is possible.
> One potential fix would be to add an API to JobInProgress that functions
> identically to obtainNewMapTask but does not schedule the task. The
> preemption code could then use this while iterating through the sorted
> preemption list to check that the preempting jobs can actually make use of
> the candidate slots before killing them.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.