[
https://issues.apache.org/jira/browse/HADOOP-4981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663260#action_12663260
]
Hemanth Yamijala commented on HADOOP-4981:
------------------------------------------
Vivek, I had separate conversations with Arun and Devaraj about this patch.
Your patch does solve the requirement we have. However, we share your concern
about the code structure: the flag has to be passed top-down through every API
from the entry point, and the number of conditional checks grows. We also
understand that the motivation for changing obtainNewMapTask and family is to
retain the exact same code for looking up tasks, and thus take care of
conditions like blacklisted trackers. So it's a tradeoff between the code
structure and the generic solution that is proposed.
We can make the tradeoff the other way, by handling only the condition for
speculative tasks. Something like:
{code}
if (memory requirements pass) {
  return task from jobinprogress;
} else {
  if (job has pending or speculative tasks) {
    block;
  } else {
    move on to next job;
  }
}
{code}
This may require an API like findSpeculativeTask, which looks only at the code
dealing with speculative tasks and does not need changes to core APIs like
obtainNewMapTask. It seems (I haven't looked at it deeply) that this could
simplify the code structure quite a bit while addressing the requirement we
currently have. The other cases do not seem to require special handling as of
now. For example, if a tracker is blacklisted, it would not return any task
anyway, so this is equivalent to the job having no pending tasks. Would
something along these lines work?
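
To make the three outcomes of the pseudocode above concrete, here is a minimal, self-contained Java sketch. It is only an illustration under assumptions: JobState, Decision and decide are hypothetical names that do not exist in the Capacity Scheduler or JobInProgress, and the three fields stand in for whatever the scheduler would really query (for instance, through something like the proposed findSpeculativeTask).

```java
// Hypothetical sketch: none of these types exist in Hadoop.
class SpeculativeSchedulingSketch {

    /** What the scheduler does with a free slot after looking at one job. */
    enum Decision { ASSIGN_TASK, BLOCK, NEXT_JOB }

    /** Simplified stand-in for the per-job state the scheduler would consult. */
    static final class JobState {
        final boolean memoryFits;              // memory requirements pass on this tracker
        final int pendingTasks;                // tasks that have not started yet
        final boolean hasSpeculativeCandidate; // a running task could be speculated

        JobState(boolean memoryFits, int pendingTasks, boolean hasSpeculativeCandidate) {
            this.memoryFits = memoryFits;
            this.pendingTasks = pendingTasks;
            this.hasSpeculativeCandidate = hasSpeculativeCandidate;
        }
    }

    /** Mirrors the if/else structure of the pseudocode in the comment. */
    static Decision decide(JobState job) {
        if (job.memoryFits) {
            // Ask the job for a task (pending or speculative).
            return Decision.ASSIGN_TASK;
        }
        if (job.pendingTasks > 0 || job.hasSpeculativeCandidate) {
            // The job still wants the slot: hold it rather than give it away.
            return Decision.BLOCK;
        }
        // Nothing left to run for this job: consider the next one.
        return Decision.NEXT_JOB;
    }

    public static void main(String[] args) {
        // No pending tasks, but a speculative candidate exists and memory does not fit.
        System.out.println(decide(new JobState(false, 0, true)));  // prints BLOCK
        // Nothing pending, nothing to speculate: skip to the next job.
        System.out.println(decide(new JobState(false, 0, false))); // prints NEXT_JOB
    }
}
```

The point of the sketch is that only the else-branch needs to know about speculative tasks, so the core lookup path (obtainNewMapTask and family) could stay unchanged.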
> Prior code fix in Capacity Scheduler prevents speculative execution in jobs
> ---------------------------------------------------------------------------
>
> Key: HADOOP-4981
> URL: https://issues.apache.org/jira/browse/HADOOP-4981
> Project: Hadoop Core
> Issue Type: Bug
> Components: contrib/capacity-sched
> Reporter: Vivek Ratan
> Priority: Blocker
> Attachments: 4981.1.patch
>
>
> As part of the code fix for HADOOP-4035, the Capacity Scheduler obtains a
> task from JobInProgress (calling obtainNewMapTask() or obtainNewReduceTask())
> only if the number of pending tasks for a job is greater than zero (see the
> if-block in TaskSchedulingMgr.getTaskFromJob()). So, if a job has no pending
> tasks and only has running tasks, it will never be given a slot, and will
> never have a chance to run a speculative task.