[ https://issues.apache.org/jira/browse/HADOOP-4981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12666427#action_12666427 ]
Vivek Ratan commented on HADOOP-4981:
-------------------------------------
If I understand correctly, you're suggesting that we skip a job if it has memory
requirements that cannot be met, while making sure we don't skip it so many
times that it starves, as opposed to blocking right away (i.e., returning no
task to the TT).
We did consider this approach a while back, and the general consensus was that
it's better to block right away than to selectively block. Regardless, I don't
think that solves the problem this Jira is addressing. Whether you do it once
in a while, or always, you're still going to need to look at a high-mem job at
some point and decide whether to block the TT or not. And you're still going to
need to see if the high-mem job has at least one task to run. Of course, you
could skip this step by always blocking the TT, but then you would have
underutilized TTs if the high-mem job does not have any more tasks to run.
I thought the real issue here was how to write clean code to detect if a job
has a task to run, i.e., it's more of a software design problem than a
performance issue. We can certainly discuss/re-discuss whether it makes sense
to block always or once in a while, but that seems like another discussion. Am
I missing something?
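To make the point concrete, here is a rough sketch of the block-right-away decision and why it still needs a "does this job have a task to run" check. All the names here (MemJob, BlockingSketch, assign, hasTaskToRun, freeMemOnTT) are made up for illustration and are not the actual Capacity Scheduler code; how hasTaskToRun gets computed cleanly is exactly the open design question.

```java
import java.util.List;

// Hypothetical, simplified model; these are NOT the real Capacity Scheduler classes.
final class MemJob {
    final String name;
    final long memNeeded;        // memory that one task of this job needs
    final boolean hasTaskToRun;  // assumed answer from some clean "has a task" check

    MemJob(String name, long memNeeded, boolean hasTaskToRun) {
        this.name = name;
        this.memNeeded = memNeeded;
        this.hasTaskToRun = hasTaskToRun;
    }
}

final class BlockingSketch {
    /**
     * Returns the name of the job that gets the slot, or null when the TT
     * receives no task (either it is blocked or no job has work).
     */
    static String assign(List<MemJob> queue, long freeMemOnTT) {
        for (MemJob job : queue) {
            if (!job.hasTaskToRun) {
                // Nothing to run: blocking the TT for this job would only
                // leave the TT underutilized, so move on to the next job.
                continue;
            }
            if (job.memNeeded <= freeMemOnTT) {
                return job.name;  // the job fits, give it the slot
            }
            // The job has work but does not fit: block right away.
            return null;
        }
        return null;
    }
}
```

With a high-mem job that has no task left at the head of the queue, the TT is handed to the next job; if that same job does have a task, the TT is blocked. Skipping the job only some of the time changes when this decision is made, not the need for the hasTaskToRun check.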
> Prior code fix in Capacity Scheduler prevents speculative execution in jobs
> ---------------------------------------------------------------------------
>
> Key: HADOOP-4981
> URL: https://issues.apache.org/jira/browse/HADOOP-4981
> Project: Hadoop Core
> Issue Type: Bug
> Components: contrib/capacity-sched
> Reporter: Vivek Ratan
> Priority: Blocker
> Attachments: 4981.1.patch, 4981.2.patch
>
>
> As part of the code fix for HADOOP-4035, the Capacity Scheduler obtains a
> task from JobInProgress (calling obtainNewMapTask() or obtainNewReduceTask())
> only if the number of pending tasks for a job is greater than zero (see the
> if-block in TaskSchedulingMgr.getTaskFromJob()). So, if a job has no pending
> tasks and only has running tasks, it will never be given a slot, and will
> never have a chance to run a speculative task.
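
The behaviour described above can be sketched roughly as follows. This is a simplified model under stated assumptions, not the Hadoop source: JobSketch, SchedulerSketch, obtainNewTask(), hasSlowTask and both getTask* methods are hypothetical stand-ins, and the unguarded variant only shows the contrast; it is not necessarily what the attached patches do.

```java
// Hypothetical, simplified model; these are NOT the real Hadoop classes.
final class JobSketch {
    final int pendingTasks;     // tasks that have not started yet
    final int runningTasks;     // tasks currently executing
    final boolean hasSlowTask;  // stand-in for "a running task qualifies for speculation"

    JobSketch(int pendingTasks, int runningTasks, boolean hasSlowTask) {
        this.pendingTasks = pendingTasks;
        this.runningTasks = runningTasks;
        this.hasSlowTask = hasSlowTask;
    }

    /** Stand-in for obtainNewMapTask()/obtainNewReduceTask(); null means no task. */
    String obtainNewTask() {
        if (pendingTasks > 0) {
            return "regular";
        }
        if (runningTasks > 0 && hasSlowTask) {
            return "speculative";
        }
        return null;
    }
}

final class SchedulerSketch {
    /** Guarded call, as described in the issue: the job is asked only if it has pending tasks. */
    static String getTaskGuarded(JobSketch job) {
        if (job.pendingTasks > 0) {
            return job.obtainNewTask();
        }
        return null;  // a job with only running tasks is never asked
    }

    /** Contrast case: always ask the job and let it decide. */
    static String getTaskUnguarded(JobSketch job) {
        return job.obtainNewTask();
    }
}
```

For a job with zero pending tasks, three running tasks and one slow task, the guarded call returns no task while the unguarded call returns a speculative one, which is the symptom reported here.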