[
https://issues.apache.org/jira/browse/HADOOP-4981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12666445#action_12666445
]
Arun C Murthy commented on HADOOP-4981:
---------------------------------------
bq. I thought the real issue here was how to write clean code to detect if a
job has a task to run, i.e., it's more of a software design problem rather than
a performance issue. We can certainly discuss/re-discuss whether it makes sense
to block always or once in a while, but that seems like another discussion. Am
I missing something?
I'd support a clean abstraction, but this change modifies substantial parts of
the Map-Reduce framework in ways difficult to understand for a relatively
uncommon corner case. In any case, a high-mem job might not have a task to run
at a given moment, but what happens when its running tasks fail, tasktrackers
go down, etc.?
From a design perspective we have to recognize that currently the Map-Reduce
framework isn't fundamentally set up for what you are trying to do; it might
be possible if we move the scheduling-related aspects from JobInProgress to
the respective Schedulers, but that is a bigger discussion.
Given these concerns, I'd rather sacrifice utilization for maintainable code.
-1 for trying to add capabilities to detect whether a job has tasks to run
without significant redesign.
> Prior code fix in Capacity Scheduler prevents speculative execution in jobs
> ---------------------------------------------------------------------------
>
> Key: HADOOP-4981
> URL: https://issues.apache.org/jira/browse/HADOOP-4981
> Project: Hadoop Core
> Issue Type: Bug
> Components: contrib/capacity-sched
> Reporter: Vivek Ratan
> Priority: Blocker
> Attachments: 4981.1.patch, 4981.2.patch
>
>
> As part of the code fix for HADOOP-4035, the Capacity Scheduler obtains a
> task from JobInProgress (calling obtainNewMapTask() or obtainNewReduceTask())
> only if the number of pending tasks for a job is greater than zero (see the
> if-block in TaskSchedulingMgr.getTaskFromJob()). So, if a job has no pending
> tasks and only has running tasks, it will never be given a slot, and will
> never have a chance to run a speculative task.
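
The behaviour reported in the description above can be illustrated with a
small toy model. This is only a sketch under stated assumptions: ToyJob,
getTaskFromJobGuarded() and getTaskFromJobUnguarded() are hypothetical
stand-ins invented for illustration, not the actual JobInProgress or
TaskSchedulingMgr code. Only the "pending tasks greater than zero" guard
mirrors what the description says.

```java
import java.util.Optional;

public class SpeculationGuardSketch {

    /** Hypothetical stand-in for JobInProgress; only the counters matter here. */
    static final class ToyJob {
        final int pendingMaps;
        final int runningMaps;
        final boolean hasSlowRunningTask; // assumed trigger for speculation

        ToyJob(int pendingMaps, int runningMaps, boolean hasSlowRunningTask) {
            this.pendingMaps = pendingMaps;
            this.runningMaps = runningMaps;
            this.hasSlowRunningTask = hasSlowRunningTask;
        }

        /** Toy obtainNewMapTask(): a pending task first, otherwise a speculative one. */
        Optional<String> obtainNewMapTask() {
            if (pendingMaps > 0) {
                return Optional.of("pending");
            }
            if (runningMaps > 0 && hasSlowRunningTask) {
                return Optional.of("speculative");
            }
            return Optional.empty();
        }
    }

    /** Guarded lookup as described in the issue: the job is asked only if pending > 0. */
    static Optional<String> getTaskFromJobGuarded(ToyJob job) {
        if (job.pendingMaps > 0) {
            return job.obtainNewMapTask();
        }
        return Optional.empty(); // the job is never asked, so speculation is unreachable
    }

    /** Unguarded lookup: the job itself decides whether it has anything to run. */
    static Optional<String> getTaskFromJobUnguarded(ToyJob job) {
        return job.obtainNewMapTask();
    }

    public static void main(String[] args) {
        // No pending maps, three running maps, one of them slow.
        ToyJob job = new ToyJob(0, 3, true);
        System.out.println("guarded:   " + getTaskFromJobGuarded(job));
        System.out.println("unguarded: " + getTaskFromJobUnguarded(job));
    }
}
```

In this toy model, a job with no pending tasks and only running tasks gets
nothing from the guarded lookup, while the unguarded lookup hands back a
speculative task. Whether dropping the guard is acceptable in the real
scheduler is exactly what the comments on this issue are debating.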