[ 
https://issues.apache.org/jira/browse/HADOOP-4981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663260#action_12663260
 ] 

Hemanth Yamijala commented on HADOOP-4981:
------------------------------------------

Vivek, I had separate conversations with Arun and Devaraj about this patch. 
Your patch does solve the requirement we have. However, we share your concern 
about the code structure: the flag has to be passed top-down through every API 
from the start, and it adds a number of conditional checks. We also understand 
that the motivation for changing obtainNewMapTask and family is to retain the 
exact same code for looking up tasks, and thus take care of conditions like 
blacklisted trackers. So it's a tradeoff between the code structure and the 
generic solution that is proposed.

We could make the tradeoff the other way, by handling only the condition for 
speculative tasks. Something like:
{code}
if (memory requirements pass) {
  return task from jobinprogress;
} else {
  if (job has pending or speculative tasks) {
    block;
  } else {
    move on to next job;
  }
}
{code}
This may require an API like findSpeculativeTask, which only looks at the code 
dealing with speculative tasks and does not need changes to core APIs like 
obtainNewMapTask. It seems (I haven't looked at it deeply) that this could 
simplify the code structure quite a bit, while addressing the requirement we 
currently have. The other cases do not seem to require special handling as of 
now. For example, if a tracker is blacklisted, it would not return any task 
anyway, so this is equivalent to the job having no pending tasks. Would 
something along these lines work?
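
To make the proposal concrete, here is a minimal, self-contained Java sketch 
of the decision in the pseudocode above. It is not the Capacity Scheduler 
code: JobView, Decision and decide are hypothetical names, and the three 
fields of JobView are assumptions standing in for the memory check, the 
pending-task count, and whatever a findSpeculativeTask-style lookup would 
report.

```java
import java.util.Arrays;
import java.util.List;

class SpeculativeSchedulingSketch {

    /** Possible outcomes when a tracker asks for a task from one job. */
    enum Decision { ASSIGN_TASK, BLOCK, NEXT_JOB }

    /** Hypothetical, simplified view of a job; NOT the real JobInProgress. */
    static final class JobView {
        final boolean memoryFits;              // assumed result of the memory check
        final int pendingTasks;                // tasks not yet scheduled
        final boolean hasSpeculativeCandidate; // assumed result of a findSpeculativeTask-like lookup

        JobView(boolean memoryFits, int pendingTasks, boolean hasSpeculativeCandidate) {
            this.memoryFits = memoryFits;
            this.pendingTasks = pendingTasks;
            this.hasSpeculativeCandidate = hasSpeculativeCandidate;
        }
    }

    /** Mirrors the pseudocode: assign if memory fits, else block or skip. */
    static Decision decide(JobView job) {
        if (job.memoryFits) {
            // Ask the job for a task (pending or speculative).
            return Decision.ASSIGN_TASK;
        }
        if (job.pendingTasks > 0 || job.hasSpeculativeCandidate) {
            // The job still wants this slot; hold it rather than skip the job.
            return Decision.BLOCK;
        }
        // Nothing left to run for this job; look at the next one.
        return Decision.NEXT_JOB;
    }

    public static void main(String[] args) {
        List<JobView> jobs = Arrays.asList(
            new JobView(true, 3, false),   // memory fits
            new JobView(false, 0, true),   // only a speculative candidate
            new JobView(false, 0, false)); // nothing to run
        for (JobView job : jobs) {
            System.out.println(decide(job));
        }
    }
}
```

The point of the sketch is that the speculative case becomes one extra 
condition at the scheduler level, so nothing has to be threaded through 
obtainNewMapTask or obtainNewReduceTask.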


> Prior code fix in Capacity Scheduler prevents speculative execution in jobs
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-4981
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4981
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>            Reporter: Vivek Ratan
>            Priority: Blocker
>         Attachments: 4981.1.patch
>
>
> As part of the code fix for HADOOP-4035, the Capacity Scheduler obtains a 
> task from JobInProgress (calling obtainNewMapTask() or obtainNewReduceTask()) 
> only if the number of pending tasks for a job is greater than zero (see the 
> if-block in TaskSchedulingMgr.getTaskFromJob()). So, if a job has no pending 
> tasks and only has running tasks, it will never be given a slot, and will 
> never have a chance to run a speculative task. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
