[
https://issues.apache.org/jira/browse/HADOOP-4981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663277#action_12663277
]
Vivek Ratan commented on HADOOP-4981:
-------------------------------------
For the sake of this discussion, let's consider my original proposal (which
modifies JobInProgress.findNewXXXTask(), and for which a patch is provided) as
Proposal 1, and your suggestion about detecting only if a job has speculative
tasks as Proposal 2.
I did consider Proposal 2, but I frankly didn't see a big difference. If you
look at the code for JobInProgress.findNewMapTask(), which really is the big
core method that needs to be updated, the first part of it deals with finding a
pending task to run, and the second part of it deals with finding a speculative
task. The latter code is non-trivial, i.e., it involves 3 stages and about 60
lines of code. Proposal 2 will result in fewer if-blocks, but the changes will
qualitatively be the same (i.e., you'll still have to pass the readOnly
parameter and still wrap data structure changes and log methods with if-blocks,
albeit only for these 60 lines). A quick glance at my patch shows that you'll
cut down the if-blocks from about 17 to about 7. On one hand, that is a
decrease, but it's not qualitatively different and you'll likely move the code
to find speculative tasks into a separate method, thus adding to the
refactoring of this core method. Either way, it looks like we'll have to bite
the bullet no matter what, if you're considering only these two proposals.
One issue I do have with Proposal 2 is that we're replicating the logic of
finding a task to run, in the Capacity Scheduler. In your code example, you're
assuming that if a job does not have pending or speculative tasks,
JobInProgress.obtainNewMapTask() will return null. That's not such a good
assumption for a couple of related reasons:
a. if this logic changes in JobInProgress, you have to update the Capacity
Scheduler code. Granted that this logic is likely not going to change, or that
eventually all this code belongs in a Scheduler module anyways, so maybe this
is not a big issue.
b. findNewMapTask() has additional logic that you should consider: whether the
TT is blacklisted (shouldRunOnTaskTracker()), and whether the TT has enough
disk space. The latter we probably don't care about much, because the disk
space, like the amount of free memory, gets better if we let the TT finish its
tasks. The former,however, is important. If no task in the job can run on the
TT (because many tasks of the job have failed on that TT), then the scheduler
logic needs to reflect that (i.e., the scheduler blocks if there are pending or
speculative tasks AND the TT is not blacklisted). Which ties in to my first
point: you're replicating the logic of findNewMapTask() in the scheduler, which
doesn't buy you much, and if you do it, you'll need to separate out the various
steps in the logic (finding a pending tasks, finding a speculative task ) into
separate methods, and make JobInProgress.shouldRunOnTaskTracker() public.
Basically, I don't see Proposal 2 as qualitatively better than Proposal 1,
which is why I didn't consider it. In its favor, it cuts down the if-blocks by
more than half. But is that enough reason to replicate the findNewMapTask()
logic in the scheduler? I personally think not.
Also remember that in proposal 1, we're not altering the signature of
JobInProgress in the sense that its current callers do not change. It still
exposes obtainNewMapTask(), with the same parameters. We've added a new method,
hasNewMapTask(). It's only internally that we use the ugly readOnly flag. A
minor point, but I wanted to point it out.
bq. This may require an API like findSpeculativeTask which only looks at code
dealing with speculative tasks, and does not need to make changes to the core
APIs like obtainNewMapTask
Depends on how you see it. The code that deals with finding speculative tasks
is, today, within the core API findNewMapTask(). You'll still affect the core
API if you move it out, but yes, I get your point that you won't have to
surround the remaining code by if-blocks.
> Prior code fix in Capacity Scheduler prevents speculative execution in jobs
> ---------------------------------------------------------------------------
>
> Key: HADOOP-4981
> URL: https://issues.apache.org/jira/browse/HADOOP-4981
> Project: Hadoop Core
> Issue Type: Bug
> Components: contrib/capacity-sched
> Reporter: Vivek Ratan
> Priority: Blocker
> Attachments: 4981.1.patch
>
>
> As part of the code fix for HADOOP-4035, the Capacity Scheduler obtains a
> task from JobInProgress (calling obtainNewMapTask() or obtainNewReduceTask())
> only if the number of pending tasks for a job is greater than zero (see the
> if-block in TaskSchedulingMgr.getTaskFromJob()). So, if a job has no pending
> tasks and only has running tasks, it will never be given a slot, and will
> never have a chance to run a speculative task.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.