[ https://issues.apache.org/jira/browse/HADOOP-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703265#action_12703265 ]
Devaraj Das commented on HADOOP-2141:
-------------------------------------
Eric, I am not sure how much gain the locality aspect would give us.
Considering that we will launch very few tasks speculatively, the
probability that a TT comes in and gets a data-local spec task would be quite
low IMO. But yes, it makes sense to keep the existing logic for running
node/rack-local speculative tasks around. So I'd suggest something like:
if (TT is not slow) {
  if (exists node-local task that is running slower than others) {
    assign that task to the TT
  } else {
    assign some task from the higher level rack-cache if available; else look
    at the entire list of running TIPs to find a slow task
  }
}
The above is essentially the same as what happens in today's trunk. The only
additional constraint we are adding here is the check for whether a TT is GOOD
(meets the criteria for running spec tasks).
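To make the ordering concrete, here is a rough Java sketch of the selection
above. It is only an illustration: the names SpecTaskPicker, RunningTask and
pick, as well as the precomputed "slow" and "trackerIsGood" flags, are
hypothetical and are not taken from the JobTracker code or from any of the
attached patches.

```java
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch; none of these names exist in the Hadoop code base. */
final class SpecTaskPicker {

    /** A running task plus the locality info this sketch needs. */
    static final class RunningTask {
        final String id;
        final String host;   // node that holds the task's input
        final String rack;   // rack of that node
        final boolean slow;  // assumed to be decided elsewhere

        RunningTask(String id, String host, String rack, boolean slow) {
            this.id = id;
            this.host = host;
            this.rack = rack;
            this.slow = slow;
        }
    }

    /**
     * Picks a task to speculate on for the requesting TT, or empty if the
     * TT is not good enough or no running task is slow.
     */
    static Optional<RunningTask> pick(boolean trackerIsGood, String ttHost,
                                      String ttRack, List<RunningTask> running) {
        // New constraint: only good TTs get speculative tasks.
        if (!trackerIsGood) {
            return Optional.empty();
        }
        // 1. Prefer a slow task whose data is local to this node.
        for (RunningTask t : running) {
            if (t.slow && t.host.equals(ttHost)) {
                return Optional.of(t);
            }
        }
        // 2. Otherwise take a slow task from the same rack.
        for (RunningTask t : running) {
            if (t.slow && t.rack.equals(ttRack)) {
                return Optional.of(t);
            }
        }
        // 3. Otherwise fall back to any slow task in the running list.
        for (RunningTask t : running) {
            if (t.slow) {
                return Optional.of(t);
            }
        }
        return Optional.empty();
    }
}
```

The three loops mirror the node-local, rack-local and "entire list" steps;
the early return at the top is the only part that differs from the existing
order of preference.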
> speculative execution start up condition based on completion time
> -----------------------------------------------------------------
>
> Key: HADOOP-2141
> URL: https://issues.apache.org/jira/browse/HADOOP-2141
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Affects Versions: 0.21.0
> Reporter: Koji Noguchi
> Assignee: Andy Konwinski
> Attachments: 2141.patch, HADOOP-2141-v2.patch, HADOOP-2141-v3.patch,
> HADOOP-2141-v4.patch, HADOOP-2141-v5.patch, HADOOP-2141-v6.patch,
> HADOOP-2141.patch
>
>
> We had one job with speculative execution hang.
> 4 reduce tasks were stuck with 95% completion because of a bad disk.
> Devaraj pointed out
> bq. One of the conditions that must be met for launching a speculative
> instance of a task is that it must be at least 20% behind the average
> progress, and this is not true here.
> It would be nice if speculative execution also starts up when tasks stop
> making progress.
> Devaraj suggested
> bq. Maybe, we should introduce a condition for average completion time for
> tasks in the speculative execution check.