[ https://issues.apache.org/jira/browse/HADOOP-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676242#action_12676242 ]
Devaraj Das commented on HADOOP-2141:
-------------------------------------
The field TaskInProgress.mostRecentStartTime is assigned the same value,
execStartTime, on every update (since execStartTime is set only once in the
life of the TIP). Did you mean to do this?
bq. I am getting a little lost digging through the code trying to figure out
where these variables would need to be decremented at
They should be decremented in TIP.incompleteSubTask and TIP.completedTask
(basically, the places where activeTasks.remove is called). The decrement
should happen only if activeTasks.size for the TIP is >1. Makes sense?
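A minimal sketch of that bookkeeping, for illustration only. TipCounterSketch is a hypothetical stand-in for TaskInProgress, speculativeTasks is a hypothetical job-level counter, and the sketch assumes the size check is made before the attempt is removed from activeTasks (the comment above does not say whether the check comes before or after the removal).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for TaskInProgress; not the real Hadoop class.
class TipCounterSketch {
    // Hypothetical job-level count of running speculative attempts.
    static int speculativeTasks = 0;

    // Attempt id -> tracker name, mirroring the role of activeTasks in the TIP.
    private final Map<String, String> activeTasks = new HashMap<>();

    void addRunningTask(String attemptId, String tracker) {
        // Assumption: an attempt started while another one is still active
        // counts as speculative.
        if (!activeTasks.isEmpty()) {
            speculativeTasks++;
        }
        activeTasks.put(attemptId, tracker);
    }

    // Single removal path shared by both exit points.
    private void removeActive(String attemptId) {
        if (!activeTasks.containsKey(attemptId)) {
            return; // unknown or already removed attempt: nothing to undo
        }
        // Assumption: size is checked before the removal, so >1 means a
        // speculative attempt was running alongside the original one.
        if (activeTasks.size() > 1) {
            speculativeTasks--;
        }
        activeTasks.remove(attemptId);
    }

    void incompleteSubTask(String attemptId) {
        removeActive(attemptId);
    }

    void completedTask(String attemptId) {
        removeActive(attemptId);
    }
}
```

Routing both methods through one removal helper keeps the counter from drifting if a third place that removes from activeTasks is added later.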
bq. I can't find the TaskCommitThread that it references
Yes, this comment shouldn't be there. TaskCommitThread used to exist at one
point.
bq. there would be a possibility of speculating a task that has already
completed.
Couldn't it be checked whether TIP.isComplete() returns true before launching a
speculative attempt?
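Roughly, the guard could look like the sketch below. The Tip interface and the hasSpeculativeCandidate method are hypothetical placeholders for whatever check the patch already performs; only isComplete() corresponds to a method named in this comment.

```java
// Hypothetical sketch; not code from the patch or from Hadoop itself.
class SpeculationGuardSketch {

    // Minimal hypothetical view of a TIP for this example.
    interface Tip {
        boolean isComplete();

        // Placeholder for the existing "is this task slow enough" check.
        boolean hasSpeculativeCandidate();
    }

    static boolean shouldLaunchSpeculative(Tip tip) {
        // Never speculate on a task that has already finished.
        if (tip.isComplete()) {
            return false;
        }
        return tip.hasSpeculativeCandidate();
    }
}
```

Checking isComplete() first means a task that finished between candidate selection and launch is skipped without evaluating the slower progress-based condition.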
> speculative execution start up condition based on completion time
> -----------------------------------------------------------------
>
> Key: HADOOP-2141
> URL: https://issues.apache.org/jira/browse/HADOOP-2141
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Affects Versions: 0.19.0
> Reporter: Koji Noguchi
> Assignee: Andy Konwinski
> Attachments: 2141.patch, HADOOP-2141-v2.patch, HADOOP-2141-v3.patch,
> HADOOP-2141-v4.patch, HADOOP-2141-v5.patch, HADOOP-2141.patch
>
>
> We had one job with speculative execution hang.
> 4 reduce tasks were stuck with 95% completion because of a bad disk.
> Devaraj pointed out
> bq. One of the conditions that must be met for launching a speculative
> instance of a task is that it must be at least 20% behind the average
> progress, and this is not true here.
> It would be nice if speculative execution also starts up when tasks stop
> making progress.
> Devaraj suggested
> bq. Maybe, we should introduce a condition for average completion time for
> tasks in the speculative execution check.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.