[ https://issues.apache.org/jira/browse/HADOOP-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12566640#action_12566640 ]
Owen O'Malley commented on HADOOP-2790:
---------------------------------------
{quote}
As suggested by Devaraj, the time can be calculated in
JobInProgress.findNewTask() and use this value in
TaskInProgress.hasSpeculative().
{quote}
+1, this is the clear solution.
Please do move the runSpeculative check up.
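
A minimal sketch of the two changes agreed on above, assuming a much
simplified model of the scheduler. TipSketch, SpeculationSketch, the
thresholds, and the method signatures are all hypothetical stand-ins, not
the actual Hadoop classes: the clock is read once per scheduling pass and
passed down, and the job-wide runSpeculative check happens before the scan
rather than once per tip.

```java
import java.util.List;

// Hypothetical stand-in for TaskInProgress; names and fields are illustrative only.
class TipSketch {
    final long startTime;   // when the tip's first attempt started, in ms
    final double progress;  // fraction complete, 0.0 to 1.0

    TipSketch(long startTime, double progress) {
        this.startTime = startTime;
        this.progress = progress;
    }

    // The caller supplies the current time, so this method never reads the clock.
    boolean hasSpeculativeTask(long currentTime, double averageProgress) {
        return (averageProgress - progress) >= SpeculationSketch.PROGRESS_GAP
            && (currentTime - startTime) >= SpeculationSketch.LAG_MILLIS;
    }
}

// Hypothetical stand-in for the scan done in JobInProgress.findNewTask().
class SpeculationSketch {
    static final double PROGRESS_GAP = 0.2;  // assumed threshold
    static final long LAG_MILLIS = 60_000L;  // assumed threshold

    static TipSketch findSpeculativeCandidate(List<TipSketch> tips,
                                              boolean runSpeculative,
                                              double averageProgress) {
        // Job-wide check moved up: skip the whole scan when speculation is off.
        if (!runSpeculative) {
            return null;
        }
        // One clock read per pass instead of one per tip.
        long now = System.currentTimeMillis();
        for (TipSketch tip : tips) {
            if (tip.hasSpeculativeTask(now, averageProgress)) {
                return tip;
            }
        }
        return null;
    }
}
```

With this shape the number of clock reads per call is constant, regardless
of how many tips the job has.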
I think a boolean for commit pending would be pretty easy to add. If we can
avoid calling .values(), we avoid creating a second collection for each
tip. Also, some of our customers run with the max task failures set to 100,
so a tip can hold many attempts and scanning them is *not* free.
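
A rough sketch of the commit-pending idea, assuming the tip is told about
every attempt state change. The class name, the State enum, and
updateState() are hypothetical and not taken from the Hadoop source. It
keeps a counter rather than a bare boolean so that it stays correct if more
than one attempt is ever in commit pending; that is a design choice in this
sketch, not something the issue prescribes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the per-tip bookkeeping in TaskInProgress.
class CommitPendingTipSketch {
    enum State { RUNNING, COMMIT_PENDING, SUCCEEDED, FAILED }

    // Attempt id to last known state (illustrative; the real map differs).
    private final Map<String, State> attemptStates = new HashMap<>();

    // Maintained on every state change so queries never scan the map.
    private int commitPendingCount = 0;

    void updateState(String attemptId, State newState) {
        State old = attemptStates.put(attemptId, newState);
        if (old == State.COMMIT_PENDING) {
            commitPendingCount--;
        }
        if (newState == State.COMMIT_PENDING) {
            commitPendingCount++;
        }
    }

    // Constant time: no .values() call and no iteration over attempts.
    boolean isCommitPending() {
        return commitPendingCount > 0;
    }
}
```

The cost moves from every scheduling query to the comparatively rare state
transitions, which matters when a tip has accumulated many failed attempts.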
> TaskInProgress.hasSpeculativeTask is very inefficient
> -----------------------------------------------------
>
> Key: HADOOP-2790
> URL: https://issues.apache.org/jira/browse/HADOOP-2790
> Project: Hadoop Core
> Issue Type: Bug
> Components: mapred
> Reporter: Owen O'Malley
> Fix For: 0.16.1
>
>
> Each call to JobInProgress.findNewTask can call
> TaskInProgress.hasSpeculativeTask once per task. Each call to
> hasSpeculativeTask calls System.currentTimeMillis, which can result in
> hundreds of thousands of calls to currentTimeMillis. Additionally, it
> calls TaskInProgress.isOnlyCommitPending, which calls .values() on the map
> from task id to host name and iterates through them to see if any of the
> tasks are in commit pending. It would be better to have a commit pending
> boolean flag in the TaskInProgress. It also looks like there are other
> opportunities here, but those jumped out at me. We should also look at this
> method in the profiler.