[ 
https://issues.apache.org/jira/browse/HADOOP-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12566640#action_12566640
 ] 

Owen O'Malley commented on HADOOP-2790:
---------------------------------------

{quote}
As suggested by Devaraj, the time can be calculated once in 
JobInProgress.findNewTask() and that value passed to 
TaskInProgress.hasSpeculativeTask(). 
{quote}

+1, this is the clear solution.

Please do move the runSpeculative check up.
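
A minimal sketch of both points, assuming heavily simplified stand-ins for the real classes: Tip, lagMillis, and findSpeculativeCandidate are hypothetical names, and the real hasSpeculativeTask checks more conditions than shown here. The clock is read once per scan and passed down, and the cheap runSpeculative test runs before anything else.

```java
import java.util.List;

class SpeculationSketch {
    // Hypothetical stand-in for TaskInProgress; fields are illustrative only.
    static class Tip {
        final long startTime;
        final boolean runSpeculative;

        Tip(long startTime, boolean runSpeculative) {
            this.startTime = startTime;
            this.runSpeculative = runSpeculative;
        }

        // Receives the current time instead of reading the clock itself.
        boolean hasSpeculativeTask(long now, long lagMillis) {
            if (!runSpeculative) {
                return false; // cheapest check first
            }
            return now - startTime >= lagMillis;
        }
    }

    // Scans the tips with a single, caller-supplied timestamp.
    // Returns the index of the first candidate, or -1 if there is none.
    static int findSpeculativeCandidate(List<Tip> tips, long now, long lagMillis) {
        for (int i = 0; i < tips.size(); i++) {
            if (tips.get(i).hasSpeculativeTask(now, lagMillis)) {
                return i;
            }
        }
        return -1;
    }

    // Hypothetical stand-in for the scan in JobInProgress.findNewTask().
    static int findNewTask(List<Tip> tips, long lagMillis) {
        long now = System.currentTimeMillis(); // one clock read per scan
        return findSpeculativeCandidate(tips, now, lagMillis);
    }
}
```

Passing the timestamp in also makes the check deterministic, which helps when testing it.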

I think the commit-pending boolean would be pretty easy to add. If we avoid 
calling .values(), we avoid creating a second collection for each tip. Also, 
some of our customers run with the max task failures set to 100, so scanning 
the tasks in a tip is *not* free.
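
One way the flag could be maintained, as a sketch only: CommitPendingSketch, State, and updateStatus are hypothetical names, not the actual TaskInProgress API. This version keeps a small counter rather than a bare boolean so that the answer stays correct when an attempt leaves the commit-pending state; the scan is included only for comparison.

```java
import java.util.HashMap;
import java.util.Map;

class CommitPendingSketch {
    // Hypothetical, reduced set of attempt states.
    enum State { RUNNING, COMMIT_PENDING, FAILED }

    private final Map<String, State> attempts = new HashMap<>();
    private int commitPendingCount = 0;

    // Updates one attempt and adjusts the counter in O(1).
    void updateStatus(String attemptId, State newState) {
        State old = attempts.put(attemptId, newState);
        if (old == State.COMMIT_PENDING) {
            commitPendingCount--;
        }
        if (newState == State.COMMIT_PENDING) {
            commitPendingCount++;
        }
    }

    // Constant-time check; no collection is created or scanned.
    boolean isCommitPending() {
        return commitPendingCount > 0;
    }

    // The scan this replaces: touches every attempt in the tip.
    boolean isCommitPendingByScan() {
        for (State s : attempts.values()) {
            if (s == State.COMMIT_PENDING) {
                return true;
            }
        }
        return false;
    }
}
```

With 100 allowed failures per task, the scan can visit up to 100 entries per tip on every call, while the counter costs the same regardless.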

> TaskInProgress.hasSpeculativeTask is very inefficient
> -----------------------------------------------------
>
>                 Key: HADOOP-2790
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2790
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Owen O'Malley
>             Fix For: 0.16.1
>
>
> Each call to JobInProgress.findNewTask can call 
> TaskInProgress.hasSpeculativeTask once per task. Each call to 
> hasSpeculativeTask calls System.currentTimeMillis, which can result in 
> hundreds of thousands of calls to currentTimeMillis. Additionally, it 
> calls TaskInProgress.isOnlyCommitPending, which calls .values() on the map 
> from task id to host name and iterates through the values to see if any of 
> the tasks are in commit pending. It would be better to have a commit pending 
> boolean flag in the TaskInProgress. There also appear to be other 
> opportunities here, but those are the ones that jumped out at me. We should 
> also look at this method in the profiler.
