[ 
https://issues.apache.org/jira/browse/HADOOP-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12566474#action_12566474
 ] 

Amar Kamat commented on HADOOP-2790:
------------------------------------

bq. Also, another obvious optimization is to check whether the speculative 
execution flag is true up front.

I noticed that too a few days back, but I thought HADOOP-2141 might fix it.
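
For illustration, the up-front flag check from the quoted suggestion could look roughly like the sketch below. {{SketchTask}} and {{SketchJob}} are hypothetical stand-ins, not the real {{TaskInProgress}} / {{JobInProgress}} classes, and the {{lagging}} field merely stands in for the full set of speculative conditions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for TaskInProgress; only what the sketch needs.
class SketchTask {
    final String id;
    final boolean lagging;      // stands in for the real progress/lag conditions
    int speculativeChecks = 0;  // counts how often the per-task check ran

    SketchTask(String id, boolean lagging) {
        this.id = id;
        this.lagging = lagging;
    }

    boolean hasSpeculativeTask() {
        speculativeChecks++;
        return lagging;
    }
}

// Hypothetical stand-in for the task scan done when looking for new work.
class SketchJob {
    private final boolean speculativeExecution; // job-level flag
    private final List<SketchTask> tasks = new ArrayList<>();

    SketchJob(boolean speculativeExecution) {
        this.speculativeExecution = speculativeExecution;
    }

    void add(SketchTask task) {
        tasks.add(task);
    }

    // Returns the first task that qualifies for speculation, or null.
    SketchTask findSpeculativeTask() {
        // Test the flag once, up front, instead of once per task.
        if (!speculativeExecution) {
            return null;
        }
        for (SketchTask task : tasks) {
            if (task.hasSpeculativeTask()) {
                return task;
            }
        }
        return null;
    }
}
```

With the flag off, the scan returns before touching any task, so the per-task check never runs.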
----
With HADOOP-2119, the number of calls to {{hasSpeculativeTask()}} should drop, 
since we are optimizing the look-ups for higher-priority runnable tasks and 
excluding speculative ones from those look-ups entirely. The check for 
speculative tasks will then be done only if there is nothing else to run. But 
+1 to doing better than making all the checks all the time. 
The following conditions are used to decide 
{{TaskInProgress.hasSpeculativeTask()}} :
- activeTasks.size() <= MAX_TASK_EXECS _[seems ok]_
- runSpeculative _[should be done earlier, but ok]_
- averageProgress - progress >= SPECULATIVE_GAP _[seems ok]_
- System.currentTimeMillis() - startTime >= SPECULATIVE_LAG :
    This could be checked once in {{TaskInProgress.recomputeProgress()}}, so 
that {{hasSpeculativeTask()}} repeats the check only if the earlier one 
returned {{false}}. We can probably still do better, but my guess is that we 
can't totally avoid {{System.currentTimeMillis()}} in 
{{TaskInProgress.hasSpeculativeTask()}}, no?
- completes == 0 _[ok]_
- !isOnlyCommitPending() :
    Maybe a Map of _COMMIT_PENDING_ tasks can be maintained in 
_TaskInProgress_, so that the only check made is 
{{commitPendingStatuses.size() > 0 && commitPendingStatuses.containsKey(taskId)}}. 
The space requirement stays the same; the re-arrangement would be done in 
{{TaskInProgress.recomputeProgress()}}.
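
A rough sketch of how the conditions above could be arranged, assuming a lag flag latched in {{recomputeProgress()}} and a set of commit-pending attempts. {{SketchTip}}, its fields, the constant values and the {{now}} parameter (one clock reading passed in by the caller per scan) are all assumptions for illustration, not the actual {{TaskInProgress}} code.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for TaskInProgress.
class SketchTip {
    // Illustrative values only; not taken from the Hadoop source.
    static final int MAX_TASK_EXECS = 1;
    static final double SPECULATIVE_GAP = 0.2;
    static final long SPECULATIVE_LAG = 60_000L;

    private final boolean runSpeculative;
    private final long startTime;
    private final Set<String> activeTasks = new HashSet<>();
    private final Set<String> commitPending = new HashSet<>();
    private int completes = 0;
    private double progress = 0.0;
    private boolean lagElapsed = false; // latched: once true it stays true

    SketchTip(boolean runSpeculative, long startTime) {
        this.runSpeculative = runSpeculative;
        this.startTime = startTime;
    }

    void addAttempt(String taskId) { activeTasks.add(taskId); }

    void markCommitPending(String taskId) { commitPending.add(taskId); }

    void markComplete() { completes++; }

    // Called on status updates; also latches the lag condition.
    void recomputeProgress(double newProgress, long now) {
        progress = newProgress;
        if (!lagElapsed && now - startTime >= SPECULATIVE_LAG) {
            lagElapsed = true;
        }
    }

    // Constant-time replacement for iterating over all task statuses.
    boolean isOnlyCommitPending() {
        return !commitPending.isEmpty();
    }

    boolean hasSpeculativeTask(double averageProgress, long now) {
        // Cheapest checks first, so most calls return early.
        if (!runSpeculative || completes != 0) return false;
        if (activeTasks.size() > MAX_TASK_EXECS) return false;
        if (isOnlyCommitPending()) return false;
        if (averageProgress - progress < SPECULATIVE_GAP) return false;
        // Fall back to the caller-supplied time only if not yet latched.
        if (!lagElapsed && now - startTime >= SPECULATIVE_LAG) {
            lagElapsed = true;
        }
        return lagElapsed;
    }
}
```

Passing {{now}} in lets the caller read the clock once per scan rather than once per task; whether that fits the real call sites is something the profiler would need to confirm.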
----
Comments?

> TaskInProgress.hasSpeculativeTask is very inefficient
> -----------------------------------------------------
>
>                 Key: HADOOP-2790
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2790
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Owen O'Malley
>             Fix For: 0.16.1
>
>
> Each call to JobInProgress.findNewTask can call 
> TaskInProgress.hasSpeculativeTask once per task. Each call to 
> hasSpeculativeTask calls System.currentTimeMillis, which can result in 
> hundreds of thousands of calls to currentTimeMillis. Additionally, it 
> calls TaskInProgress.isOnlyCommitPending, which calls .values() on the map 
> from task id to host name and iterates through them to see if any of the 
> tasks are in commit pending. It would be better to have a commit pending 
> boolean flag in the TaskInProgress. It also looks like there are other 
> opportunities here, but those jumped out at me. We should also look at this 
> method in the profiler.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
