[
https://issues.apache.org/jira/browse/HADOOP-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12566496#action_12566496
]
Amar Kamat commented on HADOOP-2790:
------------------------------------
{quote}
!isOnlyCommitPending() :
Maybe a Map of COMMIT_PENDING tasks can be maintained in TaskInProgress, so that
the only check needed is commitPendingStatuses.size() > 0 &&
commitPendingStatuses.contains(taskId). The space requirement will be the same,
with a rearrangement to be done in TaskInProgress.recomputeProgress().
{quote}
Actually, the list of task statuses will be pretty small, so we can either keep
what is currently done or maintain a boolean flag as Owen mentioned. +1.
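To make the two options concrete, here is a minimal sketch, not the actual Hadoop code. {{TipSketch}}, {{updateStatus()}}, {{hasCommitPending()}} and the {{State}} enum are hypothetical names invented for illustration. It keeps a small counter (rather than a bare boolean, which is my own assumption so that the value stays correct when an attempt leaves COMMIT_PENDING) next to the scan-based check.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for TaskInProgress; all names here are illustrative.
class TipSketch {
    enum State { RUNNING, COMMIT_PENDING, SUCCEEDED }

    // Latest known state per task attempt id.
    private final Map<String, State> taskStatuses = new HashMap<String, State>();

    // Number of attempts currently in COMMIT_PENDING, maintained on every update.
    private int commitPendingCount = 0;

    void updateStatus(String taskId, State newState) {
        State old = taskStatuses.put(taskId, newState);
        if (old == State.COMMIT_PENDING) {
            commitPendingCount--;
        }
        if (newState == State.COMMIT_PENDING) {
            commitPendingCount++;
        }
    }

    // O(1) check that relies on the maintained counter.
    boolean hasCommitPending() {
        return commitPendingCount > 0;
    }

    // Scan-based check: walks every status on each call.
    boolean hasCommitPendingByScan() {
        for (State s : taskStatuses.values()) {
            if (s == State.COMMIT_PENDING) {
                return true;
            }
        }
        return false;
    }
}
```

With only a handful of attempts per TIP the scan is cheap, which is why either approach seems acceptable; the counter just removes the iteration from the hot path.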
{quote}
System.currentTimeMillis() - startTime >= SPECULATIVE_LAG
{quote}
As suggested by Devaraj, the time can be computed once in
{{JobInProgress.findNewTask()}} and passed to
{{TaskInProgress.hasSpeculativeTask()}}. The chance of skipping a TIP that
should be speculated will be extremely low compared to
using the time in {{TaskInProgress.recomputeProgress()}}.
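A rough sketch of that idea, under stated assumptions: {{SpeculationSketch}} and its nested {{Tip}} class are hypothetical, the value of {{SPECULATIVE_LAG}} is assumed, and only the lag comparison quoted above is modelled (the real method checks more than this). The clock is read once per {{findNewTask()}} call and the value is handed to every TIP.

```java
import java.util.List;

// Hypothetical sketch; not the actual JobInProgress/TaskInProgress code.
class SpeculationSketch {
    // Assumed value for illustration only.
    static final long SPECULATIVE_LAG = 60 * 1000L;

    static class Tip {
        final long startTime;
        final boolean speculativeAlreadyRunning;

        Tip(long startTime, boolean speculativeAlreadyRunning) {
            this.startTime = startTime;
            this.speculativeAlreadyRunning = speculativeAlreadyRunning;
        }

        // Takes the caller's timestamp instead of reading the clock itself.
        boolean hasSpeculativeTask(long currentTime) {
            return !speculativeAlreadyRunning
                && currentTime - startTime >= SPECULATIVE_LAG;
        }
    }

    // Reads the clock once, then reuses the value for every TIP.
    static Tip findNewTask(List<Tip> tips) {
        return findNewTask(tips, System.currentTimeMillis());
    }

    // Overload with an explicit timestamp, convenient for testing.
    static Tip findNewTask(List<Tip> tips, long now) {
        for (Tip tip : tips) {
            if (tip.hasSpeculativeTask(now)) {
                return tip;
            }
        }
        return null;
    }
}
```

The timestamp can be stale by at most the duration of one {{findNewTask()}} pass, which is tiny compared to the lag threshold.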
> TaskInProgress.hasSpeculativeTask is very inefficient
> -----------------------------------------------------
>
> Key: HADOOP-2790
> URL: https://issues.apache.org/jira/browse/HADOOP-2790
> Project: Hadoop Core
> Issue Type: Bug
> Components: mapred
> Reporter: Owen O'Malley
> Fix For: 0.16.1
>
>
> Each call to JobInProgress.findNewTask can call
> TaskInProgress.hasSpeculativeTask once per task. Each call to
> hasSpeculativeTask calls System.currentTimeMillis, which can result in
> hundreds of thousands of calls to currentTimeMillis. Additionally, it
> calls TaskInProgress.isOnlyCommitPending, which calls .values() on the map
> from task id to host name and iterates through them to see if any of the
> tasks are in commit pending. It would be better to have a commit pending
> boolean flag in the TaskInProgress. It also looks like there are other
> opportunities here, but those jumped out at me. We should also look at this
> method in the profiler.