[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13659621#comment-13659621
 ] 

Arun C Murthy commented on MAPREDUCE-4366:
------------------------------------------

Sorry, I've had a hard time coming around to this.

{quote}
There didn't seem to be a clear definition of speculative(Map|Reduce)Tasks, so 
the one I came up with is that the number of speculative(Map|Reduce)Tasks is 
the number of attempts running that are not on the critical path of the job 
completing. This makes sense in the context of computing pending(Map|Reduce)s, 
which is the only place the variable is used.
{quote}

Thanks for the explanation.

The definition of speculative(Map|Reduce)Tasks, at least in my head, has been 
the number of task-attempts that have an alternate... no, it's not a great one, 
or a documented one! *smile*

However, this has been the basis for a number of assumptions related to 
computing pending tasks etc. in various schedulers. (See the call hierarchy 
for JIP.pendingTasks.)

Since your change redefines this, I'm afraid it breaks schedulers, e.g. 
CapacityScheduler. Hence, I'm against the change.

I fully agree it isn't ideal, but I'd rather not make invasive changes in MR1 - 
the JT/JIP/Scheduler nexus scares me a lot... in fact, I'm officially terrified 
of it! *smile*

Now, to get around the metrics problem, how about making a more local change in 
JIP.garbageCollect? 

An option is to just call decWaiting(Maps|Reduces) in JIP.garbageCollect with 
JIP.num(Maps|Reduces). Currently, if you follow the opposite side, i.e. 
addWaiting(Maps|Reduces), those calls are just static and are done at 
JIP.initTasks with num(Maps|Reduces). Would that solve the immediate problem 
at hand?
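
To make the suggestion concrete, here is a minimal sketch of the symmetric 
bookkeeping. It is not the actual JobTracker code: the Metrics and Job classes 
below are hypothetical stand-ins for the JobTracker instrumentation and JIP, 
and the initialized/collected guards are an assumption added so that a repeated 
garbageCollect cannot decrement twice. The sketch models only the 
initTasks/garbageCollect pairing and ignores any per-task updates to the 
counters.

```java
// Illustrative sketch only: hypothetical stand-ins, not the real Hadoop classes.
public class WaitingMetricsSketch {

    /** Stand-in for the JobTracker-wide waiting_maps / waiting_reduces gauges. */
    static class Metrics {
        private int waitingMaps;
        private int waitingReduces;

        synchronized void addWaitingMaps(int n) { waitingMaps += n; }
        synchronized void decWaitingMaps(int n) { waitingMaps -= n; }
        synchronized void addWaitingReduces(int n) { waitingReduces += n; }
        synchronized void decWaitingReduces(int n) { waitingReduces -= n; }
        synchronized int getWaitingMaps() { return waitingMaps; }
        synchronized int getWaitingReduces() { return waitingReduces; }
    }

    /** Stand-in for JIP: adds at init, removes the same amount at cleanup. */
    static class Job {
        private final Metrics metrics;
        private final int numMaps;
        private final int numReduces;
        private boolean initialized; // assumption: guard against double init
        private boolean collected;   // assumption: guard against double cleanup

        Job(Metrics metrics, int numMaps, int numReduces) {
            this.metrics = metrics;
            this.numMaps = numMaps;
            this.numReduces = numReduces;
        }

        /** Mirrors the static add done once with num(Maps|Reduces). */
        synchronized void initTasks() {
            if (initialized) {
                return;
            }
            metrics.addWaitingMaps(numMaps);
            metrics.addWaitingReduces(numReduces);
            initialized = true;
        }

        /** Proposed counterpart: subtract the same static totals. */
        synchronized void garbageCollect() {
            if (!initialized || collected) {
                return;
            }
            metrics.decWaitingMaps(numMaps);
            metrics.decWaitingReduces(numReduces);
            collected = true;
        }
    }

    public static void main(String[] args) {
        Metrics metrics = new Metrics();
        Job a = new Job(metrics, 10, 2);
        Job b = new Job(metrics, 5, 1);

        a.initTasks();
        b.initTasks();
        System.out.println(metrics.getWaitingMaps());    // 15
        System.out.println(metrics.getWaitingReduces()); // 3

        a.garbageCollect();
        a.garbageCollect(); // second call is a no-op because of the guard
        System.out.println(metrics.getWaitingMaps());    // 5

        b.garbageCollect();
        System.out.println(metrics.getWaitingMaps());    // 0
        System.out.println(metrics.getWaitingReduces()); // 0
    }
}
```

Because both sides use the same static num(Maps|Reduces) value, whatever a job 
adds at init is exactly what it removes at cleanup, so in this simplified model 
the gauges cannot go negative.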

Thoughts?

----

Thanks again for checking in with me, and being patient in working through the 
mess we have!
                
> mapred metrics shows negative count of waiting maps and reduces
> ---------------------------------------------------------------
>
>                 Key: MAPREDUCE-4366
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4366
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobtracker
>    Affects Versions: 1.0.2
>            Reporter: Thomas Graves
>            Assignee: Sandy Ryza
>         Attachments: MAPREDUCE-4366-branch-1-1.patch, 
> MAPREDUCE-4366-branch-1.patch
>
>
> Negative waiting_maps and waiting_reduces counts are observed in the mapred 
> metrics.  MAPREDUCE-1238 partially fixed this, but it appears there are still 
> issues, as we are still seeing it, though it is not as bad.
