[ https://issues.apache.org/jira/browse/MAPREDUCE-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Koji Noguchi updated MAPREDUCE-1684:
------------------------------------
Attachment: mapreduce-1684-v1.0.2-1.patch
bq. Currently, CapacityTaskScheduler.assignTasks() calls getClusterStatus() thrice
I think it calls getClusterStatus() #jobs times in the worst case.
For each heartbeat from a TaskTracker with some slots available:
{noformat}
heartbeat --> assignTasks
  --> addMap/ReduceTasks
    --> TaskSchedulingMgr.assignTasks
      --> For each queue : queuesForAssigningTasks
        --> getTaskFromQueue(queue)
          --> For each j : queue.getRunningJobs()
            --> obtainNewTask --> **getClusterStatus**
{noformat}
bq. It can be cached in assignTasks() and re-used.
Attaching a patch. Would this work?
The motivation is that getClusterStatus() shows up far too often in our jstack output, holding the global JobTracker lock.
{noformat}
"IPC Server handler 15 on 50300" daemon prio=10 tid=0x000000005fc5d800 nid=0x6828 runnable [0x0000000044847000]
   java.lang.Thread.State: RUNNABLE
    at org.apache.hadoop.mapred.JobTracker.getClusterStatus(JobTracker.java:4065)
    - locked <0x00002aab6e638bd8> (a org.apache.hadoop.mapred.JobTracker)
    at org.apache.hadoop.mapred.CapacityTaskScheduler$MapSchedulingMgr.obtainNewTask(CapacityTaskScheduler.java:503)
    at org.apache.hadoop.mapred.CapacityTaskScheduler$TaskSchedulingMgr.getTaskFromQueue(CapacityTaskScheduler.java:322)
    at org.apache.hadoop.mapred.CapacityTaskScheduler$TaskSchedulingMgr.assignTasks(CapacityTaskScheduler.java:419)
    at org.apache.hadoop.mapred.CapacityTaskScheduler$TaskSchedulingMgr.access$500(CapacityTaskScheduler.java:150)
    at org.apache.hadoop.mapred.CapacityTaskScheduler.addMapTasks(CapacityTaskScheduler.java:1075)
    at org.apache.hadoop.mapred.CapacityTaskScheduler.assignTasks(CapacityTaskScheduler.java:1044)
    - locked <0x00002aab6e7ffb10> (a org.apache.hadoop.mapred.CapacityTaskScheduler)
    at org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:3398)
    - locked <0x00002aab6e638bd8> (a org.apache.hadoop.mapred.JobTracker)
    at sun.reflect.GeneratedMethodAccessor24.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)
{noformat}
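To illustrate the caching idea, here is a minimal, self-contained sketch. It is not the attached patch and does not use the real Hadoop classes: Tracker, Status, assignTasksUncached and assignTasksCached are hypothetical stand-ins, and the slot count of 100 is an arbitrary placeholder. It only models the difference between fetching the status inside the per-job loop and fetching it once per heartbeat, by counting how many times the synchronized getClusterStatus() is entered.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class ClusterStatusCachingSketch {

    /** Hypothetical stand-in for ClusterStatus; carries only a slot count. */
    static final class Status {
        final int maxMapTasks;
        Status(int maxMapTasks) { this.maxMapTasks = maxMapTasks; }
    }

    /** Hypothetical stand-in for JobTracker; counts entries into the locked method. */
    static final class Tracker {
        final AtomicInteger statusCalls = new AtomicInteger();

        // Synchronized to mimic taking the global tracker lock on every call.
        synchronized Status getClusterStatus() {
            statusCalls.incrementAndGet();
            return new Status(100); // arbitrary placeholder value
        }
    }

    /** Uncached: the status is fetched once per running job in every queue. */
    static int assignTasksUncached(Tracker tracker, List<List<String>> queues) {
        int jobsConsidered = 0;
        for (List<String> queue : queues) {
            for (String job : queue) {
                Status status = tracker.getClusterStatus();
                if (status.maxMapTasks > 0) {
                    jobsConsidered++;
                }
            }
        }
        return jobsConsidered;
    }

    /** Cached: the status is fetched once per heartbeat and re-used in the loops. */
    static int assignTasksCached(Tracker tracker, List<List<String>> queues) {
        Status status = tracker.getClusterStatus();
        int jobsConsidered = 0;
        for (List<String> queue : queues) {
            for (String job : queue) {
                if (status.maxMapTasks > 0) {
                    jobsConsidered++;
                }
            }
        }
        return jobsConsidered;
    }

    public static void main(String[] args) {
        // Two queues with three and two running jobs (made-up job names).
        List<List<String>> queues = Arrays.asList(
                Arrays.asList("job1", "job2", "job3"),
                Arrays.asList("job4", "job5"));

        Tracker uncached = new Tracker();
        assignTasksUncached(uncached, queues);
        System.out.println("uncached getClusterStatus calls: " + uncached.statusCalls.get());

        Tracker cached = new Tracker();
        assignTasksCached(cached, queues);
        System.out.println("cached getClusterStatus calls: " + cached.statusCalls.get());
    }
}
```

With five running jobs across two queues, the uncached variant enters the locked method five times per heartbeat and the cached variant once. The trade-off in this sketch is that all jobs within one heartbeat see the same snapshot of the cluster status.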
> ClusterStatus can be cached in CapacityTaskScheduler.assignTasks()
> ------------------------------------------------------------------
>
> Key: MAPREDUCE-1684
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1684
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: capacity-sched
> Reporter: Amareshwari Sriramadasu
> Attachments: mapreduce-1684-v1.0.2-1.patch
>
>
> Currently, CapacityTaskScheduler.assignTasks() calls getClusterStatus()
> thrice: once in assignTasks(), once in MapTaskScheduler and once in
> ReduceTaskScheduler. It can be cached in assignTasks() and re-used.