Hi Krishna,

To better understand the problem, could you please share the following information:

1) The number of nodes and running applications in the cluster
2) What version of Hadoop are you running?
3) Have you set "yarn.scheduler.capacity.schedule-asynchronously.enable"=true?
4) What is "yarn.resourcemanager.nodemanagers.heartbeat-interval-ms" set to in your configuration?
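One way to answer questions 3 and 4 is to read the two properties straight out of the cluster's configuration files. The following is only a minimal sketch: the class name `YarnConfProbe` and its methods are made up for illustration, it uses only the JDK's XML parser rather than Hadoop's own `Configuration` class, and it assumes the files use the usual Hadoop `<configuration>/<property>/<name>/<value>` layout. A property that is missing from a file is simply reported as not set there; the effective value would then come from another file or from the built-in default.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper (not part of Hadoop): lists the two settings asked about above.
class YarnConfProbe {
    static final String[] KEYS = {
        "yarn.scheduler.capacity.schedule-asynchronously.enable",
        "yarn.resourcemanager.nodemanagers.heartbeat-interval-ms"
    };

    // Parses a Hadoop-style configuration XML stream into a name -> value map.
    static Map<String, String> readProperties(InputStream in) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
            Map<String, String> props = new HashMap<>();
            NodeList nodes = doc.getElementsByTagName("property");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element property = (Element) nodes.item(i);
                NodeList names = property.getElementsByTagName("name");
                NodeList values = property.getElementsByTagName("value");
                if (names.getLength() > 0 && values.getLength() > 0) {
                    props.put(names.item(0).getTextContent().trim(),
                              values.item(0).getTextContent().trim());
                }
            }
            return props;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse configuration XML", e);
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: java YarnConfProbe <config.xml> [<config.xml> ...]");
            return;
        }
        // Each argument is a path to a configuration file chosen by the caller.
        for (String path : args) {
            try (InputStream in = Files.newInputStream(Paths.get(path))) {
                Map<String, String> props = readProperties(in);
                for (String key : KEYS) {
                    System.out.println(path + ": " + key + " = "
                            + props.getOrDefault(key, "(not set in this file)"));
                }
            }
        }
    }
}
```

Pass it the path of each configuration file the ResourceManager reads, since which file carries each setting depends on how the cluster was set up.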

Thanks,
Wangda Tan

On Sun, Aug 10, 2014 at 11:29 PM, Krishna Kishore Bonagiri <[email protected]> wrote:

> Hi,
> My YARN resource manager is consuming 100% CPU when I am running an
> application that is running for about 10 hours, requesting as many as 27000
> containers. The CPU consumption was very low at the start of my
> application, and it gradually went up to over 100%. Is this a known issue
> or are we doing something wrong?
>
> Every dump of the Event Processor thread is running
> LeafQueue::assignContainers(), specifically the for loop below from
> LeafQueue.java, and seems to be looping through some priority list.
>
>     // Try to assign containers to applications in order
>     for (FiCaSchedulerApp application : activeApplications) {
>       ...
>       // Schedule in priority order
>       for (Priority priority : application.getPriorities()) {
>
> 3XMTHREADINFO "ResourceManager Event Processor" J9VMThread:0x0000000001D08600, j9thread_t:0x00007F032D2FAA00, java/lang/Thread:0x000000008341D9A0, state:CW, prio=5
> 3XMJAVALTHREAD (java/lang/Thread getId:0x1E, isDaemon:false)
> 3XMTHREADINFO1 (native thread ID:0x4B64, native priority:0x5, native policy:UNKNOWN)
> 3XMTHREADINFO2 (native stack address range from:0x00007F0313DF8000, to:0x00007F0313E39000, size:0x41000)
> 3XMCPUTIME *CPU usage total: 42334.614623696 secs*
> 3XMHEAPALLOC Heap bytes allocated since last GC cycle=20456 (0x4FE8)
> 3XMTHREADINFO3 Java callstack:
> 4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.assignContainers(LeafQueue.java:850(Compiled Code))
> 5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp@0x000000008360DFE0, entry count: 1)
> 5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue@0x00000000833B9280, entry count: 1)
> 4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainersToChildQueues(ParentQueue.java:655(Compiled Code))
> 5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80, entry count: 2)
> 4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainers(ParentQueue.java:569(Compiled Code))
> 5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80, entry count: 1)
> 4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:831(Compiled Code))
> 5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler@0x00000000834037C8, entry count: 1)
> 4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:878(Compiled Code))
> 4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:100(Compiled Code))
> 4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
> 4XESTACKTRACE at java/lang/Thread.run(Thread.java:853)
>
> 3XMTHREADINFO "ResourceManager Event Processor" J9VMThread:0x0000000001D08600, j9thread_t:0x00007F032D2FAA00, java/lang/Thread:0x000000008341D9A0, state:CW, prio=5
> 3XMJAVALTHREAD (java/lang/Thread getId:0x1E, isDaemon:false)
> 3XMTHREADINFO1 (native thread ID:0x4B64, native priority:0x5, native policy:UNKNOWN)
> 3XMTHREADINFO2 (native stack address range from:0x00007F0313DF8000, to:0x00007F0313E39000, size:0x41000)
> 3XMCPUTIME CPU usage total: 42379.604203548 secs
> 3XMHEAPALLOC Heap bytes allocated since last GC cycle=57280 (0xDFC0)
> 3XMTHREADINFO3 Java callstack:
> 4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.assignContainers(LeafQueue.java:841(Compiled Code))
> 5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp@0x000000008360DFE0, entry count: 1)
> 5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue@0x00000000833B9280, entry count: 1)
> 4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainersToChildQueues(ParentQueue.java:655(Compiled Code))
> 5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80, entry count: 2)
> 4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainers(ParentQueue.java:569(Compiled Code))
> 5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80, entry count: 1)
> 4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:831(Compiled Code))
> 5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler@0x00000000834037C8, entry count: 1)
> 4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:878(Compiled Code))
> 4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:100(Compiled Code))
> 4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
> 4XESTACKTRACE at java/lang/Thread.run(Thread.java:853)
>
> 3XMTHREADINFO "ResourceManager Event Processor" J9VMThread:0x0000000001D08600, j9thread_t:0x00007F032D2FAA00, java/lang/Thread:0x000000008341D9A0, state:CW, prio=5
> 3XMJAVALTHREAD (java/lang/Thread getId:0x1E, isDaemon:false)
> 3XMTHREADINFO1 (native thread ID:0x4B64, native priority:0x5, native policy:UNKNOWN)
> 3XMTHREADINFO2 (native stack address range from:0x00007F0313DF8000, to:0x00007F0313E39000, size:0x41000)
> 3XMCPUTIME CPU usage total: 42996.394528764 secs
> 3XMHEAPALLOC Heap bytes allocated since last GC cycle=475576 (0x741B8)
> 3XMTHREADINFO3 Java callstack:
> 4XESTACKTRACE at java/util/TreeMap.successor(TreeMap.java:2001(Compiled Code))
> 4XESTACKTRACE at java/util/TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1127(Compiled Code))
> 4XESTACKTRACE at java/util/TreeMap$KeyIterator.next(TreeMap.java:1180(Compiled Code))
> 4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.assignContainers(LeafQueue.java:838(Compiled Code))
> 5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp@0x000000008360DFE0, entry count: 1)
> 5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue@0x00000000833B9280, entry count: 1)
> 4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainersToChildQueues(ParentQueue.java:655(Compiled Code))
> 5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80, entry count: 2)
> 4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainers(ParentQueue.java:569(Compiled Code))
> 5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80, entry count: 1)
> 4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:831(Compiled Code))
> 5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler@0x00000000834037C8, entry count: 1)
> 4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:878(Compiled Code))
> 4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:100(Compiled Code))
> 4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
> 4XESTACKTRACE at java/lang/Thread.run(Thread.java:853)
>
> Thanks,
> Kishore
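
The three dumps above each carry a `3XMCPUTIME` line for the same native thread (ID 0x4B64), so the CPU time the Event Processor burned between dumps can be read off by subtraction. Below is a minimal sketch of that arithmetic; the class name `JavacoreCpuDelta` and its methods are hypothetical, and it assumes the input text contains only the stanzas for the thread of interest (a full javacore has a `3XMCPUTIME` line per thread and would need filtering first).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts per-dump CPU totals and the differences between them.
class JavacoreCpuDelta {
    // Matches e.g. "3XMCPUTIME CPU usage total: 42379.604203548 secs";
    // the optional '*' tolerates the emphasis markers seen in the first dump.
    private static final Pattern CPU_LINE = Pattern.compile(
            "3XMCPUTIME\\s+\\*?CPU usage total:\\s*([0-9]+(?:\\.[0-9]+)?)\\s*secs");

    // Returns every CPU total found in the text, in order of appearance.
    static List<Double> cpuTotals(String text) {
        List<Double> totals = new ArrayList<>();
        Matcher m = CPU_LINE.matcher(text);
        while (m.find()) {
            totals.add(Double.parseDouble(m.group(1)));
        }
        return totals;
    }

    // Returns the difference between each total and the one before it.
    static List<Double> deltas(List<Double> totals) {
        List<Double> out = new ArrayList<>();
        for (int i = 1; i < totals.size(); i++) {
            out.add(totals.get(i) - totals.get(i - 1));
        }
        return out;
    }

    public static void main(String[] args) {
        // The three CPU lines copied from the dumps in this thread.
        String sample =
                "> 3XMCPUTIME *CPU usage total: 42334.614623696 secs*\n"
              + "> 3XMCPUTIME CPU usage total: 42379.604203548 secs\n"
              + "> 3XMCPUTIME CPU usage total: 42996.394528764 secs\n";
        for (double d : deltas(cpuTotals(sample))) {
            System.out.printf("CPU seconds consumed between dumps: %.1f%n", d);
        }
    }
}
```

For the values quoted here the differences come to about 45 and about 617 CPU seconds. The dumps as posted do not include wall-clock timestamps, so these deltas cannot be turned into a utilization percentage without knowing when each dump was taken.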
