Hi Wangda,

Thanks for the reply. Here are the details; please let me know if you can suggest anything.
1) Number of nodes and running app in the cluster

2 nodes. I am running my own application, which keeps asking for containers and then:
a) runs something on the containers,
b) releases the containers,
c) asks for more containers with an incremented priority value,
and repeats the same process.

2) What's the version of your Hadoop?

apache hadoop-2.4.0

3) Have you set "yarn.scheduler.capacity.schedule-asynchronously.enable"=true?

No

4) What's the "yarn.resourcemanager.nodemanagers.heartbeat-interval-ms" in your configuration?

50

On Tue, Aug 12, 2014 at 12:44 PM, Wangda Tan <wheele...@gmail.com> wrote:

> Hi Krishna,
> To get more understanding about the problem, could you please share
> following information:
> 1) Number of nodes and running app in the cluster
> 2) What's the version of your Hadoop?
> 3) Have you set
> "yarn.scheduler.capacity.schedule-asynchronously.enable"=true?
> 4) What's the "yarn.resourcemanager.nodemanagers.heartbeat-interval-ms" in
> your configuration?
>
> Thanks,
> Wangda Tan
>
>
>
> On Sun, Aug 10, 2014 at 11:29 PM, Krishna Kishore Bonagiri <
> write2kish...@gmail.com> wrote:
>
>> Hi,
>> My YARN resource manager is consuming 100% CPU when I am running an
>> application that is running for about 10 hours, requesting as many as 27000
>> containers. The CPU consumption was very low at the starting of my
>> application, and it gradually went high to over 100%. Is this a known issue
>> or are we doing something wrong?
>>
>> Every dump of the Event Processor thread is running
>> LeafQueue::assignContainers() specifically the for loop below from
>> LeafQueue.java and seems to be looping through some priority list.
>>
>> // Try to assign containers to applications in order
>> for (FiCaSchedulerApp application : activeApplications) {
>> ...
>> // Schedule in priority order
>> for (Priority priority : application.getPriorities()) {
>>
>> 3XMTHREADINFO "ResourceManager Event Processor" J9VMThread:0x0000000001D08600, j9thread_t:0x00007F032D2FAA00, java/lang/Thread:0x000000008341D9A0, state:CW, prio=5
>> 3XMJAVALTHREAD (java/lang/Thread getId:0x1E, isDaemon:false)
>> 3XMTHREADINFO1 (native thread ID:0x4B64, native priority:0x5, native policy:UNKNOWN)
>> 3XMTHREADINFO2 (native stack address range from:0x00007F0313DF8000, to:0x00007F0313E39000, size:0x41000)
>> 3XMCPUTIME *CPU usage total: 42334.614623696 secs*
>> 3XMHEAPALLOC Heap bytes allocated since last GC cycle=20456 (0x4FE8)
>> 3XMTHREADINFO3 Java callstack:
>> 4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.assignContainers(LeafQueue.java:850(Compiled Code))
>> 5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp@0x000000008360DFE0, entry count: 1)
>> 5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue@0x00000000833B9280, entry count: 1)
>> 4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainersToChildQueues(ParentQueue.java:655(Compiled Code))
>> 5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80, entry count: 2)
>> 4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainers(ParentQueue.java:569(Compiled Code))
>> 5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80, entry count: 1)
>> 4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:831(Compiled Code))
>> 5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler@0x00000000834037C8, entry count: 1)
>> 4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:878(Compiled Code))
>> 4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:100(Compiled Code))
>> 4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
>> 4XESTACKTRACE at java/lang/Thread.run(Thread.java:853)
>>
>> 3XMTHREADINFO "ResourceManager Event Processor" J9VMThread:0x0000000001D08600, j9thread_t:0x00007F032D2FAA00, java/lang/Thread:0x000000008341D9A0, state:CW, prio=5
>> 3XMJAVALTHREAD (java/lang/Thread getId:0x1E, isDaemon:false)
>> 3XMTHREADINFO1 (native thread ID:0x4B64, native priority:0x5, native policy:UNKNOWN)
>> 3XMTHREADINFO2 (native stack address range from:0x00007F0313DF8000, to:0x00007F0313E39000, size:0x41000)
>> 3XMCPUTIME CPU usage total: 42379.604203548 secs
>> 3XMHEAPALLOC Heap bytes allocated since last GC cycle=57280 (0xDFC0)
>> 3XMTHREADINFO3 Java callstack:
>> 4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.assignContainers(LeafQueue.java:841(Compiled Code))
>> 5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp@0x000000008360DFE0, entry count: 1)
>> 5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue@0x00000000833B9280, entry count: 1)
>> 4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainersToChildQueues(ParentQueue.java:655(Compiled Code))
>> 5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80, entry count: 2)
>> 4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainers(ParentQueue.java:569(Compiled Code))
>> 5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80, entry count: 1)
>> 4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:831(Compiled Code))
>> 5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler@0x00000000834037C8, entry count: 1)
>> 4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:878(Compiled Code))
>> 4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:100(Compiled Code))
>> 4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
>> 4XESTACKTRACE at java/lang/Thread.run(Thread.java:853)
>>
>> 3XMTHREADINFO "ResourceManager Event Processor" J9VMThread:0x0000000001D08600, j9thread_t:0x00007F032D2FAA00, java/lang/Thread:0x000000008341D9A0, state:CW, prio=5
>> 3XMJAVALTHREAD (java/lang/Thread getId:0x1E, isDaemon:false)
>> 3XMTHREADINFO1 (native thread ID:0x4B64, native priority:0x5, native policy:UNKNOWN)
>> 3XMTHREADINFO2 (native stack address range from:0x00007F0313DF8000, to:0x00007F0313E39000, size:0x41000)
>> 3XMCPUTIME CPU usage total: 42996.394528764 secs
>> 3XMHEAPALLOC Heap bytes allocated since last GC cycle=475576 (0x741B8)
>> 3XMTHREADINFO3 Java callstack:
>> 4XESTACKTRACE at java/util/TreeMap.successor(TreeMap.java:2001(Compiled Code))
>> 4XESTACKTRACE at java/util/TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1127(Compiled Code))
>> 4XESTACKTRACE at java/util/TreeMap$KeyIterator.next(TreeMap.java:1180(Compiled Code))
>> 4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.assignContainers(LeafQueue.java:838(Compiled Code))
>> 5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp@0x000000008360DFE0, entry count: 1)
>> 5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue@0x00000000833B9280, entry count: 1)
>> 4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainersToChildQueues(ParentQueue.java:655(Compiled Code))
>> 5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80, entry count: 2)
>> 4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainers(ParentQueue.java:569(Compiled Code))
>> 5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80, entry count: 1)
>> 4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:831(Compiled Code))
>> 5XESTACKTRACE (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler@0x00000000834037C8, entry count: 1)
>> 4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:878(Compiled Code))
>> 4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:100(Compiled Code))
>> 4XESTACKTRACE at org/apache/hadoop/yarn/server/resourcemanager/ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
>> 4XESTACKTRACE at java/lang/Thread.run(Thread.java:853)
>>
>> Thanks,
>> Kishore
>>
>
>
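
To illustrate why the incremented priority values in (c) might matter for the `for (Priority priority : application.getPriorities())` loop quoted above, here is a minimal standalone sketch in plain Java. It is not Hadoop code: the class name `PriorityScanSketch`, the method `totalVisits`, and the `pruneFinished` flag are made up for illustration. It assumes that every priority the application has ever requested stays in the sorted set that the loop walks; whether hadoop-2.4.0 actually behaves this way is not confirmed here.

```java
import java.util.TreeSet;

class PriorityScanSketch {

    /**
     * Models `rounds` request/release cycles. Each cycle adds one new
     * priority (an incremented value) and then performs one scheduling
     * pass that walks every priority currently in the set.
     *
     * ASSUMPTION: with pruneFinished == false, priorities are never
     * removed once their requests are satisfied. This is a hypothetical
     * model, not the real scheduler logic.
     *
     * @return total number of priorities visited over all passes
     */
    static long totalVisits(int rounds, boolean pruneFinished) {
        TreeSet<Integer> priorities = new TreeSet<>();
        long visits = 0;
        for (int round = 0; round < rounds; round++) {
            priorities.add(round);              // new request at incremented priority
            for (Integer priority : priorities) {
                visits++;                       // one step of the inner "priority order" loop
            }
            if (pruneFinished) {
                priorities.remove(round);       // drop the priority once its containers are released
            }
        }
        return visits;
    }

    public static void main(String[] args) {
        // 27000 matches the container count mentioned in the original report.
        System.out.println("never pruned: " + totalVisits(27000, false));
        System.out.println("pruned:       " + totalVisits(27000, true));
    }
}
```

Under that assumption, n cycles cost n(n+1)/2 visits instead of n, so the per-pass work grows with the lifetime of the application. That would be consistent with CPU usage that starts low and climbs over several hours, but it is only one possible explanation.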