Thanks Wangda, I think I had reduced this value while I was trying to reduce the container allocation time.
-Kishore

On Tue, Aug 19, 2014 at 7:39 AM, Wangda Tan <wheele...@gmail.com> wrote:

> Hi Krishna,
>
> 4) What's the "yarn.resourcemanager.nodemanagers.heartbeat-interval-ms" in
> your configuration?
> 50
>
> I think this config is problematic; too small a heartbeat interval will
> cause the NMs to contact the RM too often. I would suggest you set this
> value larger, e.g. 1000.
>
> Thanks,
> Wangda
>
> On Wed, Aug 13, 2014 at 4:42 PM, Krishna Kishore Bonagiri <
> write2kish...@gmail.com> wrote:
>
>> Hi Wangda,
>> Thanks for the reply. Here are the details; please see if you can
>> suggest anything.
>>
>> 1) Number of nodes and running apps in the cluster
>> 2 nodes, and I am running my own application that keeps asking for
>> containers:
>> a) it runs something on the containers,
>> b) releases the containers,
>> c) asks for more containers with an incremented priority value, and
>> repeats the same process
>>
>> 2) What's the version of your Hadoop?
>> Apache hadoop-2.4.0
>>
>> 3) Have you set
>> "yarn.scheduler.capacity.schedule-asynchronously.enable"=true?
>> No
>>
>> 4) What's the "yarn.resourcemanager.nodemanagers.heartbeat-interval-ms"
>> in your configuration?
>> 50
>>
>> On Tue, Aug 12, 2014 at 12:44 PM, Wangda Tan <wheele...@gmail.com> wrote:
>>
>>> Hi Krishna,
>>> To get a better understanding of the problem, could you please share
>>> the following information:
>>> 1) Number of nodes and running apps in the cluster
>>> 2) What's the version of your Hadoop?
>>> 3) Have you set
>>> "yarn.scheduler.capacity.schedule-asynchronously.enable"=true?
>>> 4) What's the "yarn.resourcemanager.nodemanagers.heartbeat-interval-ms"
>>> in your configuration?
>>>
>>> Thanks,
>>> Wangda Tan
>>>
>>> On Sun, Aug 10, 2014 at 11:29 PM, Krishna Kishore Bonagiri <
>>> write2kish...@gmail.com> wrote:
>>>
>>>> Hi,
>>>> My YARN ResourceManager is consuming 100% CPU when I run an
>>>> application for about 10 hours that requests as many as 27000
>>>> containers. The CPU consumption was very low at the start of the
>>>> application, and it gradually grew to over 100%. Is this a known
>>>> issue, or are we doing something wrong?
>>>>
>>>> Every dump of the Event Processor thread is running
>>>> LeafQueue::assignContainers(), specifically the for loops below from
>>>> LeafQueue.java, and seems to be looping through some priority list.
>>>>
>>>> // Try to assign containers to applications in order
>>>> for (FiCaSchedulerApp application : activeApplications) {
>>>>   ...
>>>>   // Schedule in priority order
>>>>   for (Priority priority : application.getPriorities()) {
>>>>
>>>> 3XMTHREADINFO   "ResourceManager Event Processor" J9VMThread:0x0000000001D08600, j9thread_t:0x00007F032D2FAA00, java/lang/Thread:0x000000008341D9A0, state:CW, prio=5
>>>> 3XMJAVALTHREAD    (java/lang/Thread getId:0x1E, isDaemon:false)
>>>> 3XMTHREADINFO1    (native thread ID:0x4B64, native priority:0x5, native policy:UNKNOWN)
>>>> 3XMTHREADINFO2    (native stack address range from:0x00007F0313DF8000, to:0x00007F0313E39000, size:0x41000)
>>>> 3XMCPUTIME        *CPU usage total: 42334.614623696 secs*
>>>> 3XMHEAPALLOC      Heap bytes allocated since last GC cycle=20456 (0x4FE8)
>>>> 3XMTHREADINFO3    Java callstack:
>>>> 4XESTACKTRACE       at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.assignContainers(LeafQueue.java:850(Compiled Code))
>>>> 5XESTACKTRACE         (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp@0x000000008360DFE0, entry count: 1)
>>>> 5XESTACKTRACE         (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue@0x00000000833B9280, entry count: 1)
>>>> 4XESTACKTRACE       at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainersToChildQueues(ParentQueue.java:655(Compiled Code))
>>>> 5XESTACKTRACE         (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80, entry count: 2)
>>>> 4XESTACKTRACE       at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainers(ParentQueue.java:569(Compiled Code))
>>>> 5XESTACKTRACE         (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80, entry count: 1)
>>>> 4XESTACKTRACE       at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:831(Compiled Code))
>>>> 5XESTACKTRACE         (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler@0x00000000834037C8, entry count: 1)
>>>> 4XESTACKTRACE       at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:878(Compiled Code))
>>>> 4XESTACKTRACE       at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:100(Compiled Code))
>>>> 4XESTACKTRACE       at org/apache/hadoop/yarn/server/resourcemanager/ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
>>>> 4XESTACKTRACE       at java/lang/Thread.run(Thread.java:853)
>>>>
>>>> 3XMTHREADINFO   "ResourceManager Event Processor" J9VMThread:0x0000000001D08600, j9thread_t:0x00007F032D2FAA00, java/lang/Thread:0x000000008341D9A0, state:CW, prio=5
>>>> 3XMJAVALTHREAD    (java/lang/Thread getId:0x1E, isDaemon:false)
>>>> 3XMTHREADINFO1    (native thread ID:0x4B64, native priority:0x5, native policy:UNKNOWN)
>>>> 3XMTHREADINFO2    (native stack address range from:0x00007F0313DF8000, to:0x00007F0313E39000, size:0x41000)
>>>> 3XMCPUTIME        CPU usage total: 42379.604203548 secs
>>>> 3XMHEAPALLOC      Heap bytes allocated since last GC cycle=57280 (0xDFC0)
>>>> 3XMTHREADINFO3    Java callstack:
>>>> 4XESTACKTRACE       at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.assignContainers(LeafQueue.java:841(Compiled Code))
>>>> 5XESTACKTRACE         (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp@0x000000008360DFE0, entry count: 1)
>>>> 5XESTACKTRACE         (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue@0x00000000833B9280, entry count: 1)
>>>> 4XESTACKTRACE       at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainersToChildQueues(ParentQueue.java:655(Compiled Code))
>>>> 5XESTACKTRACE         (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80, entry count: 2)
>>>> 4XESTACKTRACE       at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainers(ParentQueue.java:569(Compiled Code))
>>>> 5XESTACKTRACE         (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80, entry count: 1)
>>>> 4XESTACKTRACE       at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:831(Compiled Code))
>>>> 5XESTACKTRACE         (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler@0x00000000834037C8, entry count: 1)
>>>> 4XESTACKTRACE       at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:878(Compiled Code))
>>>> 4XESTACKTRACE       at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:100(Compiled Code))
>>>> 4XESTACKTRACE       at org/apache/hadoop/yarn/server/resourcemanager/ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
>>>> 4XESTACKTRACE       at java/lang/Thread.run(Thread.java:853)
>>>>
>>>> 3XMTHREADINFO   "ResourceManager Event Processor" J9VMThread:0x0000000001D08600, j9thread_t:0x00007F032D2FAA00, java/lang/Thread:0x000000008341D9A0, state:CW, prio=5
>>>> 3XMJAVALTHREAD    (java/lang/Thread getId:0x1E, isDaemon:false)
>>>> 3XMTHREADINFO1    (native thread ID:0x4B64, native priority:0x5, native policy:UNKNOWN)
>>>> 3XMTHREADINFO2    (native stack address range from:0x00007F0313DF8000, to:0x00007F0313E39000, size:0x41000)
>>>> 3XMCPUTIME        CPU usage total: 42996.394528764 secs
>>>> 3XMHEAPALLOC      Heap bytes allocated since last GC cycle=475576 (0x741B8)
>>>> 3XMTHREADINFO3    Java callstack:
>>>> 4XESTACKTRACE       at java/util/TreeMap.successor(TreeMap.java:2001(Compiled Code))
>>>> 4XESTACKTRACE       at java/util/TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1127(Compiled Code))
>>>> 4XESTACKTRACE       at java/util/TreeMap$KeyIterator.next(TreeMap.java:1180(Compiled Code))
>>>> 4XESTACKTRACE       at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.assignContainers(LeafQueue.java:838(Compiled Code))
>>>> 5XESTACKTRACE         (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp@0x000000008360DFE0, entry count: 1)
>>>> 5XESTACKTRACE         (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue@0x00000000833B9280, entry count: 1)
>>>> 4XESTACKTRACE       at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainersToChildQueues(ParentQueue.java:655(Compiled Code))
>>>> 5XESTACKTRACE         (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80, entry count: 2)
>>>> 4XESTACKTRACE       at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainers(ParentQueue.java:569(Compiled Code))
>>>> 5XESTACKTRACE         (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80, entry count: 1)
>>>> 4XESTACKTRACE       at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:831(Compiled Code))
>>>> 5XESTACKTRACE         (entered lock: org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler@0x00000000834037C8, entry count: 1)
>>>> 4XESTACKTRACE       at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:878(Compiled Code))
>>>> 4XESTACKTRACE       at org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:100(Compiled Code))
>>>> 4XESTACKTRACE       at org/apache/hadoop/yarn/server/resourcemanager/ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
>>>> 4XESTACKTRACE       at java/lang/Thread.run(Thread.java:853)
>>>>
>>>> Thanks,
>>>> Kishore
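[Editor's note] For reference, Wangda's suggested change goes in yarn-site.xml on the ResourceManager host. The property name is the real one discussed in the thread; the 1000 ms value is only his suggested starting point, not a universally correct setting:

```xml
<!-- yarn-site.xml: raise the NodeManager heartbeat interval from 50 ms
     (the value in this thread) to 1000 ms, so each NM triggers far
     fewer node-update scheduling passes on the RM per second -->
<property>
  <name>yarn.resourcemanager.nodemanagers.heartbeat-interval-ms</name>
  <value>1000</value>
</property>
```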
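[Editor's note] The third dump above, spinning in `TreeMap` iteration under `LeafQueue.assignContainers`, hints at why CPU grows over the run: the inner `for (Priority priority : application.getPriorities())` loop sweeps every priority the application has ever used, and this app increments its priority for each batch. A minimal sketch of that blow-up (my own simulation of the pattern, not YARN code; the quadratic total is the point, not the exact cost model):

```java
import java.util.TreeMap;

public class PrioritySweepDemo {
    // Simulates "rounds" batches: each batch registers a new, incremented
    // priority, then one scheduling pass iterates every priority seen so far
    // (stale entries are never removed, mirroring the growing priority list).
    // Returns the total inner-loop iterations across all passes.
    static long run(int rounds) {
        TreeMap<Integer, Integer> priorities = new TreeMap<>();
        long totalIterations = 0;
        for (int p = 0; p < rounds; p++) {
            priorities.put(p, 0);                       // new batch, new priority
            for (Integer prio : priorities.keySet()) {  // one scheduling sweep
                totalIterations++;
            }
        }
        return totalIterations;
    }

    public static void main(String[] args) {
        // 1 + 2 + ... + 1000 = 500500: per-pass work grows linearly,
        // so cumulative work grows quadratically with the batch count.
        System.out.println(run(1000));
    }
}
```

At 27000 accumulated priorities, every node heartbeat (here arriving every 50 ms) pays that full sweep, which is consistent with the gradual climb to 100% CPU described above.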