Hi, all.

I have a map-only MapReduce job running on a Hadoop 2.7.1 YARN cluster that 
is badly underutilized (around 25-30% utilization).  The EMR cluster is 
1 master + 20 core m3.xlarge nodes, each with 8 cores and 15G of total memory 
(11.25G of that available to YARN).  I've configured mapper memory with the 
following properties, which should allow 8 map-task containers to run on each 
node:

<!-- Container size -->
<property><name>mapreduce.map.memory.mb</name><value>1440</value></property>
<!-- JVM arguments for a Map task -->
<property><name>mapreduce.map.java.opts</name><value>-Xmx1024m</value></property>
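For reference, the arithmetic behind the 8-containers figure can be sketched in Python as below. This is only a sanity check, not how YARN itself is implemented. It assumes 11.25G means 11520 MB of NodeManager memory, and the function names are made up for illustration. As I understand it, the scheduler may also round each request up to a multiple of its minimum allocation; the 256 MB value in the second call is a hypothetical example, not something I have read from this cluster's configuration.

```python
import math

def normalized_request_mb(request_mb, min_alloc_mb):
    """Round a container request up to the next multiple of the scheduler minimum.

    Assumption: the scheduler normalizes requests this way; check your
    scheduler's settings before relying on it.
    """
    return max(min_alloc_mb, math.ceil(request_mb / min_alloc_mb) * min_alloc_mb)

def containers_per_node(node_mb, request_mb, min_alloc_mb=1):
    """How many containers of the (normalized) request size fit in node_mb."""
    return node_mb // normalized_request_mb(request_mb, min_alloc_mb)

NODE_MB = 11520   # 11.25G available to YARN per node (from the post)
MAP_MB = 1440     # mapreduce.map.memory.mb

# No rounding: 11520 / 1440 fits exactly.
print(containers_per_node(NODE_MB, MAP_MB))                     # 8

# Hypothetical 256 MB minimum allocation: 1440 rounds up to 1536.
print(containers_per_node(NODE_MB, MAP_MB, min_alloc_mb=256))   # 7
```

If the second case applied, each node would top out at 7 map containers rather than 8, which would still not explain nodes sitting at 0 containers.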

It was suggested that my AppMaster might be having trouble keeping up with 
creating all the mapper containers, and that I should bulk up its resource 
allocation.  So I did, as shown below, giving it a 6400 MB container (5G JVM 
heap), 3 vcores, and 60 task listener threads.

<!-- App Master task listener threads -->
<property><name>yarn.app.mapreduce.am.job.task.listener.thread-count</name><value>60</value></property>
<!-- App Master container vcores -->
<property><name>yarn.app.mapreduce.am.resource.cpu-vcores</name><value>3</value></property>
<!-- App Master container size -->
<property><name>yarn.app.mapreduce.am.resource.mb</name><value>6400</value></property>
<!-- JVM arguments for each Application Master -->
<property><name>yarn.app.mapreduce.am.command-opts</name><value>-Xmx5120m</value></property>
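To put a number on what full utilization should look like with these settings, here is a rough Python sketch of the expected concurrent map containers across the cluster. It counts memory only, ignoring vcores and any scheduler rounding. It assumes 11520 MB per node and a single AppMaster container sharing one node with map tasks. The function name is made up for illustration.

```python
def expected_map_containers(nodes, node_mb, map_mb, am_mb):
    """Memory-only estimate of concurrent map containers with one AM running.

    Assumes one node hosts the AppMaster container and the rest of that
    node's memory, plus every other node, goes to map containers.
    """
    per_full_node = node_mb // map_mb
    on_am_node = (node_mb - am_mb) // map_mb
    return (nodes - 1) * per_full_node + on_am_node

total = expected_map_containers(nodes=20, node_mb=11520, map_mb=1440, am_mb=6400)
print(total)  # 155
```

By this estimate the job should be able to run about 155 map tasks at once, so 25-30% utilization would correspond to roughly 40-45 running containers.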


Looking at the node on which the AppMaster is running, I see plenty of CPU 
idle time and free memory, yet there are still nodes with no utilization 
(0 running containers).  The NodeManager log on that node indicates that the 
AppMaster has far more memory (physical and virtual) than it appears to need, 
with repeated messages like this:



2016-05-25 13:59:04,615 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl (Container Monitor): Memory usage of ProcessTree 11265 for container-id container_1464122327865_0002_01_000001: 1.6 GB of 6.3 GB physical memory used; 6.1 GB of 31.3 GB virtual memory used



Can you please help me figure out where to go from here to troubleshoot, or 
suggest other things to try?

Thanks!
-Jeff
