Hi Jeff. Thanks for sharing this information. I have some observations from these logs.
- The node heartbeat here looks to be around 2-3 seconds. Was it changed for
some reason?
- All of the mappers’ Resource Requests appear to be of type ANY (there is
no data locality); please correct me if I am wrong. For a request of type
ANY, only one container is allocated per node heartbeat for a node, and the
heartbeat delay here is also high. I can also see that containers are
released very quickly. So when you started your application, were you seeing
better resource utilization, with the underutilization appearing once
containers started to get released/completed? Please look into this; it may
be the reason.

Thanks
Sunil

On Wed, May 25, 2016 at 9:59 PM Guttadauro, Jeff <[email protected]> wrote:

> Thanks for your thoughts thus far, Sunil. Most grateful for any additional
> help you or others can offer. To answer your questions,
>
> 1. This is a custom M/R job, which uses mappers only (no reduce phase) to
> process GPS probe data and filter it based on inclusion within a provided
> polygon. There is actually a lot of upfront work done in the driver to make
> that task as simple as can be (it identifies a list of tiles that are
> completely inside the polygon and those that fall across an edge, for which
> more processing would be needed), but the job would still be more
> compute-intensive than wordcount, for example.
>
> 2. I’m running almost 84k mappers for this job. This is actually down from
> ~600k mappers, since one other thing I’ve done is increase
> mapreduce.input.fileinputformat.split.minsize to 536870912 (512M) for the
> job. Data is in S3, so loss of locality isn’t really a concern.
>
> 3. For NodeManager configuration, I’m using EMR’s default configuration
> for the m3.xlarge instance type, which is
> yarn.scheduler.minimum-allocation-mb=32,
> yarn.scheduler.maximum-allocation-mb=11520, and
> yarn.nodemanager.resource.memory-mb=11520.
> The YARN dashboard shows min/max allocations of <memory:32, vCores:1> /
> <memory:11520, vCores:8>.
>
> 4. Capacity Scheduler [MEMORY]
>
> 5. I’ve attached 2500 lines from the RM log. Happy to grab more, but they
> are pretty big, and I thought that might be sufficient.
>
> Any guidance is much appreciated!
> -Jeff
>
> *From:* Sunil Govind [mailto:[email protected]]
> *Sent:* Wednesday, May 25, 2016 10:55 AM
> *To:* Guttadauro, Jeff <[email protected]>; [email protected]
> *Subject:* Re: YARN cluster underutilization
>
> Hi Jeff,
>
> It looks like you are allocating more memory than needed for the AM
> container; most likely you do not need 6 GB (as per the log). Could you
> please help to provide some more information:
>
> 1. What type of mapreduce application (wordcount etc.) are you running?
> Some AMs may be CPU intensive and some may not be, so based on the type of
> application, memory/cpu can be tuned for better utilization.
> 2. How many mappers (and reducers) are you trying to run here?
> 3. You have mentioned that each node has 8 cores and 15GB, but how much is
> actually configured for the NM?
> 4. Which scheduler are you using?
> 5. It would be better to attach the RM log if possible.
>
> Thanks
> Sunil
>
> On Wed, May 25, 2016 at 8:58 PM Guttadauro, Jeff <[email protected]>
> wrote:
>
> Hi, all.
>
> I have an M/R (map-only) job that I’m running on a Hadoop 2.7.1 YARN
> cluster that is being quite underutilized (utilization of around 25-30%).
> The EMR cluster is 1 master + 20 core m3.xlarge nodes, which have 8 cores
> each and 15G total memory (with 11.25G of that available to YARN).
> I’ve configured mapper memory with the following properties, which should
> allow for 8 containers running map tasks per node:
>
> <property><name>mapreduce.map.memory.mb</name><value>1440</value></property> <!-- Container size -->
> <property><name>mapreduce.map.java.opts</name><value>-Xmx1024m</value></property> <!-- JVM arguments for a Map task -->
>
> It was suggested that perhaps my AppMaster was having trouble keeping up
> with creating all the mapper containers and that I should bulk up its
> resource allocation. So I did, as shown below, providing it 6G container
> memory (5G task memory), 3 cores, and 60 task listener threads.
>
> <property><name>yarn.app.mapreduce.am.job.task.listener.thread-count</name><value>60</value></property> <!-- App Master task listener threads -->
> <property><name>yarn.app.mapreduce.am.resource.cpu-vcores</name><value>3</value></property> <!-- App Master container vcores -->
> <property><name>yarn.app.mapreduce.am.resource.mb</name><value>6400</value></property> <!-- App Master container size -->
> <property><name>yarn.app.mapreduce.am.command-opts</name><value>-Xmx5120m</value></property> <!-- JVM arguments for each Application Master -->
>
> Taking a look at the node on which the AppMaster is running, I’m seeing
> plenty of CPU idle time and free memory, yet there are still nodes with no
> utilization (0 running containers). The log indicates that the AppMaster
> has way more memory (physical/virtual) than it appears to need, with
> repeated log messages like this:
>
> 2016-05-25 13:59:04,615 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl
> (Container Monitor): Memory usage of ProcessTree 11265 for container-id
> container_1464122327865_0002_01_000001: 1.6 GB of 6.3 GB physical memory
> used; 6.1 GB of 31.3 GB virtual memory used
>
> Can you please help me figure out where to go from here to troubleshoot,
> or any other things to try?
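Given that the ContainersMonitorImpl line above shows the AM peaking around 1.6 GB physical against a 6.3 GB limit, one option would be to scale the AM container back down. The sizes below are illustrative guesses, not values from this thread; tune them against the AM's observed usage:

```xml
<!-- Hypothetical, scaled-down AM sizing (example values, not prescriptive) -->
<property><name>yarn.app.mapreduce.am.resource.mb</name><value>2048</value></property> <!-- App Master container size -->
<property><name>yarn.app.mapreduce.am.command-opts</name><value>-Xmx1536m</value></property> <!-- JVM heap for the Application Master -->
```

Shrinking the AM from 6400 MB to 2048 MB would also free roughly 4.25 GB on its node, which at 1440 MB per map container is about three more concurrent mappers.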
> Thanks!
> -Jeff
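One more data point on the heartbeat observation at the top of the thread: in stock Hadoop 2.x the RM-side node heartbeat interval defaults to 1 second and is set in yarn-site.xml. If the logs really show 2-3 seconds between heartbeats, this is the property to check (shown here at its yarn-default.xml default; EMR may override it):

```xml
<!-- Default is 1000 ms; a larger value slows container assignment, since an ANY request receives at most one container per node heartbeat -->
<property><name>yarn.resourcemanager.nodemanagers.heartbeat-interval-ms</name><value>1000</value></property>
```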
