Sadayuki Furuhashi created YARN-4039:
----------------------------------------

             Summary: New AM instances waste resource by waiting only for 
resource availability when all available resources are already used
                 Key: YARN-4039
                 URL: https://issues.apache.org/jira/browse/YARN-4039
             Project: Hadoop YARN
          Issue Type: Improvement
          Components: fairscheduler
    Affects Versions: 2.7.0, 2.6.0, 2.5.0, 2.4.0
            Reporter: Sadayuki Furuhashi


New AM instances waste resource by waiting only for resource availability when 
all available resources are already used

Problem:
In FairScheduler, maxRunningApps doesn't work well if we can't predict size of 
an application in a queue because small maxRunningApps can't use all resources 
if many small applications are issued, where large maxRunningApps wastes 
resources if large applications run.

Background:
We're using FairScheduler. In following scenario, AM instances wastes resources 
significantly:

* A queue has X MB of capacity.
* An application requests 32 containers where a container requires (X / 32) MB 
of memory
** In this case, a single application occupies entire resource of the queue.
* Many those applications are issued (10 applications)
* Ideal behavior is that applications run one by one to maximize throughput.
* However, all applications run simultaneously. As the result, AM instances 
occupy resources and prevent other tasks from starting. At worst case, most of 
resources are occupied by waiting AMs and applications progress very slowly.

A solution is setting maxRunningApps to 1 or 2. However, it doesn't work well 
if following workload exists at the same queue:

* An application requests 2 containers where a container requires (X / 32) MB 
of memory
* Many those applications are issued (say, 10 applications)
* Ideal behavior is that all applications run simultaneously to maximize 
concurrency and throughput.
* However, number of applications are limited by maxRunningApps. At worst case, 
most of resources are idling.

This problem happens especially with Hive because we can't estimate size of a 
MapReduce application.

Solution:
AM doesn't have to start if there are waiting resource requests because the AM 
can't grant resource requests even if it starts.

Patch:
I attached a patch that implements this behavior. But this implementation has 
this trade-off:

* When AM is registered to FairScheduler, its demand is 0 because even AM 
attempt is not created. Starting this AM doesn't change resource demand of a 
queue. So, if many AMs are issued to a queue at the same time, all AMs will be 
RUNNING. But we want to prevent it.
* When a AM starts, demand of the AM is only AM attempt. Then AM requires more 
resources. Until AM requires resources, demand of the queue is low. But 
starting AM during this time will start unnecessary AMs. 
* So, this patch doesn't start immediately when AM is registered. Instead, it 
starts AM only every continuous-scheduling-sleep-ms.
* Setting large continuous-scheduling-sleep-ms will prevent wasting AMs. But 
increases latency.

Therefore, this patch is enabled only if new option "demand-block-am-enabled" 
is true.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to