Tao Yang created YARN-7004:
------------------------------

             Summary: Add configs cache to optimize refreshQueues performance 
for large scale queues
                 Key: YARN-7004
                 URL: https://issues.apache.org/jira/browse/YARN-7004
             Project: Hadoop YARN
          Issue Type: Improvement
          Components: capacityscheduler
    Affects Versions: 3.0.0-alpha4, 2.9.0
            Reporter: Tao Yang
            Assignee: Tao Yang


We need to support a large number of queues in our production environment to 
serve many projects. In tests with more than 5000 queues, we found that the 
refreshQueues process took more than 1 minute. Most of that time is spent 
iterating over all configurations to get the accessible-node-labels and 
ordering-policy configs for every queue.  
Loading queue configs from a cache should reduce this cost when 
initializing/reinitializing queues (from 1 minute to 3 seconds for 5000 queues 
in our test). So I propose to load queue configs into a cache in 
CapacityScheduler#initializeQueues and 
CapacityScheduler#reinitializeQueues. If the cache has not been loaded in other 
scenarios, such as in test cases, queue configs can still be obtained by 
iterating over all configurations.
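
To illustrate the idea, here is a minimal, self-contained sketch. It is not the
proposed patch and does not use any Hadoop classes: QueueConfigCache, load(),
and getQueueConfigs() are hypothetical names, and a plain Map stands in for the
scheduler configuration. It assumes keys of the form
yarn.scheduler.capacity.<queue-path>.<property>. The cache here is simply a
sorted copy of the configuration, so each per-queue lookup becomes a prefix
range query instead of a full scan; when the cache has not been loaded, the
lookup falls back to iterating over all configurations.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical illustration only; not a class from the Hadoop code base.
class QueueConfigCache {
    static final String PREFIX = "yarn.scheduler.capacity.";

    // Stand-in for the full scheduler configuration (flat key/value pairs).
    private final Map<String, String> allConfigs;

    // Sorted snapshot of the configuration; stays null until load() is called.
    private NavigableMap<String, String> sorted;

    QueueConfigCache(Map<String, String> allConfigs) {
        this.allConfigs = allConfigs;
    }

    // Builds the cache in one pass. It must be called again after the
    // configuration changes, e.g. at the start of every refresh.
    void load() {
        sorted = new TreeMap<>(allConfigs);
    }

    boolean isLoaded() {
        return sorted != null;
    }

    // Returns every config under the given queue path, keyed by the part of
    // the name after the queue prefix. Child queues share the prefix, so
    // their keys are included as well (e.g. "b.capacity" for root.a.b).
    Map<String, String> getQueueConfigs(String queuePath) {
        String prefix = PREFIX + queuePath + ".";
        Map<String, String> result = new HashMap<>();
        if (sorted != null) {
            // Cached path: range query over the sorted keys,
            // O(log n + k) for k matching entries.
            String upper = prefix + Character.MAX_VALUE;
            for (Map.Entry<String, String> e
                    : sorted.subMap(prefix, true, upper, false).entrySet()) {
                result.put(e.getKey().substring(prefix.length()), e.getValue());
            }
        } else {
            // Fallback path: scan all configurations, O(n) per queue.
            for (Map.Entry<String, String> e : allConfigs.entrySet()) {
                if (e.getKey().startsWith(prefix)) {
                    result.put(e.getKey().substring(prefix.length()),
                               e.getValue());
                }
            }
        }
        return result;
    }
}
```

In this sketch a caller would invoke load() once before initializing or
reinitializing queues and then call getQueueConfigs() for each queue. Code
that never calls load(), such as a unit test, gets the same results through
the slower scan.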



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

