zhengchenyu created YARN-6407:
---------------------------------

             Summary: Improve and fix locks of RM scheduler
                 Key: YARN-6407
                 URL: https://issues.apache.org/jira/browse/YARN-6407
             Project: Hadoop YARN
          Issue Type: Bug
          Components: fairscheduler
    Affects Versions: 2.7.1
         Environment: CentOS 7, 1 Gigabit Ethernet
            Reporter: zhengchenyu
             Fix For: 2.7.1


First, this issue does not duplicate YARN-3091.
Our cluster has 5k nodes, and each server is configured with 1 Gigabit 
Ethernet, so the network is the bottleneck in our cluster.
We must distcp data from the warehouse. Because of the 1 Gigabit Ethernet, we 
must set yarn.scheduler.fair.max.assign to 5; otherwise hot spots occur.
Setting max.assign to 5 reduces the scheduler's assignment throughput, so we 
started the ContinuousSchedulingThread. 
As more applications run in our cluster, and with the 
ContinuousSchedulingThread running, lock contention becomes more serious. 
In our cluster, the call queue of the ApplicationMasterService's RPC is 
occasionally high. We are worried that more problems will occur in the future 
as more applications run.
Here is our chain of reasoning:
"1 Gigabit Ethernet" and "data hot spot" ==> "set 
yarn.scheduler.fair.max.assign to 5" ==> "ContinuousSchedulingThread is 
started" and "more applications" ==> "lock contention"
I know YARN-3091 addresses this problem, but its patch aims to change the 
object lock to a read-write lock. That change is still coarse-grained. So I 
think we should lock the individual resources rather than lock large sections 
of code.
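
To make the difference in lock granularity concrete, here is a minimal sketch. 
It is not FairScheduler code and not the YARN-3091 patch: NodeSlot, 
CoarseScheduler, FineScheduler and LockGranularityDemo are hypothetical names 
invented for illustration, and it assumes the set of nodes is fixed after 
construction. CoarseScheduler serializes every call on one object lock, while 
FineScheduler locks only the node being assigned.

```java
import java.util.Arrays;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for one node's schedulable state; not a YARN class.
final class NodeSlot {
    private final ReentrantLock lock = new ReentrantLock();
    private int freeContainers;

    NodeSlot(int freeContainers) {
        this.freeContainers = freeContainers;
    }

    // Locks only this node, so assignments on other nodes are not blocked.
    boolean tryAssign() {
        lock.lock();
        try {
            if (freeContainers == 0) {
                return false;
            }
            freeContainers--;
            return true;
        } finally {
            lock.unlock();
        }
    }

    int free() {
        lock.lock();
        try {
            return freeContainers;
        } finally {
            lock.unlock();
        }
    }
}

// Coarse-grained: every call serializes on the scheduler's own monitor.
final class CoarseScheduler {
    private final int[] free;

    CoarseScheduler(int nodes, int containersPerNode) {
        free = new int[nodes];
        Arrays.fill(free, containersPerNode);
    }

    synchronized boolean assign(int node) {
        if (free[node] == 0) {
            return false;
        }
        free[node]--;
        return true;
    }

    synchronized int totalFree() {
        int sum = 0;
        for (int f : free) {
            sum += f;
        }
        return sum;
    }
}

// Fine-grained: the node array never changes, so only per-node state is locked.
final class FineScheduler {
    private final NodeSlot[] nodes;

    FineScheduler(int nodeCount, int containersPerNode) {
        nodes = new NodeSlot[nodeCount];
        for (int i = 0; i < nodeCount; i++) {
            nodes[i] = new NodeSlot(containersPerNode);
        }
    }

    boolean assign(int node) {
        return nodes[node].tryAssign();
    }

    // Reads one node at a time; not an atomic snapshot while assigns are running.
    int totalFree() {
        int sum = 0;
        for (NodeSlot n : nodes) {
            sum += n.free();
        }
        return sum;
    }
}

final class LockGranularityDemo {
    public static void main(String[] args) throws InterruptedException {
        FineScheduler fine = new FineScheduler(4, 3);
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            final int node = i;
            // Each worker targets its own node, so the workers never contend.
            workers[i] = new Thread(() -> {
                for (int k = 0; k < 5; k++) {
                    fine.assign(node);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        System.out.println("free containers left: " + fine.totalFree());
    }
}
```

The trade-off in this sketch is that FineScheduler.totalFree() is no longer a 
consistent snapshot. A real scheduler would have to decide which cluster-wide 
views need one and which can tolerate slightly stale per-node values.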




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
