Beckham007 created YARN-2169:
--------------------------------

             Summary: NMSimulator of sls should catch more Exception
                 Key: YARN-2169
                 URL: https://issues.apache.org/jira/browse/YARN-2169
             Project: Hadoop YARN
          Issue Type: Bug
    Affects Versions: 2.4.0
            Reporter: Beckham007


In the middleStep() method of NMSimulator, sending a heartbeat may throw an 
InterruptedException or some other exception when the load is heavy. If these 
exceptions are not handled, the NMSimulator task cannot be added to the 
executor queue again, so the NM is lost.
In my setup, the pool size is 4000, the number of NMs is 2000, and the number 
of AMs is 1500. Some NMs are lost.
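
The failure mode can be illustrated with a minimal, self-contained sketch. It 
is not the actual SLS code: the names ResilientTaskSketch, PeriodicTask, Step 
and simulate are hypothetical, and a plain in-memory queue stands in for the 
executor queue. It only assumes that a task re-adds itself to the queue after 
each step, as described above.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical stand-in for a periodic simulator task; not the real SLS classes. */
final class ResilientTaskSketch {

    /** A step that may fail, like a heartbeat sent under heavy load. */
    interface Step {
        void run() throws Exception;
    }

    /** Task that is expected to re-queue itself after every step. */
    static final class PeriodicTask implements Runnable {
        private final Queue<Runnable> queue;
        private final Step middleStep;
        private final boolean guarded;
        int failures = 0;

        PeriodicTask(Queue<Runnable> queue, Step middleStep, boolean guarded) {
            this.queue = queue;
            this.middleStep = middleStep;
            this.guarded = guarded;
        }

        @Override
        public void run() {
            try {
                middleStep.run();
            } catch (Exception e) {
                if (!guarded) {
                    // Unguarded variant: the exception escapes and the
                    // re-queue line below is never reached.
                    throw new RuntimeException(e);
                }
                // Guarded variant: record the failure and carry on.
                failures++;
            }
            queue.add(this); // schedule the next step
        }
    }

    /**
     * Runs up to {@code rounds} steps of one task whose step fails on call
     * number {@code failOnCall}; returns how many steps were executed.
     */
    static int simulate(int rounds, int failOnCall, boolean guarded) {
        Queue<Runnable> queue = new ArrayDeque<>();
        int[] calls = {0};
        Step heartbeat = () -> {
            calls[0]++;
            if (calls[0] == failOnCall) {
                throw new IllegalStateException("simulated heartbeat failure");
            }
        };
        queue.add(new PeriodicTask(queue, heartbeat, guarded));

        int executed = 0;
        for (int i = 0; i < rounds; i++) {
            Runnable task = queue.poll();
            if (task == null) {
                break; // the task was never re-queued: it is lost
            }
            try {
                task.run();
            } catch (RuntimeException e) {
                // The runner itself survives, but the task did not re-queue.
            }
            executed++;
        }
        return executed;
    }
}
```

In this sketch, simulate(5, 2, true) runs all five steps because the failure 
in the second step is caught, while simulate(5, 2, false) stops after the 
second step because the task never returns to the queue.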



--
This message was sent by Atlassian JIRA
(v6.2#6252)
