[ https://issues.apache.org/jira/browse/YARN-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Steve Loughran updated YARN-1073:
---------------------------------
    Summary: NM to recognise when it can't spawn process and stop accepting containers  (was: NM to recognise when it can't span process and stop accepting containers)

> NM to recognise when it can't spawn process and stop accepting containers
> -------------------------------------------------------------------------
>
>                 Key: YARN-1073
>                 URL: https://issues.apache.org/jira/browse/YARN-1073
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: nodemanager
>    Affects Versions: 2.1.0-beta
>         Environment: OS/X with not enough file handles
>            Reporter: Steve Loughran
>            Priority: Minor
>
> when creating too many containers with a claimed resource use of 0 RAM or
> vCores, the NM got to the state where exec() was continually failing - but
> nothing seemed to recognise this and blacklist the node.
> Something should be noting that all container launches for an app/container
> are failing and do something. While AMs can/should code this, NM failure is
> something at the YARN-level
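
One way to picture the behaviour the description asks for is a counter of consecutive launch failures that flips the node to "not accepting" once a threshold is reached. The sketch below is only an illustration under stated assumptions: the class LaunchFailureTracker, its method names, and the threshold parameter are all hypothetical and are not part of the YARN codebase or of any patch attached to this issue.

```java
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Hypothetical helper (not a YARN class): tracks consecutive container
 * launch failures and reports whether the node should keep accepting
 * new containers.
 */
final class LaunchFailureTracker {
    // Number of back-to-back failures after which the node stops accepting work.
    private final int threshold;
    // Failures seen since the last successful launch.
    private final AtomicInteger consecutiveFailures = new AtomicInteger();

    LaunchFailureTracker(int threshold) {
        if (threshold < 1) {
            throw new IllegalArgumentException("threshold must be >= 1");
        }
        this.threshold = threshold;
    }

    /** Call when a container process was spawned successfully. */
    void launchSucceeded() {
        consecutiveFailures.set(0);
    }

    /** Call when spawning a container process failed, e.g. exec() failed. */
    void launchFailed() {
        consecutiveFailures.incrementAndGet();
    }

    /** True while fewer than 'threshold' launches in a row have failed. */
    boolean isAcceptingContainers() {
        return consecutiveFailures.get() < threshold;
    }
}
```

With a threshold of 3, the tracker keeps reporting true after two failed launches, reports false after the third, and returns to true as soon as one launch succeeds. How such a signal would be surfaced to the RM (for example through node health reporting) is left open here.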