We have seen the two cases below, where it would have been nice if Apex allowed us to exclude a node from the cluster for an application.

1. A node in the cluster had gone bad (it was randomly rebooting), so an Apex app should not use it. Other apps could still use it because they were batch jobs.
2. A node is being used for a mission-critical app (which could be an Apex app itself), but another mission-critical Apex app should not be using resources on that node.

Can we have a way for Stram and YARN to coordinate so that a set of nodes is not used for the application? It can be done in two ways:

1. Keep a list of "exclude" nodes with Stram. When YARN allocates resources on any of these nodes, Stram rejects them and gets resources allocated again from YARN.
2. Keep a list of nodes that can be used for an app. This can be part of the config. However, I don't think this would be the right way to do it, as we will need support from YARN as well. Further, this might be difficult to change at runtime if need be.

Any thoughts?

--
~Milind bee at gee mail dot com
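
To make the first approach (the "exclude" list held by Stram) more concrete, here is a minimal Java sketch of the filtering step only. It is an illustration under assumptions, not Apex or YARN code: `Offer` and `ExcludeNodeFilter` are hypothetical names, and an offer is reduced to a container id plus a host name. How rejected containers are released and re-requested from YARN is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a container offered by YARN; only the host matters here.
final class Offer {
    final String containerId;
    final String host;

    Offer(String containerId, String host) {
        this.containerId = containerId;
        this.host = host;
    }
}

// Hypothetical helper that splits offers by whether their host is on the exclude list.
final class ExcludeNodeFilter {
    private final Set<String> excludedHosts;

    ExcludeNodeFilter(Set<String> excludedHosts) {
        // Defensive copy so later changes to the caller's set do not leak in.
        this.excludedHosts = new HashSet<>(excludedHosts);
    }

    boolean isExcluded(Offer offer) {
        return excludedHosts.contains(offer.host);
    }

    // Offers the application master would keep.
    List<Offer> accepted(List<Offer> offers) {
        List<Offer> result = new ArrayList<>();
        for (Offer offer : offers) {
            if (!isExcluded(offer)) {
                result.add(offer);
            }
        }
        return result;
    }

    // Offers the application master would hand back and request again.
    List<Offer> rejected(List<Offer> offers) {
        List<Offer> result = new ArrayList<>();
        for (Offer offer : offers) {
            if (isExcluded(offer)) {
                result.add(offer);
            }
        }
        return result;
    }
}

class ExcludeNodesSketch {
    public static void main(String[] args) {
        // Assumed exclude list for this app; the host names are made up.
        ExcludeNodeFilter filter =
            new ExcludeNodeFilter(new HashSet<>(Arrays.asList("node3")));

        List<Offer> offers = Arrays.asList(
            new Offer("c1", "node1"),
            new Offer("c2", "node3"),
            new Offer("c3", "node2"));

        for (Offer offer : filter.accepted(offers)) {
            System.out.println("keep " + offer.containerId + " on " + offer.host);
        }
        for (Offer offer : filter.rejected(offers)) {
            System.out.println("reject " + offer.containerId + " on " + offer.host);
        }
    }
}
```

One thing the sketch makes visible: if YARN is never told about the exclude list, it may keep offering containers on the same excluded node, so a reject-and-retry loop would probably need a retry limit or some way to pass the list on to YARN.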