tgravescs commented on a change in pull request #23616: [SPARK-26688][YARN] Provide configuration of initially blacklisted YARN nodes
URL: https://github.com/apache/spark/pull/23616#discussion_r255776640

##########
File path: docs/running-on-yarn.md
##########
@@ -462,6 +462,14 @@ To use a custom metrics.properties for the application master and executors, upd
   <code>spark.blacklist.application.maxFailedExecutorsPerNode</code>.
   </td>
 </tr>
+<tr>
+  <td><code>spark.yarn.blacklist.initial.blacklisted.nodes</code></td>
+  <td>false</td>
+  <td>
+  Comma-separated list of strings used as initially blacklisted YARN nodes which stays always

Review comment:
I'm wondering whether we should change the name and description a bit. I have two concerns. First, it says "blacklisted", which generally implies that the timeout applies. Second, it says "initial", which someone could read as meaning the nodes are blacklisted initially but perhaps not later.

We are adding these to the YARN blacklisting, so "blacklisted" makes sense in that context, but perhaps we should call it spark.yarn.blacklist.always.blacklisted.nodes. I always hate finding good names for configs, and I'm not fond of having "blacklist(ed)" twice in the config name. What about spark.yarn.scheduler.always.blacklisted.nodes? We could also go with something like spark.yarn.exclude.nodes, since that is essentially what we are doing.

@squito thoughts?

We may also want to clarify the description to say it is a list of node names.

The default isn't false here, so we should fix that.
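
For illustration only, here is a rough sketch of how the table row might read with the points above applied. It assumes the spark.yarn.exclude.nodes name, which is just one of the candidates mentioned and not a settled choice. It also assumes "(none)" as the default, following the convention used elsewhere in the Spark docs, and the description wording is a placeholder.

```html
<tr>
  <!-- Name is one of the candidates under discussion, not a settled choice. -->
  <td><code>spark.yarn.exclude.nodes</code></td>
  <!-- Assumed default: no nodes are excluded unless the user lists some. -->
  <td>(none)</td>
  <td>
    <!-- Placeholder wording: says "node names" and avoids "initial" and "blacklisted". -->
    Comma-separated list of YARN node names which are excluded from resource allocation.
  </td>
</tr>
```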
