Github user squito commented on a diff in the pull request:
https://github.com/apache/spark/pull/14079#discussion_r72300112
--- Diff:
core/src/main/scala/org/apache/spark/internal/config/package.scala ---
@@ -97,6 +97,49 @@ package object config {
.toSequence
.createWithDefault(Nil)
+ // Blacklist confs
+ private[spark] val BLACKLIST_ENABLED =
+ ConfigBuilder("spark.scheduler.blacklist.enabled")
+ .booleanConf
+ .createOptional
+
+ private[spark] val MAX_TASK_ATTEMPTS_PER_NODE =
+ ConfigBuilder("spark.blacklist.maxTaskAttemptsPerNode")
--- End diff --
hmm, I don't think "worker" is the right name. First, I think you can have
multiple worker instances per node (SPARK_WORKER_INSTANCES). Second, at least
to me, "worker" by itself refers to a specific process that is just part of
spark standalone mode. OTOH, I scanned through the docs to try to find
definitive evidence one way or another, and it seems that we're not
super-consistent. We do define "worker node" here:
http://spark.apache.org/docs/latest/cluster-overview.html, and I do see that
phrase used together often in the docs, but I think it's kind of verbose for a
conf.
As I was writing it, I did wonder a lot whether I should use "node" or "host";
it seems like internally Spark uses both a lot, and very interchangeably. I
stuck w/ "node" for all confs, and I think I swapped between node and host in
the code.
We use "node" in other confs, e.g. "spark.locality.wait.node", and I see it
used the most in http://spark.apache.org/docs/latest/configuration.html and
http://spark.apache.org/docs/latest/job-scheduling.html
Some relatively meaningless metrics.
In docs dir:
```
> find . -type f | xargs grep -i "worker" | wc -l
268
> find . -type f | xargs grep -i "node" | wc -l
401
> find . -type f | xargs grep -i "host" | wc -l
209
> find . -type f | xargs grep -i "worker" | grep -v -i "worker node" | wc -l
222
```
In source code:
```
> find . -type f -name "*.scala" | xargs grep -i "worker" | wc -l
1247
> find . -type f -name "*.scala" | xargs grep -i "node" | wc -l
2630
> find . -type f -name "*.scala" | xargs grep -i "host" | wc -l
2222
> find . -type f -name "*.scala" | xargs grep -i "worker node" | wc -l
23
```
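The numbers above are counts of matching lines, not of individual occurrences, and the `find | xargs` form splits file names that contain spaces. A minimal sketch of the same measurement as a reusable function, assuming a POSIX shell; `count_term` and its arguments are hypothetical names introduced here for illustration and are not part of Spark or the PR:

```shell
# Hypothetical helper: count lines that contain a term (case-insensitive)
# in files under a directory, optionally restricted by a file-name pattern.
# Usage: count_term TERM [DIR] [NAME_PATTERN]
count_term() {
  term=$1
  dir=${2:-.}
  pattern=${3:-*}
  # "-exec ... {} +" passes file names to grep directly, so names with
  # spaces are handled; "wc -l" counts matching lines, and "tr" strips
  # the padding that some wc implementations add.
  find "$dir" -type f -name "$pattern" -exec grep -i -- "$term" {} + \
    | wc -l | tr -d ' '
}
```

For example, `count_term "worker node" . "*.scala"` would reproduce the last measurement above, up to differences in how file names are handled.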