[
https://issues.apache.org/jira/browse/SPARK-32037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17195531#comment-17195531
]
Thomas Graves commented on SPARK-32037:
---------------------------------------
It is a good point that blocklist could be typoed (though I would hope that
would be caught in reviews), but if you are looking at the amount of change,
it is only 1 character. Also, I don't really see how BlocklistTracker sounds
any worse than BlacklistTracker. Both might be a bit weird. HealthTracker might
be better there, although it would be better still if the name gave context
for whose health is tracked. In this case it's either a node or an executor,
and it is hard to come up with a name that includes both. Like you pointed
out, you then have TaskSetHealthTracker, which isn't really right because it's
tracking the health of the node/executor for that taskset, not the taskset
itself.
If you look at the description of the config, "denied" seems a bit weird to me:
_If set to "true", prevent Spark from scheduling tasks on executors that have
been blacklisted due to too many task failures. The blacklisting algorithm can
be further controlled by the other "spark.blacklist" configuration options._
If we look at the options in the context of this sentence...:
executors that have been denied due to too many task failures
executors that have been blocked due to too many task failures
executors that have been excluded due to too many task failures
The last 2 definitely make more sense in that context. Now, you could
rewrite the sentence for "denied", but the other thing is that executors can
be removed from the list, so denied/allowed or "removed from denied" doesn't
make as much sense to me in this context. Block or exclude makes more sense to
me if they can go active again (blocked/unblocked or excluded/included).
Naming things is always a pain. Based on all the feedback, if no one has
strong objections I will go with "blocklist". I'll start making the changes
and should be able to see in context whether it doesn't make sense. Perhaps we
can do a mix of things, where BlacklistTracker would be renamed HealthTracker
but other things internally are referred to as blocklist or blocked.
> Rename blacklisting feature to avoid language with racist connotation
> ---------------------------------------------------------------------
>
> Key: SPARK-32037
> URL: https://issues.apache.org/jira/browse/SPARK-32037
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Affects Versions: 3.0.1
> Reporter: Erik Krogen
> Priority: Minor
>
> As per [discussion on the Spark dev
> list|https://lists.apache.org/thread.html/rf6b2cdcba4d3875350517a2339619e5d54e12e66626a88553f9fe275%40%3Cdev.spark.apache.org%3E],
> it will be beneficial to remove references to problematic language that can
> alienate potential community members. One such reference is "blacklist".
> While it seems to me that there is some valid debate as to whether this term
> has racist origins, the cultural connotations are inescapable in today's
> world.
> I've created a separate task, SPARK-32036, to remove references outside of
> this feature. Given the large surface area of this feature and the
> public-facing UI / configs / etc., more care will need to be taken here.
> I'd like to start by opening up debate on what the best replacement name
> would be. Reject-/deny-/ignore-/block-list are common replacements for
> "blacklist", but I'm not sure that any of them work well for this situation.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)