c21 commented on a change in pull request #31715:
URL: https://github.com/apache/spark/pull/31715#discussion_r586981565
##########
File path: core/src/main/scala/org/apache/spark/internal/config/package.scala
##########
@@ -2117,4 +2117,15 @@ package object config {
// batch of block will be loaded in memory with memory mapping, which has higher overhead
// with small MB sized chunk of data.
.createWithDefaultString("3m")
+
+ private[spark] val MARK_FILE_LOST_ON_EXECUTOR_LOST =
+ ConfigBuilder("spark.shuffle.markFileLostOnExecutorLost")
Review comment:
I have a similar concern to @viirya and @attilapiros. I think we should
not make this a user-facing config. If we introduce a config for this
anyway, it would be better to start with an `internal()` config rather
than a user-facing one.
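For illustration, a minimal sketch of what the entry could look like with `internal()`, following Spark's ConfigBuilder DSL. The doc text, version string, and boolean default below are assumptions for the sketch, not part of the actual patch:
```scala
private[spark] val MARK_FILE_LOST_ON_EXECUTOR_LOST =
  ConfigBuilder("spark.shuffle.markFileLostOnExecutorLost")
    .internal()  // hidden from user docs; owned by the platform/infra team
    .doc("Whether to mark shuffle files as lost when an executor is lost. " +
      "(doc text is an assumption in this sketch)")
    .version("3.2.0")  // assumed version, not confirmed by the PR
    .booleanConf
    .createWithDefault(true)  // assumed default; the PR's actual default is not shown
```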
Though `spark.shuffle.manager` can be an arbitrary class, in practice
there are only a handful of implementations in any one company's
environment, and ideally the infra developers should control which
implementation is used, rather than users. Similarly, whether to mark
shuffle files as lost should be controlled by the developer team, not by
users. A sketch of what that split could look like follows below.
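As an illustration of that split, the platform could pin both settings cluster-wide so end users never touch them; the shuffle manager class name below is hypothetical:
```scala
import org.apache.spark.SparkConf

// Sketch: infra-owned defaults, e.g. baked into spark-defaults.conf or a
// job launcher, rather than set by end users in their own jobs.
val conf = new SparkConf()
  // hypothetical custom shuffle manager class, chosen by the infra team
  .set("spark.shuffle.manager", "com.example.shuffle.CustomShuffleManager")
  // the proposed flag, set alongside the manager it belongs to
  .set("spark.shuffle.markFileLostOnExecutorLost", "false")
```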
To share some context: at FB, we (the Spark developers) control these
kinds of behavior transparently and keep them invisible to Spark users.
This also lets us migrate to newer implementations more easily, without
worrying about users setting the wrong config. The mixed case (a query
using both the customized shuffle service and the default shuffle
service) happens quite a bit in production, since we rate-limit traffic
on the customized service and need a fallback.