Github user jiangxb1987 commented on a diff in the pull request:
https://github.com/apache/spark/pull/21390#discussion_r190098033
--- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala
---
@@ -97,6 +97,10 @@ private[deploy] class Worker(
private val APP_DATA_RETENTION_SECONDS =
conf.getLong("spark.worker.cleanup.appDataTtl", 7 * 24 * 3600)
+ // Whether or not cleanup the non-shuffle files on executor finishes.
+ private val CLEANUP_NON_SHUFFLE_FILES_ENABLED =
+ conf.getBoolean("spark.worker.cleanup.nonShuffleFiles.enabled", true)
--- End diff ---
Is it possible that a user wants to clean up non-shuffle files but doesn't want
to clean up the whole application directories (they may still make use of the
shuffle data)? For that specific use case, we should not make
`spark.worker.cleanup.enabled = true` a precondition of
`spark.worker.cleanup.nonShuffleFiles.enabled = true`. Otherwise, I would be
glad to make the change.
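To illustrate the point, here is a minimal, self-contained sketch (not actual Spark code; `CleanupFlags` and the plain `Map`-backed `getBoolean` are stand-ins for `SparkConf`) of reading the two flags independently, so that disabling `spark.worker.cleanup.enabled` does not also disable non-shuffle-file cleanup:

```scala
// Hypothetical sketch: model the two worker cleanup flags as independent
// settings rather than making one a precondition of the other.
object CleanupFlags {
  // Stand-in for SparkConf.getBoolean: look up a key, fall back to a default.
  def getBoolean(conf: Map[String, String], key: String, default: Boolean): Boolean =
    conf.get(key).map(_.toBoolean).getOrElse(default)

  def main(args: Array[String]): Unit = {
    // A user who keeps shuffle data disables full app-dir cleanup...
    val conf = Map("spark.worker.cleanup.enabled" -> "false")

    val cleanupEnabled =
      getBoolean(conf, "spark.worker.cleanup.enabled", default = false)
    // ...but non-shuffle-file cleanup, read independently, stays on by default.
    val cleanupNonShuffleFiles =
      getBoolean(conf, "spark.worker.cleanup.nonShuffleFiles.enabled", default = true)

    println(s"$cleanupEnabled $cleanupNonShuffleFiles")
  }
}
```

Running this prints `false true`: the non-shuffle-file flag takes effect even though the general cleanup flag is off, which is the behavior the comment argues for.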
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]