mridulm commented on code in PR #48960:
URL: https://github.com/apache/spark/pull/48960#discussion_r1885888412
##########
core/src/main/scala/org/apache/spark/storage/FallbackStorage.scala:
##########
@@ -155,6 +153,23 @@ private[spark] object FallbackStorage extends Logging {
FALLBACK_BLOCK_MANAGER_ID, blockId, StorageLevel.DISK_ONLY, memSize = 0,
dataLength)
}
+ /**
+ * Provide the Path for a shuffle file.
+ */
+ private[storage] def getPath(conf: SparkConf,
+ appId: String,
+ shuffleId: Int,
+ filename: String): Path = {
+ val fallbackPath = new
Path(conf.get(STORAGE_DECOMMISSION_FALLBACK_STORAGE_PATH).get)
+ val subPaths = conf.get(STORAGE_DECOMMISSION_FALLBACK_STORAGE_SUBPATHS)
Review Comment:
It would have been better to pull these `conf.get` as fields and move method
into the class ... but looks like `FallbackStorage.read` needs it too
unfortunately.
##########
core/src/main/scala/org/apache/spark/internal/config/package.scala:
##########
@@ -590,12 +590,24 @@ package object config {
ConfigBuilder("spark.storage.decommission.fallbackStorage.path")
.doc("The location for fallback storage during block manager
decommissioning. " +
"For example, `s3a://spark-storage/`. In case of empty, fallback
storage is disabled. " +
- "The storage should be managed by TTL because Spark will not clean it
up.")
+ "The storage should be managed by TTL because Spark will not clean it
up, " +
+ "unless spark.storage.decommission.fallbackStorage.cleanUp is true.")
.version("3.1.0")
.stringConf
.checkValue(_.endsWith(java.io.File.separator), "Path should end with
separator.")
.createOptional
+ private[spark] val STORAGE_DECOMMISSION_FALLBACK_STORAGE_SUBPATHS =
+ ConfigBuilder("spark.storage.decommission.fallbackStorage.subPaths")
+ .doc("The fallback storage puts all files of one shuffle in one
directory when this is 0. " +
+ "When this option is larger than 0, it will instead distribute the
files across " +
+ "this number of subdirectories.")
+ .version("4.0.0")
+ .intConf
+ .checkValue(_ >= 0, "The number of subdirectories must be 0 or larger.")
Review Comment:
`_ > 0` instead ? `subPaths == 1` gives us the behavior the of `0` (except
it will be one level deeper)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]