mridulm commented on code in PR #48960:
URL: https://github.com/apache/spark/pull/48960#discussion_r1885888412


##########
core/src/main/scala/org/apache/spark/storage/FallbackStorage.scala:
##########
@@ -155,6 +153,23 @@ private[spark] object FallbackStorage extends Logging {
       FALLBACK_BLOCK_MANAGER_ID, blockId, StorageLevel.DISK_ONLY, memSize = 0, 
dataLength)
   }
 
+  /**
+   * Provide the Path for a shuffle file.
+   */
+  private[storage] def getPath(conf: SparkConf,
+                               appId: String,
+                               shuffleId: Int,
+                               filename: String): Path = {
+    val fallbackPath = new 
Path(conf.get(STORAGE_DECOMMISSION_FALLBACK_STORAGE_PATH).get)
+    val subPaths = conf.get(STORAGE_DECOMMISSION_FALLBACK_STORAGE_SUBPATHS)

Review Comment:
   It would have been better to pull these `conf.get` as fields and move method 
into the class ... but looks like `FallbackStorage.read` needs it too 
unfortunately.



##########
core/src/main/scala/org/apache/spark/internal/config/package.scala:
##########
@@ -590,12 +590,24 @@ package object config {
     ConfigBuilder("spark.storage.decommission.fallbackStorage.path")
       .doc("The location for fallback storage during block manager 
decommissioning. " +
         "For example, `s3a://spark-storage/`. In case of empty, fallback 
storage is disabled. " +
-        "The storage should be managed by TTL because Spark will not clean it 
up.")
+        "The storage should be managed by TTL because Spark will not clean it 
up, " +
+        "unless spark.storage.decommission.fallbackStorage.cleanUp is true.")
       .version("3.1.0")
       .stringConf
       .checkValue(_.endsWith(java.io.File.separator), "Path should end with 
separator.")
       .createOptional
 
+  private[spark] val STORAGE_DECOMMISSION_FALLBACK_STORAGE_SUBPATHS =
+    ConfigBuilder("spark.storage.decommission.fallbackStorage.subPaths")
+      .doc("The fallback storage puts all files of one shuffle in one 
directory when this is 0. " +
+        "When this option is larger than 0, it will instead distribute the 
files across " +
+        "this number of subdirectories.")
+      .version("4.0.0")
+      .intConf
+      .checkValue(_ >= 0, "The number of subdirectories must be 0 or larger.")

Review Comment:
   `_ > 0` instead ? `subPaths == 1` gives us the behavior the of `0` (except 
it will be one level deeper)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to