Github user yucai commented on a diff in the pull request:
https://github.com/apache/spark/pull/10225#discussion_r47182068
--- Diff: core/src/main/scala/org/apache/spark/storage/DiskBlockManager.scala ---
@@ -80,6 +77,65 @@ private[spark] class DiskBlockManager(blockManager: BlockManager, conf: SparkCon
new File(subDir, filename)
}
+ def getFileDefault(filename: String): File = {
+ _getFile(filename, localDirs)
+ }
+ var getFile = getFileDefault _
+
+ private val hierarchyStore = conf.getOption("spark.storage.hierarchyStore")
+ if (hierarchyStore.isDefined) {
+ val HSLayers: Array[(String, Long)] =
+ hierarchyStore.get.trim.split(",").map {
+ s => val x = s.trim.split(" +")
+ (x(0), Utils.byteStringAsGb(x(1)))
+ }
+ val HSLayerDirs: Array[Array[File]] = HSLayers.map(
--- End diff --
Sorry for the misunderstanding! The devices in spark.storage.hierarchyStore
are listed from fastest to slowest.
Generally, a user needs two steps to configure the hierarchy store.
1. Set up the priority and threshold for each layer.
```
spark.storage.hierarchyStore='nvm 50GB,ssd 80GB'
```
This builds a three-layer hierarchy store: the 1st is "nvm", the 2nd is "ssd",
and all the remaining directories form the last layer.
2. Configure each layer's location. The user just needs to put the keywords
specified in step 1, such as "nvm" and "ssd", into the directory paths.
```
spark.local.dir=/mnt/nvm1,/mnt/ssd1,/mnt/ssd2,/mnt/ssd3,/mnt/disk1,/mnt/disk2,/mnt/disk3,/mnt/disk4,/mnt/others
```
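To make the two steps concrete, here is a minimal standalone sketch of how the two settings could be interpreted together. It is not the code from the patch: `parseGb` and `groupDirs` are hypothetical helpers (the patch itself calls `Utils.byteStringAsGb`), and matching a directory to a layer by checking whether its path contains the keyword is my assumption about the behaviour described above.

```scala
object HierarchyStoreConfigSketch {
  // Hypothetical helper: parse sizes such as "50GB" or "80g" into whole gigabytes.
  // The patch uses Spark's internal Utils.byteStringAsGb instead.
  def parseGb(s: String): Long = {
    val Size = """(?i)(\d+)\s*g(?:b)?""".r
    s.trim match {
      case Size(n) => n.toLong
      case other => throw new IllegalArgumentException(s"Unsupported size: $other")
    }
  }

  // Split "nvm 50GB,ssd 80GB" into (keyword, thresholdGb) pairs, fastest layer first.
  def parseLayers(conf: String): Array[(String, Long)] =
    conf.trim.split(",").map { s =>
      val parts = s.trim.split(" +")
      (parts(0), parseGb(parts(1)))
    }

  // Assumption: a directory belongs to the first layer whose keyword appears in its path.
  // Directories that match no keyword form the final layer.
  def groupDirs(layers: Array[(String, Long)], dirs: Seq[String]): Seq[Seq[String]] = {
    val keywords = layers.map(_._1)
    def layerOf(dir: String): Int = {
      val i = keywords.indexWhere(k => dir.contains(k))
      if (i >= 0) i else keywords.length
    }
    (0 to keywords.length).map(i => dirs.filter(d => layerOf(d) == i))
  }

  def main(args: Array[String]): Unit = {
    val layers = parseLayers("nvm 50GB,ssd 80GB")
    val dirs = ("/mnt/nvm1,/mnt/ssd1,/mnt/ssd2,/mnt/ssd3," +
      "/mnt/disk1,/mnt/disk2,/mnt/disk3,/mnt/disk4,/mnt/others").split(",").toSeq
    groupDirs(layers, dirs).zipWithIndex.foreach { case (ds, i) =>
      println(s"layer $i: ${ds.mkString(", ")}")
    }
  }
}
```

With the example values above, layer 0 would hold /mnt/nvm1, layer 1 the three ssd directories, and layer 2 everything else.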
After that, restart your Spark application; it will allocate blocks from
nvm first.
When nvm's usable space drops below 50GB, it starts to allocate from ssd.
When ssd's usable space drops below 80GB, it starts to allocate from the
last layer.
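The fallback rule can be sketched roughly as follows. This is only an illustration of the rule as described in this comment, not the implementation in the patch: `Layer` and `pickLayer` are hypothetical names, and the usable space is passed in as a plain number here, whereas a real implementation would have to query the file system (for example via `java.io.File.getUsableSpace`, which reports bytes).

```scala
object LayerSelectionSketch {
  // Hypothetical model of one configured layer: keyword, threshold, and current usable space.
  final case class Layer(name: String, thresholdGb: Long, usableGb: Long)

  // Walk the layers from fastest to slowest and pick the first one whose usable
  // space has not dropped below its threshold; otherwise fall back to the last layer.
  def pickLayer(layers: Seq[Layer], lastLayer: String): String =
    layers.find(l => l.usableGb >= l.thresholdGb).map(_.name).getOrElse(lastLayer)

  def main(args: Array[String]): Unit = {
    val plenty  = Seq(Layer("nvm", 50, 120), Layer("ssd", 80, 300))
    val nvmLow  = Seq(Layer("nvm", 50, 40),  Layer("ssd", 80, 300))
    val bothLow = Seq(Layer("nvm", 50, 40),  Layer("ssd", 80, 60))
    println(pickLayer(plenty, "others"))  // nvm
    println(pickLayer(nvmLow, "others"))  // ssd
    println(pickLayer(bothLow, "others")) // others
  }
}
```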
More details are in https://issues.apache.org/jira/browse/SPARK-12196
Is it still too confusing?