dcoliversun commented on code in PR #38131:
URL: https://github.com/apache/spark/pull/38131#discussion_r989182776
########## docs/configuration.md: ########## @@ -349,6 +349,23 @@ of the most common options to set are: </td> <td>3.0.0</td> </tr> +<tr> + <td><code>spark.executor.allowSparkContext</code></td> Review Comment: https://github.com/apache/spark/blob/22483167e20208e40e24abe6898b2102ddaf4fc9/core/src/main/scala/org/apache/spark/internal/config/package.scala#L2206-L2211 ########## docs/configuration.md: ########## @@ -468,6 +485,43 @@ of the most common options to set are: </td> <td>3.0.0</td> </tr> +<tr> + <td><code>spark.decommission.enabled</code></td> Review Comment: https://github.com/apache/spark/blob/22483167e20208e40e24abe6898b2102ddaf4fc9/core/src/main/scala/org/apache/spark/internal/config/package.scala#L2133-L2144 ########## docs/configuration.md: ########## @@ -847,6 +911,14 @@ Apart from these, the following properties are also available, and may be useful </td> <td>2.1.0</td> </tr> +<tr> + <td><code>spark.plugins</code></td> Review Comment: https://github.com/apache/spark/blob/22483167e20208e40e24abe6898b2102ddaf4fc9/core/src/main/scala/org/apache/spark/internal/config/package.scala#L1715-L1723 ########## docs/configuration.md: ########## @@ -1028,6 +1128,14 @@ Apart from these, the following properties are also available, and may be useful </td> <td>1.1.1</td> </tr> +<tr> + <td><code>spark.shuffle.sort.io.plugin.class</code></td> Review Comment: https://github.com/apache/spark/blob/22483167e20208e40e24abe6898b2102ddaf4fc9/core/src/main/scala/org/apache/spark/internal/config/package.scala#L1320-L1325 ########## docs/configuration.md: ########## @@ -1102,6 +1262,22 @@ Apart from these, the following properties are also available, and may be useful </td> <td>3.0.0</td> </tr> +<tr> + <td><code>spark.shuffle.service.db.enabled</code></td> Review Comment: https://github.com/apache/spark/blob/22483167e20208e40e24abe6898b2102ddaf4fc9/core/src/main/scala/org/apache/spark/internal/config/package.scala#L710-L716 ########## docs/configuration.md: ########## @@ -1063,6 +1171,58 @@ Apart from these, the following properties are also available, and may be useful </td> <td>2.3.0</td> </tr> +<tr> + <td><code>spark.shuffle.reduceLocality.enabled</code></td> + <td>true</td> + <td> + Whether to compute locality preferences for reduce tasks. + </td> + <td>1.5.0</td> +</tr> +<tr> + <td><code>spark.shuffle.mapOutput.minSizeForBroadcast</code></td> + <td>512k</td> + <td> + The size at which we use Broadcast to send the map output statuses to the executors. + </td> + <td>2.0.0</td> +</tr> +<tr> + <td><code>spark.shuffle.detectCorrupt</code></td> + <td>true</td> + <td> + Whether to detect any corruption in fetched blocks. + </td> + <td>2.2.0</td> +</tr> +<tr> + <td><code>spark.shuffle.detectCorrupt.useExtraMemory</code></td> + <td>false</td> + <td> + If enabled, part of a compressed/encrypted stream will be de-compressed/de-crypted by using extra memory + to detect early corruption. Any IOException thrown will cause the task to be retried once + and if it fails again with same exception, then FetchFailedException will be thrown to retry previous stage. + </td> + <td>3.0.0</td> +</tr> +<tr> + <td><code>spark.shuffle.useOldFetchProtocol</code></td> + <td>false</td> + <td> + Whether to use the old protocol while doing the shuffle block fetching. It is only enabled while we need the + compatibility in the scenario of new Spark version job fetching shuffle blocks from old version external shuffle service. 
+ </td> + <td>3.0.0</td> +</tr> +<tr> + <td><code>spark.shuffle.readHostLocalDisk</code></td> Review Comment: https://github.com/apache/spark/blob/22483167e20208e40e24abe6898b2102ddaf4fc9/core/src/main/scala/org/apache/spark/internal/config/package.scala#L1611-L1618 ########## docs/configuration.md: ########## @@ -468,6 +485,43 @@ of the most common options to set are: </td> <td>3.0.0</td> </tr> +<tr> + <td><code>spark.decommission.enabled</code></td> + <td>false</td> + <td> + When decommission enabled, Spark will try its best to shut down the executor gracefully. + Spark will try to migrate all the RDD blocks (controlled by <code>spark.storage.decommission.rddBlocks.enabled</code>) + and shuffle blocks (controlled by <code>spark.storage.decommission.shuffleBlocks.enabled</code>) from the decommissioning + executor to a remote executor when <code>spark.storage.decommission.enabled</code> is enabled. + With decommission enabled, Spark will also decommission an executor instead of killing when <code>spark.dynamicAllocation.enabled</code> enabled. + </td> + <td>3.1.0</td> +</tr> +<tr> + <td><code>spark.executor.decommission.killInterval</code></td> Review Comment: https://github.com/apache/spark/blob/22483167e20208e40e24abe6898b2102ddaf4fc9/core/src/main/scala/org/apache/spark/internal/config/package.scala#L2146-L2156 ########## docs/configuration.md: ########## @@ -1891,6 +2093,24 @@ Apart from these, the following properties are also available, and may be useful </td> <td>1.0.0</td> </tr> +<tr> + <td><code>spark.files.ignoreCorruptFiles</code></td> + <td>false</td> + <td> + Whether to ignore corrupt files. If true, the Spark jobs will continue to run when encountering corrupted or + non-existing files and contents that have been read will still be returned. + </td> + <td>2.1.0</td> +</tr> +<tr> + <td><code>spark.files.ignoreMissingFiles</code></td> Review Comment: https://github.com/apache/spark/blob/22483167e20208e40e24abe6898b2102ddaf4fc9/core/src/main/scala/org/apache/spark/internal/config/package.scala#L1081-L1086 ########## docs/configuration.md: ########## @@ -1944,6 +2164,67 @@ Apart from these, the following properties are also available, and may be useful </td> <td>0.9.2</td> </tr> +<tr> + <td><code>spark.storage.decommission.enabled</code></td> + <td>false</td> + <td> + Whether to decommission the block manager when decommissioning executor. + </td> + <td>3.1.0</td> +</tr> +<tr> + <td><code>spark.storage.decommission.shuffleBlocks.enabled</code></td> + <td>true</td> + <td> + Whether to transfer shuffle blocks during block manager decommissioning. Requires a migratable shuffle resolver + (like sort based shuffle). + </td> + <td>3.1.0</td> +</tr> +<tr> + <td><code>spark.storage.decommission.shuffleBlocks.maxThreads</code></td> + <td>8</td> + <td> + Maximum number of threads to use in migrating shuffle files. + </td> + <td>3.1.0</td> +</tr> +<tr> + <td><code>spark.storage.decommission.rddBlocks.enabled</code></td> Review Comment: https://github.com/apache/spark/blob/22483167e20208e40e24abe6898b2102ddaf4fc9/core/src/main/scala/org/apache/spark/internal/config/package.scala#L474-L479 ########## docs/configuration.md: ########## @@ -1944,6 +2164,67 @@ Apart from these, the following properties are also available, and may be useful </td> <td>0.9.2</td> </tr> +<tr> + <td><code>spark.storage.decommission.enabled</code></td> + <td>false</td> + <td> + Whether to decommission the block manager when decommissioning executor. 
+ </td> + <td>3.1.0</td> +</tr> +<tr> + <td><code>spark.storage.decommission.shuffleBlocks.enabled</code></td> + <td>true</td> + <td> + Whether to transfer shuffle blocks during block manager decommissioning. Requires a migratable shuffle resolver + (like sort based shuffle). + </td> + <td>3.1.0</td> +</tr> +<tr> + <td><code>spark.storage.decommission.shuffleBlocks.maxThreads</code></td> + <td>8</td> + <td> + Maximum number of threads to use in migrating shuffle files. + </td> + <td>3.1.0</td> +</tr> +<tr> + <td><code>spark.storage.decommission.rddBlocks.enabled</code></td> + <td>true</td> + <td> + Whether to transfer RDD blocks during block manager decommissioning. + </td> + <td>3.1.0</td> +</tr> +<tr> + <td><code>spark.storage.decommission.fallbackStorage.path</code></td> Review Comment: https://github.com/apache/spark/blob/22483167e20208e40e24abe6898b2102ddaf4fc9/core/src/main/scala/org/apache/spark/internal/config/package.scala#L502-L510 ########## docs/configuration.md: ########## @@ -2321,6 +2630,16 @@ Apart from these, the following properties are also available, and may be useful </td> <td>2.4.1</td> </tr> +<tr> + <td><code>spark.standalone.submit.waitAppCompletion</code></td> Review Comment: https://github.com/apache/spark/blob/22483167e20208e40e24abe6898b2102ddaf4fc9/core/src/main/scala/org/apache/spark/internal/config/package.scala#L2197-L2204 ########## docs/configuration.md: ########## @@ -3360,6 +3688,15 @@ Push-based shuffle helps improve the reliability and performance of spark shuffl </td> <td>3.2.0</td> </tr> +<tr> + <td><code>spark.shuffle.push.merge.finalizeThreads</code></td> Review Comment: https://github.com/apache/spark/blob/22483167e20208e40e24abe6898b2102ddaf4fc9/core/src/main/scala/org/apache/spark/internal/config/package.scala#L2330-L2338 ########## docs/configuration.md: ########## @@ -1944,6 +2164,67 @@ Apart from these, the following properties are also available, and may be useful </td> <td>0.9.2</td> </tr> +<tr> + <td><code>spark.storage.decommission.enabled</code></td> + <td>false</td> + <td> + Whether to decommission the block manager when decommissioning executor. + </td> + <td>3.1.0</td> +</tr> +<tr> + <td><code>spark.storage.decommission.shuffleBlocks.enabled</code></td> + <td>true</td> + <td> + Whether to transfer shuffle blocks during block manager decommissioning. Requires a migratable shuffle resolver + (like sort based shuffle). + </td> + <td>3.1.0</td> +</tr> +<tr> + <td><code>spark.storage.decommission.shuffleBlocks.maxThreads</code></td> + <td>8</td> + <td> + Maximum number of threads to use in migrating shuffle files. + </td> + <td>3.1.0</td> +</tr> +<tr> + <td><code>spark.storage.decommission.rddBlocks.enabled</code></td> + <td>true</td> + <td> + Whether to transfer RDD blocks during block manager decommissioning. + </td> + <td>3.1.0</td> +</tr> +<tr> + <td><code>spark.storage.decommission.fallbackStorage.path</code></td> + <td>(none)</td> + <td> + The location for fallback storage during block manager decommissioning. For example, <code>s3a://spark-storage/</code>. + In case of empty, fallback storage is disabled. The storage should be managed by TTL because Spark will not clean it up. + </td> + <td>3.1.0</td> +</tr> +<tr> + <td><code>spark.storage.decommission.fallbackStorage.cleanUp</code></td> + <td>false</td> + <td> + If true, Spark cleans up its fallback storage data during shutting down. 
+ </td> + <td>3.2.0</td> +</tr> +<tr> + <td><code>spark.storage.decommission.shuffleBlocks.maxDiskSize</code></td> Review Comment: https://github.com/apache/spark/blob/22483167e20208e40e24abe6898b2102ddaf4fc9/core/src/main/scala/org/apache/spark/internal/config/package.scala#L519-L528 ########## docs/configuration.md: ########## @@ -468,6 +485,43 @@ of the most common options to set are: </td> <td>3.0.0</td> </tr> +<tr> + <td><code>spark.decommission.enabled</code></td> + <td>false</td> + <td> + When decommission enabled, Spark will try its best to shut down the executor gracefully. + Spark will try to migrate all the RDD blocks (controlled by <code>spark.storage.decommission.rddBlocks.enabled</code>) + and shuffle blocks (controlled by <code>spark.storage.decommission.shuffleBlocks.enabled</code>) from the decommissioning + executor to a remote executor when <code>spark.storage.decommission.enabled</code> is enabled. + With decommission enabled, Spark will also decommission an executor instead of killing when <code>spark.dynamicAllocation.enabled</code> enabled. + </td> + <td>3.1.0</td> +</tr> +<tr> + <td><code>spark.executor.decommission.killInterval</code></td> + <td>(none)</td> + <td> + Duration after which a decommissioned executor will be killed forcefully by an outside (e.g. non-spark) service. + </td> + <td>3.1.0</td> +</tr> +<tr> + <td><code>spark.executor.decommission.forceKillTimeout</code></td> Review Comment: https://github.com/apache/spark/blob/22483167e20208e40e24abe6898b2102ddaf4fc9/core/src/main/scala/org/apache/spark/internal/config/package.scala#L2158-L2165 ########## docs/configuration.md: ########## @@ -681,14 +735,24 @@ Apart from these, the following properties are also available, and may be useful </tr> <tr> <td><code>spark.redaction.regex</code></td> - <td>(?i)secret|password|token</td> + <td>(?i)secret|password|token|access[.]key</td> Review Comment: https://github.com/apache/spark/blob/22483167e20208e40e24abe6898b2102ddaf4fc9/core/src/main/scala/org/apache/spark/internal/config/package.scala#L1116-L1124 ########## docs/configuration.md: ########## @@ -906,6 +978,23 @@ Apart from these, the following properties are also available, and may be useful </td> <td>1.4.0</td> </tr> +<tr> + <td><code>spark.shuffle.unsafe.file.output.buffer</code></td> + <td>32k</td> + <td> + The file system for this buffer size after each partition is written in unsafe shuffle writer. + In KiB unless otherwise specified. + </td> + <td>2.3.0</td> +</tr> +<tr> + <td><code>spark.shuffle.spill.diskWriteBufferSize</code></td> Review Comment: https://github.com/apache/spark/blob/22483167e20208e40e24abe6898b2102ddaf4fc9/core/src/main/scala/org/apache/spark/internal/config/package.scala#L1350-L1358 ########## docs/configuration.md: ########## @@ -468,6 +485,43 @@ of the most common options to set are: </td> <td>3.0.0</td> </tr> +<tr> + <td><code>spark.decommission.enabled</code></td> + <td>false</td> + <td> + When decommission enabled, Spark will try its best to shut down the executor gracefully. + Spark will try to migrate all the RDD blocks (controlled by <code>spark.storage.decommission.rddBlocks.enabled</code>) + and shuffle blocks (controlled by <code>spark.storage.decommission.shuffleBlocks.enabled</code>) from the decommissioning + executor to a remote executor when <code>spark.storage.decommission.enabled</code> is enabled. + With decommission enabled, Spark will also decommission an executor instead of killing when <code>spark.dynamicAllocation.enabled</code> enabled. 
+ </td> + <td>3.1.0</td> +</tr> +<tr> + <td><code>spark.executor.decommission.killInterval</code></td> + <td>(none)</td> + <td> + Duration after which a decommissioned executor will be killed forcefully by an outside (e.g. non-spark) service. + </td> + <td>3.1.0</td> +</tr> +<tr> + <td><code>spark.executor.decommission.forceKillTimeout</code></td> + <td>(none)</td> + <td> + Duration after which a Spark will force a decommissioning executor to exit. + This should be set to a high value in most situations as low values will prevent block migrations from having enough time to complete. + </td> + <td>3.2.0</td> +</tr> +<tr> + <td><code>spark.executor.decommission.signal</code></td> Review Comment: https://github.com/apache/spark/blob/22483167e20208e40e24abe6898b2102ddaf4fc9/core/src/main/scala/org/apache/spark/internal/config/package.scala#L2167-L2172 ########## docs/configuration.md: ########## @@ -988,6 +1077,17 @@ Apart from these, the following properties are also available, and may be useful </td> <td>1.2.0</td> </tr> +<tr> + <td><code>spark.shuffle.service.name</code></td> Review Comment: https://github.com/apache/spark/blob/22483167e20208e40e24abe6898b2102ddaf4fc9/core/src/main/scala/org/apache/spark/internal/config/package.scala#L731-L739 ########## docs/configuration.md: ########## @@ -1063,6 +1171,58 @@ Apart from these, the following properties are also available, and may be useful </td> <td>2.3.0</td> </tr> +<tr> + <td><code>spark.shuffle.reduceLocality.enabled</code></td> Review Comment: https://github.com/apache/spark/blob/22483167e20208e40e24abe6898b2102ddaf4fc9/core/src/main/scala/org/apache/spark/internal/config/package.scala#L1534-L1539 ########## docs/configuration.md: ########## @@ -681,14 +735,24 @@ Apart from these, the following properties are also available, and may be useful </tr> <tr> <td><code>spark.redaction.regex</code></td> - <td>(?i)secret|password|token</td> + <td>(?i)secret|password|token|access[.]key</td> <td> Regex to decide which Spark configuration properties and environment variables in driver and executor environments contain sensitive information. When this regex matches a property key or value, the value is redacted from the environment UI and various logs like YARN and event logs. </td> <td>2.1.2</td> </tr> +<tr> + <td><code>spark.redaction.string.regex</code></td> Review Comment: https://github.com/apache/spark/blob/22483167e20208e40e24abe6898b2102ddaf4fc9/core/src/main/scala/org/apache/spark/internal/config/package.scala#L1126-L1133 ########## docs/configuration.md: ########## @@ -906,6 +978,23 @@ Apart from these, the following properties are also available, and may be useful </td> <td>1.4.0</td> </tr> +<tr> + <td><code>spark.shuffle.unsafe.file.output.buffer</code></td> Review Comment: https://github.com/apache/spark/blob/22483167e20208e40e24abe6898b2102ddaf4fc9/core/src/main/scala/org/apache/spark/internal/config/package.scala#L1339-L1348 ########## docs/configuration.md: ########## @@ -1063,6 +1171,58 @@ Apart from these, the following properties are also available, and may be useful </td> <td>2.3.0</td> </tr> +<tr> + <td><code>spark.shuffle.reduceLocality.enabled</code></td> + <td>true</td> + <td> + Whether to compute locality preferences for reduce tasks. 
+ </td> + <td>1.5.0</td> +</tr> +<tr> + <td><code>spark.shuffle.mapOutput.minSizeForBroadcast</code></td> Review Comment: https://github.com/apache/spark/blob/22483167e20208e40e24abe6898b2102ddaf4fc9/core/src/main/scala/org/apache/spark/internal/config/package.scala#L1541-L1546 ########## docs/configuration.md: ########## @@ -1063,6 +1171,58 @@ Apart from these, the following properties are also available, and may be useful </td> <td>2.3.0</td> </tr> +<tr> + <td><code>spark.shuffle.reduceLocality.enabled</code></td> + <td>true</td> + <td> + Whether to compute locality preferences for reduce tasks. + </td> + <td>1.5.0</td> +</tr> +<tr> + <td><code>spark.shuffle.mapOutput.minSizeForBroadcast</code></td> + <td>512k</td> + <td> + The size at which we use Broadcast to send the map output statuses to the executors. + </td> + <td>2.0.0</td> +</tr> +<tr> + <td><code>spark.shuffle.detectCorrupt</code></td> + <td>true</td> + <td> + Whether to detect any corruption in fetched blocks. + </td> + <td>2.2.0</td> +</tr> +<tr> + <td><code>spark.shuffle.detectCorrupt.useExtraMemory</code></td> + <td>false</td> + <td> + If enabled, part of a compressed/encrypted stream will be de-compressed/de-crypted by using extra memory + to detect early corruption. Any IOException thrown will cause the task to be retried once + and if it fails again with same exception, then FetchFailedException will be thrown to retry previous stage. + </td> + <td>3.0.0</td> +</tr> +<tr> + <td><code>spark.shuffle.useOldFetchProtocol</code></td> Review Comment: https://github.com/apache/spark/blob/22483167e20208e40e24abe6898b2102ddaf4fc9/core/src/main/scala/org/apache/spark/internal/config/package.scala#L1602-L1609 ########## docs/configuration.md: ########## @@ -1063,6 +1171,58 @@ Apart from these, the following properties are also available, and may be useful </td> <td>2.3.0</td> </tr> +<tr> + <td><code>spark.shuffle.reduceLocality.enabled</code></td> + <td>true</td> + <td> + Whether to compute locality preferences for reduce tasks. + </td> + <td>1.5.0</td> +</tr> +<tr> + <td><code>spark.shuffle.mapOutput.minSizeForBroadcast</code></td> + <td>512k</td> + <td> + The size at which we use Broadcast to send the map output statuses to the executors. + </td> + <td>2.0.0</td> +</tr> +<tr> + <td><code>spark.shuffle.detectCorrupt</code></td> Review Comment: https://github.com/apache/spark/blob/22483167e20208e40e24abe6898b2102ddaf4fc9/core/src/main/scala/org/apache/spark/internal/config/package.scala#L1554-L1559 ########## docs/configuration.md: ########## @@ -1735,6 +1911,14 @@ Apart from these, the following properties are also available, and may be useful </td> <td>1.6.0</td> </tr> +<tr> + <td><code>spark.storage.unrollMemoryThreshold</code></td> Review Comment: https://github.com/apache/spark/blob/22483167e20208e40e24abe6898b2102ddaf4fc9/core/src/main/scala/org/apache/spark/internal/config/package.scala#L400-L405 ########## docs/configuration.md: ########## @@ -1063,6 +1171,58 @@ Apart from these, the following properties are also available, and may be useful </td> <td>2.3.0</td> </tr> +<tr> + <td><code>spark.shuffle.reduceLocality.enabled</code></td> + <td>true</td> + <td> + Whether to compute locality preferences for reduce tasks. + </td> + <td>1.5.0</td> +</tr> +<tr> + <td><code>spark.shuffle.mapOutput.minSizeForBroadcast</code></td> + <td>512k</td> + <td> + The size at which we use Broadcast to send the map output statuses to the executors. 
+ </td> + <td>2.0.0</td> +</tr> +<tr> + <td><code>spark.shuffle.detectCorrupt</code></td> + <td>true</td> + <td> + Whether to detect any corruption in fetched blocks. + </td> + <td>2.2.0</td> +</tr> +<tr> + <td><code>spark.shuffle.detectCorrupt.useExtraMemory</code></td> Review Comment: https://github.com/apache/spark/blob/22483167e20208e40e24abe6898b2102ddaf4fc9/core/src/main/scala/org/apache/spark/internal/config/package.scala#L1561-L1569 ########## docs/configuration.md: ########## @@ -1102,6 +1262,22 @@ Apart from these, the following properties are also available, and may be useful </td> <td>3.0.0</td> </tr> +<tr> + <td><code>spark.shuffle.service.db.enabled</code></td> + <td>true</td> + <td> + Whether to use db in ExternalShuffleService. Note that this only affects standalone mode. + </td> + <td>3.0.0</td> +</tr> +<tr> + <td><code>spark.shuffle.service.db.backend</code></td> Review Comment: https://github.com/apache/spark/blob/22483167e20208e40e24abe6898b2102ddaf4fc9/core/src/main/scala/org/apache/spark/internal/config/package.scala#L718-L726 ########## docs/configuration.md: ########## @@ -1944,6 +2164,67 @@ Apart from these, the following properties are also available, and may be useful </td> <td>0.9.2</td> </tr> +<tr> + <td><code>spark.storage.decommission.enabled</code></td> Review Comment: https://github.com/apache/spark/blob/22483167e20208e40e24abe6898b2102ddaf4fc9/core/src/main/scala/org/apache/spark/internal/config/package.scala#L451-L456 ########## docs/configuration.md: ########## @@ -1816,6 +2010,14 @@ Apart from these, the following properties are also available, and may be useful </td> <td>2.1.1</td> </tr> +<tr> + <td><code>spark.broadcast.UDFCompressionThreshold</code></td> Review Comment: https://github.com/apache/spark/blob/22483167e20208e40e24abe6898b2102ddaf4fc9/core/src/main/scala/org/apache/spark/internal/config/package.scala#L1934-L1941 ########## docs/configuration.md: ########## @@ -1745,6 +1929,16 @@ Apart from these, the following properties are also available, and may be useful </td> <td>2.2.0</td> </tr> +<tr> + <td><code>spark.storage.localDiskByExecutors.cacheSize</code></td> Review Comment: https://github.com/apache/spark/blob/22483167e20208e40e24abe6898b2102ddaf4fc9/core/src/main/scala/org/apache/spark/internal/config/package.scala#L1620-L1629 ########## docs/configuration.md: ########## @@ -1891,6 +2093,24 @@ Apart from these, the following properties are also available, and may be useful </td> <td>1.0.0</td> </tr> +<tr> + <td><code>spark.files.ignoreCorruptFiles</code></td> Review Comment: https://github.com/apache/spark/blob/22483167e20208e40e24abe6898b2102ddaf4fc9/core/src/main/scala/org/apache/spark/internal/config/package.scala#L1073-L1079 ########## docs/configuration.md: ########## @@ -1944,6 +2164,67 @@ Apart from these, the following properties are also available, and may be useful </td> <td>0.9.2</td> </tr> +<tr> + <td><code>spark.storage.decommission.enabled</code></td> + <td>false</td> + <td> + Whether to decommission the block manager when decommissioning executor. + </td> + <td>3.1.0</td> +</tr> +<tr> + <td><code>spark.storage.decommission.shuffleBlocks.enabled</code></td> + <td>true</td> + <td> + Whether to transfer shuffle blocks during block manager decommissioning. Requires a migratable shuffle resolver + (like sort based shuffle). 
+ </td> + <td>3.1.0</td> +</tr> +<tr> + <td><code>spark.storage.decommission.shuffleBlocks.maxThreads</code></td> Review Comment: https://github.com/apache/spark/blob/22483167e20208e40e24abe6898b2102ddaf4fc9/core/src/main/scala/org/apache/spark/internal/config/package.scala#L466-L472 ########## docs/configuration.md: ########## @@ -1944,6 +2164,67 @@ Apart from these, the following properties are also available, and may be useful </td> <td>0.9.2</td> </tr> +<tr> + <td><code>spark.storage.decommission.enabled</code></td> + <td>false</td> + <td> + Whether to decommission the block manager when decommissioning executor. + </td> + <td>3.1.0</td> +</tr> +<tr> + <td><code>spark.storage.decommission.shuffleBlocks.enabled</code></td> Review Comment: https://github.com/apache/spark/blob/22483167e20208e40e24abe6898b2102ddaf4fc9/core/src/main/scala/org/apache/spark/internal/config/package.scala#L458-L464 ########## docs/configuration.md: ########## @@ -3342,6 +3661,15 @@ Push-based shuffle helps improve the reliability and performance of spark shuffl </td> <td>3.2.0</td> </tr> +<tr> + <td><code>spark.shuffle.push.numPushThreads</code></td> Review Comment: https://github.com/apache/spark/blob/22483167e20208e40e24abe6898b2102ddaf4fc9/core/src/main/scala/org/apache/spark/internal/config/package.scala#L2301-L2308 ########## docs/configuration.md: ########## @@ -1944,6 +2164,67 @@ Apart from these, the following properties are also available, and may be useful </td> <td>0.9.2</td> </tr> +<tr> + <td><code>spark.storage.decommission.enabled</code></td> + <td>false</td> + <td> + Whether to decommission the block manager when decommissioning executor. + </td> + <td>3.1.0</td> +</tr> +<tr> + <td><code>spark.storage.decommission.shuffleBlocks.enabled</code></td> + <td>true</td> + <td> + Whether to transfer shuffle blocks during block manager decommissioning. Requires a migratable shuffle resolver + (like sort based shuffle). + </td> + <td>3.1.0</td> +</tr> +<tr> + <td><code>spark.storage.decommission.shuffleBlocks.maxThreads</code></td> + <td>8</td> + <td> + Maximum number of threads to use in migrating shuffle files. + </td> + <td>3.1.0</td> +</tr> +<tr> + <td><code>spark.storage.decommission.rddBlocks.enabled</code></td> + <td>true</td> + <td> + Whether to transfer RDD blocks during block manager decommissioning. + </td> + <td>3.1.0</td> +</tr> +<tr> + <td><code>spark.storage.decommission.fallbackStorage.path</code></td> + <td>(none)</td> + <td> + The location for fallback storage during block manager decommissioning. For example, <code>s3a://spark-storage/</code>. + In case of empty, fallback storage is disabled. The storage should be managed by TTL because Spark will not clean it up. + </td> + <td>3.1.0</td> +</tr> +<tr> + <td><code>spark.storage.decommission.fallbackStorage.cleanUp</code></td> Review Comment: https://github.com/apache/spark/blob/22483167e20208e40e24abe6898b2102ddaf4fc9/core/src/main/scala/org/apache/spark/internal/config/package.scala#L512-L517 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
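
The rows quoted in the hunks above document Spark's executor-decommissioning and block-migration settings. As a minimal sketch of how those keys might be combined, the Scala snippet below sets them on a SparkConf: the configuration names and the s3a://spark-storage/ example path are taken from the documentation being added in this PR, while the SparkSession builder code, the app name, and the chosen values are illustrative assumptions only, not anything proposed by the patch.

import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

// Graceful executor decommissioning, with RDD and shuffle blocks migrated
// off the decommissioning executor (keys as documented in the hunks above).
val conf = new SparkConf()
  .set("spark.decommission.enabled", "true")
  .set("spark.storage.decommission.enabled", "true")
  .set("spark.storage.decommission.rddBlocks.enabled", "true")
  .set("spark.storage.decommission.shuffleBlocks.enabled", "true")
  // Optional fallback location used during block manager decommissioning.
  // The s3a path is the docs' example; Spark will not clean this storage up,
  // so it should be managed externally (e.g. via a TTL policy).
  .set("spark.storage.decommission.fallbackStorage.path", "s3a://spark-storage/")

val spark = SparkSession.builder()
  .appName("decommission-sketch")  // hypothetical application name
  .config(conf)
  .getOrCreate()

In practice these are cluster-level settings, so they would more typically be supplied through spark-defaults.conf or --conf flags on spark-submit than hard-coded in application code.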
