This is an automated email from the ASF dual-hosted git repository.
dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new 43ca0b929ab [SPARK-46299][DOCS] Make `spark.deploy.recovery*` docs up-to-date
43ca0b929ab is described below
commit 43ca0b929ab3c2f10d1879e5df622195564f8885
Author: Dongjoon Hyun <[email protected]>
AuthorDate: Wed Dec 6 19:19:41 2023 -0800
[SPARK-46299][DOCS] Make `spark.deploy.recovery*` docs up-to-date
### What changes were proposed in this pull request?
This PR aims to bring the `Spark Standalone` cluster recovery configuration docs up to date.
### Why are the changes needed?
We need to document
- #44173
- #44129
- #44113

### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Manual review.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #44227 from dongjoon-hyun/SPARK-46299.
Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
---
docs/spark-standalone.md | 26 +++++++++++++++++++++++---
1 file changed, 23 insertions(+), 3 deletions(-)
diff --git a/docs/spark-standalone.md b/docs/spark-standalone.md
index 7a89c8124bd..25d2fba47ce 100644
--- a/docs/spark-standalone.md
+++ b/docs/spark-standalone.md
@@ -735,18 +735,38 @@ In order to enable this recovery mode, you can set SPARK_DAEMON_JAVA_OPTS in spa
<tr>
<td><code>spark.deploy.recoveryMode</code></td>
<td>NONE</td>
- <td>The recovery mode setting to recover submitted Spark jobs with cluster mode when it failed and relaunches.
- Set to FILESYSTEM to enable single-node recovery mode, ZOOKEEPER to use Zookeeper-based recovery mode, and
+ <td>The recovery mode setting to recover submitted Spark jobs in cluster mode when the Master fails and is relaunched. Set to
+ FILESYSTEM to enable file-system-based single-node recovery mode,
+ ROCKSDB to enable RocksDB-based single-node recovery mode,
+ ZOOKEEPER to use Zookeeper-based recovery mode, and
CUSTOM to provide a custom provider class via additional `spark.deploy.recoveryMode.factory` configuration.
+ NONE is the default value, which disables this recovery mode.
</td>
<td>0.8.1</td>
</tr>
<tr>
<td><code>spark.deploy.recoveryDirectory</code></td>
<td>""</td>
- <td>The directory in which Spark will store recovery state, accessible from the Master's perspective.</td>
+ <td>The directory in which Spark will store recovery state, accessible from the Master's perspective.
+ Note that the directory should be cleared manually if <code>spark.deploy.recoveryMode</code>,
+ <code>spark.deploy.recoverySerializer</code>, or <code>spark.deploy.recoveryCompressionCodec</code> is changed.
+ </td>
<td>0.8.1</td>
</tr>
+ <tr>
+ <td><code>spark.deploy.recoverySerializer</code></td>
+ <td>JAVA</td>
+ <td>A serializer for writing/reading objects to/from persistence engines: JAVA (default) or KRYO.
+ The Java serializer has been the default mode since Spark 0.8.1.
+ The Kryo serializer is a new, faster and more compact mode available since Spark 4.0.0.</td>
+ <td>4.0.0</td>
+ </tr>
+ <tr>
+ <td><code>spark.deploy.recoveryCompressionCodec</code></td>
+ <td>(none)</td>
+ <td>A compression codec for persistence engines. Supported values are none (default), lz4, lzf, snappy, and zstd. Currently, only FILESYSTEM mode supports this configuration.</td>
+ <td>4.0.0</td>
+ </tr>
<tr>
<td><code>spark.deploy.recoveryMode.factory</code></td>
<td>""</td>
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]