This is an automated email from the ASF dual-hosted git repository.
dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new 032dcf89c19b [SPARK-53926][DOCS] Document newly added `core` module configurations
032dcf89c19b is described below
commit 032dcf89c19bf05c550b90edd9491f3f0a756523
Author: Dongjoon Hyun <[email protected]>
AuthorDate: Wed Oct 15 19:02:34 2025 -0700
[SPARK-53926][DOCS] Document newly added `core` module configurations
### What changes were proposed in this pull request?
This PR aims to document newly added `core` module configurations as a part of Apache Spark 4.1.0 preparation.
### Why are the changes needed?
To help users use the new features easily. The configurations documented here come from the following PRs; a brief usage sketch follows the list.
- https://github.com/apache/spark/pull/47856
- https://github.com/apache/spark/pull/51130
- https://github.com/apache/spark/pull/51163
- https://github.com/apache/spark/pull/51604
- https://github.com/apache/spark/pull/51630
- https://github.com/apache/spark/pull/51708
- https://github.com/apache/spark/pull/51885
- https://github.com/apache/spark/pull/52091
- https://github.com/apache/spark/pull/52382
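As a quick, non-authoritative illustration (not part of the committed docs), the console-redirect settings documented by this patch could be wired together roughly as below. The config keys and the plugin class come from the diff; the app name, master, and chosen values are placeholders.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

// Minimal sketch: send driver/executor console output to the logging system.
// Keys and the plugin class are taken from the documentation added in this patch;
// the app name, master, and values are illustrative only.
val conf = new SparkConf()
  .setAppName("redirect-console-example")   // placeholder name
  .setMaster("local[*]")                    // placeholder for local testing
  .set("spark.plugins", "org.apache.spark.deploy.RedirectConsolePlugin")
  .set("spark.driver.log.redirectConsoleOutputs", "stdout,stderr") // default per the docs
  .set("spark.executor.logs.redirectConsoleOutputs", "stderr")     // e.g. only redirect stderr

val spark = SparkSession.builder().config(conf).getOrCreate()
```

The same keys could equally be passed with `--conf` at submit time or placed in `spark-defaults.conf`; since the plugin is loaded when the driver starts, the settings need to be in place before the `SparkSession` is created.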
### Does this PR introduce _any_ user-facing change?
No behavior change because this is a documentation update.
### How was this patch tested?
Manual review.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #52626 from dongjoon-hyun/SPARK-53926.
Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
---
docs/configuration.md | 109 ++++++++++++++++++++++++++++++++++++++++++++++++++
docs/monitoring.md | 8 ++++
2 files changed, 117 insertions(+)
diff --git a/docs/configuration.md b/docs/configuration.md
index b999a6ee2577..e9dbfa2b4f03 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -523,6 +523,16 @@ of the most common options to set are:
</td>
<td>3.0.0</td>
</tr>
+<tr>
+ <td><code>spark.driver.log.redirectConsoleOutputs</code></td>
+ <td>stdout,stderr</td>
+ <td>
+ Comma-separated list of console output kinds for the driver that need to be redirected
+ to the logging system. Supported values are `stdout` and `stderr`. It only takes effect when
+ `spark.plugins` is configured with `org.apache.spark.deploy.RedirectConsolePlugin`.
+ </td>
+ <td>4.1.0</td>
+</tr>
<tr>
<td><code>spark.decommission.enabled</code></td>
<td>false</td>
@@ -772,6 +782,16 @@ Apart from these, the following properties are also available, and may be useful
</td>
<td>1.1.0</td>
</tr>
+<tr>
+ <td><code>spark.executor.logs.redirectConsoleOutputs</code></td>
+ <td>stdout,stderr</td>
+ <td>
+ Comma-separated list of console output kinds for executors that need to be redirected
+ to the logging system. Supported values are `stdout` and `stderr`. It only takes effect when
+ `spark.plugins` is configured with `org.apache.spark.deploy.RedirectConsolePlugin`.
+ </td>
+ <td>4.1.0</td>
+</tr>
<tr>
<td><code>spark.executor.userClassPathFirst</code></td>
<td>false</td>
@@ -857,6 +877,47 @@ Apart from these, the following properties are also available, and may be useful
</td>
<td>1.2.0</td>
</tr>
+<tr>
+ <td><code>spark.python.factory.idleWorkerMaxPoolSize</code></td>
+ <td>(none)</td>
+ <td>
+ Maximum number of idle Python workers to keep. If unset, the number is unbounded.
+ If set to a positive integer N, at most N idle workers are retained;
+ least-recently used workers are evicted first.
+ </td>
+ <td>4.1.0</td>
+</tr>
+<tr>
+ <td><code>spark.python.worker.killOnIdleTimeout</code></td>
+ <td>false</td>
+ <td>
+ Whether Spark should terminate the Python worker process when the idle timeout
+ (as defined by <code>spark.python.worker.idleTimeoutSeconds</code>) is reached. If enabled,
+ Spark will terminate the Python worker process in addition to logging the status.
+ </td>
+ <td>4.1.0</td>
+</tr>
+<tr>
+ <td><code>spark.python.worker.tracebackDumpIntervalSeconds</code></td>
+ <td>0</td>
+ <td>
+ The interval (in seconds) for Python workers to dump their tracebacks.
+ If it is positive, the Python worker will periodically dump the traceback into
+ its `stderr`. The default is `0`, which means it is disabled.
+ </td>
+ <td>4.1.0</td>
+</tr>
+<tr>
+ <td><code>spark.python.unix.domain.socket.enabled</code></td>
+ <td>false</td>
+ <td>
+ When set to true, the Python driver uses a Unix domain socket for operations like
+ creating or collecting a DataFrame from local data, using accumulators, and executing
+ Python functions with PySpark such as Python UDFs. This configuration only applies
+ to Spark Classic and Spark Connect server.
+ </td>
+ <td>4.1.0</td>
+</tr>
<tr>
<td><code>spark.files</code></td>
<td></td>
@@ -873,6 +934,16 @@ Apart from these, the following properties are also available, and may be useful
</td>
<td>1.0.1</td>
</tr>
+<tr>
+ <td><code>spark.submit.callSystemExitOnMainExit</code></td>
+ <td>false</td>
+ <td>
+ If true, SparkSubmit will call System.exit() to initiate JVM shutdown once the
+ user's main method has exited. This can be useful in cases where non-daemon JVM
+ threads might otherwise prevent the JVM from shutting down on its own.
+ </td>
+ <td>4.1.0</td>
+</tr>
<tr>
<td><code>spark.jars</code></td>
<td></td>
@@ -1431,6 +1502,14 @@ Apart from these, the following properties are also available, and may be useful
</td>
<td>3.0.0</td>
</tr>
+<tr>
+ <td><code>spark.eventLog.excludedPatterns</code></td>
+ <td>(none)</td>
+ <td>
+ Specifies comma-separated event names to be excluded from the event logs.
+ </td>
+ <td>4.1.0</td>
+</tr>
<tr>
<td><code>spark.eventLog.dir</code></td>
<td>file:///tmp/spark-events</td>
@@ -1905,6 +1984,15 @@ Apart from these, the following properties are also available, and may be useful
</td>
<td>3.2.0</td>
</tr>
+<tr>
+ <td><code>spark.io.compression.zstd.strategy</code></td>
+ <td>(none)</td>
+ <td>
+ Compression strategy for the Zstd compression codec. The higher the value, the more
+ complex the compression becomes, usually resulting in stronger but slower compression or higher CPU cost.
+ </td>
+ <td>4.1.0</td>
+</tr>
<tr>
<td><code>spark.io.compression.zstd.workers</code></td>
<td>0</td>
@@ -2092,6 +2180,17 @@ Apart from these, the following properties are also available, and may be useful
</td>
<td>1.6.0</td>
</tr>
+<tr>
+ <td><code>spark.memory.unmanagedMemoryPollingInterval</code></td>
+ <td>0s</td>
+ <td>
+ Interval for polling unmanaged memory users to track their memory usage.
+ Unmanaged memory users are components that manage their own memory outside of
+ Spark's core memory management, such as RocksDB for Streaming State Store.
+ Setting this to 0 disables unmanaged memory polling.
+ </td>
+ <td>4.1.0</td>
+</tr>
<tr>
<td><code>spark.storage.unrollMemoryThreshold</code></td>
<td>1024 * 1024</td>
@@ -2543,6 +2642,16 @@ Apart from these, the following properties are also available, and may be useful
</td>
<td>0.7.0</td>
</tr>
+<tr>
+ <td><code>spark.driver.metrics.pollingInterval</code></td>
+ <td>10s</td>
+ <td>
+ How often to collect driver metrics (in milliseconds).
+ If unset, polling is done at the executor heartbeat interval;
+ if set, polling is done at this interval.
+ </td>
+ <td>4.1.0</td>
+</tr>
<tr>
<td><code>spark.rpc.io.backLog</code></td>
<td>64</td>
diff --git a/docs/monitoring.md b/docs/monitoring.md
index 49d04b328f29..e75f83110d19 100644
--- a/docs/monitoring.md
+++ b/docs/monitoring.md
@@ -401,6 +401,14 @@ Security options for the Spark History Server are covered in more detail in the
</td>
<td>3.0.0</td>
</tr>
+ <tr>
+ <td>spark.history.fs.eventLog.rolling.onDemandLoadEnabled</td>
+ <td>true</td>
+ <td>
+ Whether to look up rolling event log locations in an on-demand manner before listing files.
+ </td>
+ <td>4.1.0</td>
+ </tr>
<tr>
<td>spark.history.store.hybridStore.enabled</td>
<td>false</td>
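For reference only (this text is not part of the committed diff): a hedged sketch of how several of the other application-level settings documented above might be supplied programmatically. The keys are taken from the tables above; every value is an illustrative assumption, not a recommendation.

```scala
import org.apache.spark.SparkConf

// Sketch only: keys come from the documentation added above, values are placeholders.
val conf = new SparkConf()
  // Python worker lifecycle tuning (spark.python.* entries above).
  .set("spark.python.factory.idleWorkerMaxPoolSize", "8")        // cap the idle worker pool
  .set("spark.python.worker.killOnIdleTimeout", "true")          // kill workers that hit the idle timeout
  .set("spark.python.worker.tracebackDumpIntervalSeconds", "60") // periodic traceback dumps to stderr
  // Event log filtering; the event name here is only an example.
  .set("spark.eventLog.excludedPatterns", "SparkListenerBlockUpdated")
  // Zstd strategy: higher values trade CPU for compression ratio; "5" is arbitrary.
  .set("spark.io.compression.zstd.strategy", "5")
  // Poll unmanaged memory users (e.g. RocksDB state stores) every 10 seconds.
  .set("spark.memory.unmanagedMemoryPollingInterval", "10s")
  // Collect driver metrics at a fixed interval instead of on executor heartbeats.
  .set("spark.driver.metrics.pollingInterval", "30s")
```

The History Server key added in monitoring.md, `spark.history.fs.eventLog.rolling.onDemandLoadEnabled`, would normally go into the History Server's own configuration (e.g. the `spark-defaults.conf` it reads) rather than into an application's `SparkConf`.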
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]