This is an automated email from the ASF dual-hosted git repository.
dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new e6a76dfc0f00 [SPARK-53896][CORE] Enable `spark.io.compression.lzf.parallel.enabled` by default
e6a76dfc0f00 is described below
commit e6a76dfc0f00cb2be3e5a50a15682c9f2a863067
Author: Dongjoon Hyun <[email protected]>
AuthorDate: Mon Oct 13 23:19:37 2025 -0700
[SPARK-53896][CORE] Enable `spark.io.compression.lzf.parallel.enabled` by default
### What changes were proposed in this pull request?
This PR aims to enable `spark.io.compression.lzf.parallel.enabled` by default in Apache Spark 4.1.0.
### Why are the changes needed?
`spark.io.compression.lzf.parallel.enabled` was introduced in Apache Spark 4.0.0 and has been used stably since then, so we can now enable it by default.
- https://github.com/apache/spark/pull/46858
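For illustration, here is a minimal sketch of the two library-level paths this flag chooses between, using the `compress-lzf` library (`com.ning.compress`) that backs Spark's LZF codec. This is a hypothetical standalone demo rather than Spark's actual codec wiring; the payload and class usage are illustrative and assume `com.ning.compress:compress-lzf` is on the classpath.

```scala
import java.io.ByteArrayOutputStream

import com.ning.compress.lzf.LZFOutputStream
import com.ning.compress.lzf.parallel.PLZFOutputStream

object LzfParallelDemo {
  def main(args: Array[String]): Unit = {
    // Illustrative payload: 4 MiB of mildly repetitive bytes.
    val data = Array.tabulate[Byte](4 * 1024 * 1024)(i => (i % 64).toByte)

    // Single-threaded LZF: the path taken when the flag is false.
    val serialOut = new ByteArrayOutputStream()
    val serial = new LZFOutputStream(serialOut)
    serial.write(data)
    serial.close()

    // Parallel LZF: the path taken when the flag is true. The stream
    // chunks the input and compresses chunks on a background thread pool.
    val parallelOut = new ByteArrayOutputStream()
    val parallel = new PLZFOutputStream(parallelOut)
    parallel.write(data)
    parallel.close()

    println(s"serial: ${serialOut.size()} bytes, parallel: ${parallelOut.size()} bytes")
  }
}
```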
### Does this PR introduce _any_ user-facing change?
Yes, for `LZF` users. The migration guide is updated accordingly.
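For `LZF` users who want to keep the pre-4.1 behavior, a minimal opt-out sketch (the app name and local master below are placeholder values):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Placeholder app name and master; only the two compression settings matter here.
val conf = new SparkConf()
  .setAppName("lzf-parallel-opt-out")
  .setMaster("local[*]")
  .set("spark.io.compression.codec", "lzf")                  // the flag only matters for LZF
  .set("spark.io.compression.lzf.parallel.enabled", "false") // restore the pre-4.1 behavior
val sc = new SparkContext(conf)
```

The same setting can also be passed at submit time, e.g. `--conf spark.io.compression.lzf.parallel.enabled=false`.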
### How was this patch tested?
Pass the CIs.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #52603 from dongjoon-hyun/SPARK-53896.
Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
---
core/src/main/scala/org/apache/spark/internal/config/package.scala | 2 +-
docs/configuration.md | 2 +-
docs/core-migration-guide.md | 1 +
3 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/core/src/main/scala/org/apache/spark/internal/config/package.scala b/core/src/main/scala/org/apache/spark/internal/config/package.scala
index d413d06ffc94..94fe31e1cd8c 100644
--- a/core/src/main/scala/org/apache/spark/internal/config/package.scala
+++ b/core/src/main/scala/org/apache/spark/internal/config/package.scala
@@ -2137,7 +2137,7 @@ package object config {
.doc("When true, LZF compression will use multiple threads to compress
data in parallel.")
.version("4.0.0")
.booleanConf
- .createWithDefault(false)
+ .createWithDefault(true)
private[spark] val IO_WARNING_LARGEFILETHRESHOLD =
ConfigBuilder("spark.io.warning.largeFileThreshold")
diff --git a/docs/configuration.md b/docs/configuration.md
index 573b485f7e2d..b999a6ee2577 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -1918,7 +1918,7 @@ Apart from these, the following properties are also available, and may be useful
</tr>
<tr>
<td><code>spark.io.compression.lzf.parallel.enabled</code></td>
- <td>false</td>
+ <td>true</td>
<td>
When true, LZF compression will use multiple threads to compress data in parallel.
</td>
diff --git a/docs/core-migration-guide.md b/docs/core-migration-guide.md
index a738363ace1d..19b77624d626 100644
--- a/docs/core-migration-guide.md
+++ b/docs/core-migration-guide.md
@@ -29,6 +29,7 @@ license: |
- Since Spark 4.1, Spark uses Apache Hadoop Magic Committer for all S3 buckets by default. To restore the behavior before Spark 4.1, you can set `spark.hadoop.fs.s3a.committer.magic.enabled=false`.
- Since Spark 4.1, `java.lang.InternalError` encountered during file reading will no longer fail the task if the configuration `spark.sql.files.ignoreCorruptFiles` or the data source option `ignoreCorruptFiles` is set to `true`.
- Since Spark 4.1, Spark ignores `*.blacklist.*` alternative configuration names. To restore the behavior before Spark 4.1, you can use the corresponding configuration names instead, which have existed since Spark 3.1.0.
+- Since Spark 4.1, Spark will use multiple threads for LZF compression to compress data in parallel. To restore the behavior before Spark 4.1, you can set `spark.io.compression.lzf.parallel.enabled` to `false`.
## Upgrading from Core 3.5 to 4.0
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]