This is an automated email from the ASF dual-hosted git repository.
wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new bccdf1ffd467 [SPARK-50483][SPARK-50545][DOC][FOLLOWUP] Mention behavior changes in migration guide
bccdf1ffd467 is described below
commit bccdf1ffd467cb60ca6e100c20a1b659102eb304
Author: Cheng Pan <[email protected]>
AuthorDate: Fri Dec 20 23:23:33 2024 +0800
[SPARK-50483][SPARK-50545][DOC][FOLLOWUP] Mention behavior changes in migration guide
### What changes were proposed in this pull request?
Update migration guide for SPARK-50483 and SPARK-50545
### Why are the changes needed?
Mention behavior changes in migration guide
### Does this PR introduce _any_ user-facing change?
Yes, docs are updated.
### How was this patch tested?
Review.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #49252 from pan3793/SPARK-50483-SPARK-50545-followup.
Authored-by: Cheng Pan <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
---
docs/core-migration-guide.md | 6 ++++++
docs/sql-migration-guide.md | 5 +++++
2 files changed, 11 insertions(+)
diff --git a/docs/core-migration-guide.md b/docs/core-migration-guide.md
index 958e442545dc..49737392312a 100644
--- a/docs/core-migration-guide.md
+++ b/docs/core-migration-guide.md
@@ -54,6 +54,12 @@ license: |
- Since Spark 4.0, `spark.shuffle.unsafe.file.output.buffer` is deprecated though it still works. Use `spark.shuffle.localDisk.file.output.buffer` instead.
+- Since Spark 4.0, when a file read hits `org.apache.hadoop.security.AccessControlException` or `org.apache.hadoop.hdfs.BlockMissingException`, the exception will be thrown and will fail the task, even if `spark.files.ignoreCorruptFiles` is set to `true`.
+
+## Upgrading from Core 3.5.3 to 3.5.4
+
+- Since Spark 3.5.4, when a file read hits `org.apache.hadoop.security.AccessControlException` or `org.apache.hadoop.hdfs.BlockMissingException`, the exception will be thrown and will fail the task, even if `spark.files.ignoreCorruptFiles` is set to `true`.
+
## Upgrading from Core 3.4 to 3.5
- Since Spark 3.5, `spark.yarn.executor.failuresValidityInterval` is deprecated. Use `spark.executor.failuresValidityInterval` instead.
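The entries in this hunk both narrow what `spark.files.ignoreCorruptFiles` covers. A minimal `spark-defaults.conf` sketch of the affected properties (property names come from the guide above; the values shown are illustrative, not recommendations):

```
# Since Spark 4.0 / 3.5.4 this only skips genuinely corrupt files;
# AccessControlException and BlockMissingException now always fail the task,
# regardless of this setting.
spark.files.ignoreCorruptFiles              true

# Deprecated since 4.0 in favor of the property below (first entry in this hunk):
# spark.shuffle.unsafe.file.output.buffer   32k
spark.shuffle.localDisk.file.output.buffer  32k
```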
diff --git a/docs/sql-migration-guide.md b/docs/sql-migration-guide.md
index 717d27befef0..254c54a414a7 100644
--- a/docs/sql-migration-guide.md
+++ b/docs/sql-migration-guide.md
@@ -29,6 +29,7 @@ license: |
- Since Spark 4.0, the default behaviour when inserting elements in a map is changed to first normalize keys `-0.0` to `0.0`. The affected SQL functions are `create_map`, `map_from_arrays`, `map_from_entries`, and `map_concat`. To restore the previous behaviour, set `spark.sql.legacy.disableMapKeyNormalization` to `true`.
- Since Spark 4.0, the default value of `spark.sql.maxSinglePartitionBytes` is changed from `Long.MaxValue` to `128m`. To restore the previous behavior, set `spark.sql.maxSinglePartitionBytes` to `9223372036854775807` (`Long.MaxValue`).
- Since Spark 4.0, any read of SQL tables takes into consideration the SQL configs `spark.sql.files.ignoreCorruptFiles`/`spark.sql.files.ignoreMissingFiles` instead of the core configs `spark.files.ignoreCorruptFiles`/`spark.files.ignoreMissingFiles`.
+- Since Spark 4.0, when a SQL table read hits `org.apache.hadoop.security.AccessControlException` or `org.apache.hadoop.hdfs.BlockMissingException`, the exception will be thrown and will fail the task, even if `spark.sql.files.ignoreCorruptFiles` is set to `true`.
- Since Spark 4.0, `spark.sql.hive.metastore` drops support for Hive versions prior to 2.0.0, as they require JDK 8, which Spark no longer supports. Users should migrate to higher versions.
- Since Spark 4.0, `spark.sql.parquet.compression.codec` drops support for the codec name `lz4raw`; use `lz4_raw` instead.
- Since Spark 4.0, when overflowing during casting a timestamp to byte/short/int under non-ANSI mode, Spark returns null instead of a wrapped value.
@@ -63,6 +64,10 @@ license: |
- Since Spark 4.0, the Storage-Partitioned Join feature flag `spark.sql.sources.v2.bucketing.pushPartValues.enabled` is set to `true` by default. To restore the previous behavior, set `spark.sql.sources.v2.bucketing.pushPartValues.enabled` to `false`.
- Since Spark 4.0, the `sentences` function uses `Locale(language)` instead of `Locale.US` when the `language` parameter is not `NULL` and the `country` parameter is `NULL`.
+## Upgrading from Spark SQL 3.5.3 to 3.5.4
+
+- Since Spark 3.5.4, when a SQL table read hits `org.apache.hadoop.security.AccessControlException` or `org.apache.hadoop.hdfs.BlockMissingException`, the exception will be thrown and will fail the task, even if `spark.sql.files.ignoreCorruptFiles` is set to `true`.
+
## Upgrading from Spark SQL 3.5.1 to 3.5.2
- Since 3.5.2, the MySQL JDBC datasource reads TINYINT UNSIGNED as ShortType; in 3.5.1, it was wrongly read as ByteType.
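The map-key normalization entry in the sql-migration-guide hunk can be motivated outside of Spark: in IEEE 754 arithmetic, `-0.0` compares and hashes equal to `0.0`, so unnormalized map keys behave surprisingly. A plain-Python sketch of the effect (ordinary `float` and `dict`, not Spark APIs):

```python
# IEEE 754 negative zero is equal to positive zero...
assert -0.0 == 0.0
assert hash(-0.0) == hash(0.0)

# ...so a hash map collapses both spellings onto a single key:
m = {0.0: "a"}
m[-0.0] = "b"          # overwrites the existing 0.0 entry
assert len(m) == 1
assert m[0.0] == "b"

# Yet the two zeros are still distinguishable when printed, which is why
# an engine must pick one canonical form for map keys:
assert str(-0.0) == "-0.0" and str(0.0) == "0.0"
```

Per the diff above, Spark 4.0 makes `0.0` the canonical key form in `create_map` and related functions, and `spark.sql.legacy.disableMapKeyNormalization` set to `true` restores the old behavior.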
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]