(spark) branch branch-3.3 updated: [MINOR][DOCS] Fix the example value in the docs
This is an automated email from the ASF dual-hosted git repository.

yao pushed a commit to branch branch-3.3
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.3 by this push:
     new 662edf53d39 [MINOR][DOCS] Fix the example value in the docs
662edf53d39 is described below

commit 662edf53d394541e9bfd6153576ceed0fed50cfa
Author: longfei.jiang
AuthorDate: Sat Nov 11 13:49:18 2023 +0800

    [MINOR][DOCS] Fix the example value in the docs

    ### What changes were proposed in this pull request?
    fix the example value

    ### Why are the changes needed?
    for doc

    ### Does this PR introduce _any_ user-facing change?
    Yes

    ### How was this patch tested?
    Just example value in the docs, no need to test.

    ### Was this patch authored or co-authored using generative AI tooling?
    No

    Closes #43750 from jlfsdtc/fix_typo_in_doc.

    Authored-by: longfei.jiang
    Signed-off-by: Kent Yao
    (cherry picked from commit b501a223bfcf4ddbcb0b2447aa06c549051630b0)
    Signed-off-by: Kent Yao
---
 docs/sql-ref-datetime-pattern.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/sql-ref-datetime-pattern.md b/docs/sql-ref-datetime-pattern.md
index 4b02cdad361..6a4a1b67348 100644
--- a/docs/sql-ref-datetime-pattern.md
+++ b/docs/sql-ref-datetime-pattern.md
@@ -41,7 +41,7 @@ Spark uses pattern letters in the following table for date and timestamp parsing
 |**a**|am-pm-of-day|am-pm|PM|
 |**h**|clock-hour-of-am-pm (1-12)|number(2)|12|
 |**K**|hour-of-am-pm (0-11)|number(2)|0|
-|**k**|clock-hour-of-day (1-24)|number(2)|0|
+|**k**|clock-hour-of-day (1-24)|number(2)|1|
 |**H**|hour-of-day (0-23)|number(2)|0|
 |**m**|minute-of-hour|number(2)|30|
 |**s**|second-of-minute|number(2)|55|

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
(spark) branch branch-3.4 updated: [MINOR][DOCS] Fix the example value in the docs
yao pushed a commit to branch branch-3.4
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.4 by this push:
     new 92bea64b507 [MINOR][DOCS] Fix the example value in the docs
92bea64b507 is described below

commit 92bea64b507f2801759d52ade4cdbf6c930124c5
Author: longfei.jiang
AuthorDate: Sat Nov 11 13:49:18 2023 +0800

    [MINOR][DOCS] Fix the example value in the docs

    ### What changes were proposed in this pull request?
    fix the example value

    ### Why are the changes needed?
    for doc

    ### Does this PR introduce _any_ user-facing change?
    Yes

    ### How was this patch tested?
    Just example value in the docs, no need to test.

    ### Was this patch authored or co-authored using generative AI tooling?
    No

    Closes #43750 from jlfsdtc/fix_typo_in_doc.

    Authored-by: longfei.jiang
    Signed-off-by: Kent Yao
    (cherry picked from commit b501a223bfcf4ddbcb0b2447aa06c549051630b0)
    Signed-off-by: Kent Yao
---
 docs/sql-ref-datetime-pattern.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/sql-ref-datetime-pattern.md b/docs/sql-ref-datetime-pattern.md
index 5e28a18acef..e5d5388f262 100644
--- a/docs/sql-ref-datetime-pattern.md
+++ b/docs/sql-ref-datetime-pattern.md
@@ -41,7 +41,7 @@ Spark uses pattern letters in the following table for date and timestamp parsing
 |**a**|am-pm-of-day|am-pm|PM|
 |**h**|clock-hour-of-am-pm (1-12)|number(2)|12|
 |**K**|hour-of-am-pm (0-11)|number(2)|0|
-|**k**|clock-hour-of-day (1-24)|number(2)|0|
+|**k**|clock-hour-of-day (1-24)|number(2)|1|
 |**H**|hour-of-day (0-23)|number(2)|0|
 |**m**|minute-of-hour|number(2)|30|
 |**s**|second-of-minute|number(2)|55|
(spark) branch branch-3.5 updated: [MINOR][DOCS] Fix the example value in the docs
yao pushed a commit to branch branch-3.5
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.5 by this push:
     new 19d225bf3f5 [MINOR][DOCS] Fix the example value in the docs
19d225bf3f5 is described below

commit 19d225bf3f56d392ebb4e7727bd30109b1b75bf5
Author: longfei.jiang
AuthorDate: Sat Nov 11 13:49:18 2023 +0800

    [MINOR][DOCS] Fix the example value in the docs

    ### What changes were proposed in this pull request?
    fix the example value

    ### Why are the changes needed?
    for doc

    ### Does this PR introduce _any_ user-facing change?
    Yes

    ### How was this patch tested?
    Just example value in the docs, no need to test.

    ### Was this patch authored or co-authored using generative AI tooling?
    No

    Closes #43750 from jlfsdtc/fix_typo_in_doc.

    Authored-by: longfei.jiang
    Signed-off-by: Kent Yao
    (cherry picked from commit b501a223bfcf4ddbcb0b2447aa06c549051630b0)
    Signed-off-by: Kent Yao
---
 docs/sql-ref-datetime-pattern.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/sql-ref-datetime-pattern.md b/docs/sql-ref-datetime-pattern.md
index 5e28a18acef..e5d5388f262 100644
--- a/docs/sql-ref-datetime-pattern.md
+++ b/docs/sql-ref-datetime-pattern.md
@@ -41,7 +41,7 @@ Spark uses pattern letters in the following table for date and timestamp parsing
 |**a**|am-pm-of-day|am-pm|PM|
 |**h**|clock-hour-of-am-pm (1-12)|number(2)|12|
 |**K**|hour-of-am-pm (0-11)|number(2)|0|
-|**k**|clock-hour-of-day (1-24)|number(2)|0|
+|**k**|clock-hour-of-day (1-24)|number(2)|1|
 |**H**|hour-of-day (0-23)|number(2)|0|
 |**m**|minute-of-hour|number(2)|30|
 |**s**|second-of-minute|number(2)|55|
(spark) branch master updated (0a791993be7 -> b501a223bfc)
yao pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

    from 0a791993be7 [SPARK-45686][INFRA][CORE][SQL][SS][CONNECT][MLLIB][DSTREAM][AVRO][ML][K8S][YARN][PYTHON][R][UI][GRAPHX][PROTOBUF][TESTS][EXAMPLES] Explicitly convert `Array` to `Seq` when function input is defined as `Seq` to avoid compilation warnings related to `class LowPriorityImplicits2 is deprecated`
     add b501a223bfc [MINOR][DOCS] Fix the example value in the docs

No new revisions were added by this update.

Summary of changes:
 docs/sql-ref-datetime-pattern.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
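The corrected row makes sense once the four hour-related pattern letters are lined up: `k` (clock-hour-of-day) counts 1-24, so its example value can never be 0 — hence the change from 0 to 1. Spark's datetime patterns follow java.time semantics; the sketch below is a plain-Python illustration of the field definitions, not Spark's formatter:

```python
def hour_fields(hour_of_day: int) -> dict:
    """Map an hour-of-day (0-23) to the four hour-related pattern letters.

    H: hour-of-day (0-23)           K: hour-of-am-pm (0-11)
    k: clock-hour-of-day (1-24)     h: clock-hour-of-am-pm (1-12)
    """
    assert 0 <= hour_of_day <= 23
    return {
        "H": hour_of_day,
        "k": hour_of_day if hour_of_day != 0 else 24,  # never 0 -- the doc fix
        "K": hour_of_day % 12,
        "h": hour_of_day % 12 if hour_of_day % 12 != 0 else 12,
    }

print(hour_fields(0))   # midnight: H=0, k=24, K=0, h=12
print(hour_fields(13))  # 1 PM:     H=13, k=13, K=1, h=1
```

Midnight is the edge case: `H` prints 0 while `k` prints 24, which is why the old example value 0 for `k` was impossible.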
(spark) branch branch-3.4 updated: [SPARK-45884][BUILD][3.4] Upgrade ORC to 1.8.6
dongjoon pushed a commit to branch branch-3.4
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.4 by this push:
     new 3978bf4528c [SPARK-45884][BUILD][3.4] Upgrade ORC to 1.8.6
3978bf4528c is described below

commit 3978bf4528c6ae58944d9fb3f8776ab570eeb7c8
Author: Dongjoon Hyun
AuthorDate: Fri Nov 10 08:56:02 2023 -0800

    [SPARK-45884][BUILD][3.4] Upgrade ORC to 1.8.6

    ### What changes were proposed in this pull request?
    This PR aims to upgrade ORC to 1.8.6 for Apache Spark 3.4.2.

    ### Why are the changes needed?
    To bring the latest maintenance releases as a part of Apache Spark 3.4.2 release
    - https://github.com/apache/orc/releases/tag/v1.8.6

    ### Does this PR introduce _any_ user-facing change?
    No.

    ### How was this patch tested?
    Pass the CIs.

    ### Was this patch authored or co-authored using generative AI tooling?
    No.

    Closes #43755 from dongjoon-hyun/SPARK-45884.

    Authored-by: Dongjoon Hyun
    Signed-off-by: Dongjoon Hyun
---
 dev/deps/spark-deps-hadoop-2-hive-2.3 | 6 +++---
 dev/deps/spark-deps-hadoop-3-hive-2.3 | 6 +++---
 pom.xml                               | 2 +-
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/dev/deps/spark-deps-hadoop-2-hive-2.3 b/dev/deps/spark-deps-hadoop-2-hive-2.3
index c562b0b7e16..691c83632b3 100644
--- a/dev/deps/spark-deps-hadoop-2-hive-2.3
+++ b/dev/deps/spark-deps-hadoop-2-hive-2.3
@@ -222,9 +222,9 @@ objenesis/3.2//objenesis-3.2.jar
 okhttp/3.12.12//okhttp-3.12.12.jar
 okio/1.15.0//okio-1.15.0.jar
 opencsv/2.3//opencsv-2.3.jar
-orc-core/1.8.5/shaded-protobuf/orc-core-1.8.5-shaded-protobuf.jar
-orc-mapreduce/1.8.5/shaded-protobuf/orc-mapreduce-1.8.5-shaded-protobuf.jar
-orc-shims/1.8.5//orc-shims-1.8.5.jar
+orc-core/1.8.6/shaded-protobuf/orc-core-1.8.6-shaded-protobuf.jar
+orc-mapreduce/1.8.6/shaded-protobuf/orc-mapreduce-1.8.6-shaded-protobuf.jar
+orc-shims/1.8.6//orc-shims-1.8.6.jar
 oro/2.0.8//oro-2.0.8.jar
 osgi-resource-locator/1.0.3//osgi-resource-locator-1.0.3.jar
 paranamer/2.8//paranamer-2.8.jar
diff --git a/dev/deps/spark-deps-hadoop-3-hive-2.3 b/dev/deps/spark-deps-hadoop-3-hive-2.3
index bcfc8c92b10..4d94cb5c699 100644
--- a/dev/deps/spark-deps-hadoop-3-hive-2.3
+++ b/dev/deps/spark-deps-hadoop-3-hive-2.3
@@ -209,9 +209,9 @@ opencsv/2.3//opencsv-2.3.jar
 opentracing-api/0.33.0//opentracing-api-0.33.0.jar
 opentracing-noop/0.33.0//opentracing-noop-0.33.0.jar
 opentracing-util/0.33.0//opentracing-util-0.33.0.jar
-orc-core/1.8.5/shaded-protobuf/orc-core-1.8.5-shaded-protobuf.jar
-orc-mapreduce/1.8.5/shaded-protobuf/orc-mapreduce-1.8.5-shaded-protobuf.jar
-orc-shims/1.8.5//orc-shims-1.8.5.jar
+orc-core/1.8.6/shaded-protobuf/orc-core-1.8.6-shaded-protobuf.jar
+orc-mapreduce/1.8.6/shaded-protobuf/orc-mapreduce-1.8.6-shaded-protobuf.jar
+orc-shims/1.8.6//orc-shims-1.8.6.jar
 oro/2.0.8//oro-2.0.8.jar
 osgi-resource-locator/1.0.3//osgi-resource-locator-1.0.3.jar
 paranamer/2.8//paranamer-2.8.jar
diff --git a/pom.xml b/pom.xml
index 6fa84df3d25..706a11d43b3 100644
--- a/pom.xml
+++ b/pom.xml
@@ -141,7 +141,7 @@
     10.14.2.0
     1.12.3
-    1.8.5
+    1.8.6
     shaded-protobuf
     9.4.50.v20221201
     4.0.3
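The `dev/deps/spark-deps-*` manifests touched above use a compact `artifact/version/classifier/filename` layout, where an empty classifier leaves the double slash seen in the `orc-shims` entries. A hedged sketch of a parser for that layout — the field meanings are inferred from the entries in the diff, not from a Spark specification:

```python
def parse_dep(line: str) -> dict:
    """Parse one entry of a spark-deps manifest line.

    Assumed format (from the diff): artifact/version/classifier/filename,
    with an empty classifier shown as two adjacent slashes.
    """
    artifact, version, classifier, filename = line.strip().split("/")
    return {
        "artifact": artifact,
        "version": version,
        "classifier": classifier or None,  # "" -> no classifier
        "filename": filename,
    }

print(parse_dep("orc-core/1.8.6/shaded-protobuf/orc-core-1.8.6-shaded-protobuf.jar"))
print(parse_dep("orc-shims/1.8.6//orc-shims-1.8.6.jar"))
```

A check like this makes it easy to verify that every `orc-*` entry in the manifest agrees with the single version property bumped in `pom.xml`.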
(spark) branch branch-3.3 updated: [SPARK-45885][BUILD][3.3] Upgrade ORC to 1.7.10
dongjoon pushed a commit to branch branch-3.3
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.3 by this push:
     new 6780c7857dc [SPARK-45885][BUILD][3.3] Upgrade ORC to 1.7.10
6780c7857dc is described below

commit 6780c7857dc9a0333fc12e8a25acff84bfacf2af
Author: Dongjoon Hyun
AuthorDate: Fri Nov 10 08:54:38 2023 -0800

    [SPARK-45885][BUILD][3.3] Upgrade ORC to 1.7.10

    ### What changes were proposed in this pull request?
    This PR aims to upgrade ORC to 1.7.10 for Apache Spark 3.3.4.

    ### Why are the changes needed?
    To bring the latest bug fixes.
    - https://github.com/apache/orc/releases/tag/v1.7.9
    - https://github.com/apache/orc/releases/tag/v1.7.10

    ### Does this PR introduce _any_ user-facing change?
    No.

    ### How was this patch tested?
    Pass the CIs.

    ### Was this patch authored or co-authored using generative AI tooling?
    No.

    Closes #43756 from dongjoon-hyun/SPARK-45885.

    Authored-by: Dongjoon Hyun
    Signed-off-by: Dongjoon Hyun
---
 dev/deps/spark-deps-hadoop-2-hive-2.3 | 6 +++---
 dev/deps/spark-deps-hadoop-3-hive-2.3 | 6 +++---
 pom.xml                               | 2 +-
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/dev/deps/spark-deps-hadoop-2-hive-2.3 b/dev/deps/spark-deps-hadoop-2-hive-2.3
index 16ac8dc27c5..e5523d3350b 100644
--- a/dev/deps/spark-deps-hadoop-2-hive-2.3
+++ b/dev/deps/spark-deps-hadoop-2-hive-2.3
@@ -219,9 +219,9 @@ objenesis/3.2//objenesis-3.2.jar
 okhttp/3.12.12//okhttp-3.12.12.jar
 okio/1.14.0//okio-1.14.0.jar
 opencsv/2.3//opencsv-2.3.jar
-orc-core/1.7.8//orc-core-1.7.8.jar
-orc-mapreduce/1.7.8//orc-mapreduce-1.7.8.jar
-orc-shims/1.7.8//orc-shims-1.7.8.jar
+orc-core/1.7.10//orc-core-1.7.10.jar
+orc-mapreduce/1.7.10//orc-mapreduce-1.7.10.jar
+orc-shims/1.7.10//orc-shims-1.7.10.jar
 oro/2.0.8//oro-2.0.8.jar
 osgi-resource-locator/1.0.3//osgi-resource-locator-1.0.3.jar
 paranamer/2.8//paranamer-2.8.jar
diff --git a/dev/deps/spark-deps-hadoop-3-hive-2.3 b/dev/deps/spark-deps-hadoop-3-hive-2.3
index 7c2dfb3faa0..86145c3051d 100644
--- a/dev/deps/spark-deps-hadoop-3-hive-2.3
+++ b/dev/deps/spark-deps-hadoop-3-hive-2.3
@@ -208,9 +208,9 @@ opencsv/2.3//opencsv-2.3.jar
 opentracing-api/0.33.0//opentracing-api-0.33.0.jar
 opentracing-noop/0.33.0//opentracing-noop-0.33.0.jar
 opentracing-util/0.33.0//opentracing-util-0.33.0.jar
-orc-core/1.7.8//orc-core-1.7.8.jar
-orc-mapreduce/1.7.8//orc-mapreduce-1.7.8.jar
-orc-shims/1.7.8//orc-shims-1.7.8.jar
+orc-core/1.7.10//orc-core-1.7.10.jar
+orc-mapreduce/1.7.10//orc-mapreduce-1.7.10.jar
+orc-shims/1.7.10//orc-shims-1.7.10.jar
 oro/2.0.8//oro-2.0.8.jar
 osgi-resource-locator/1.0.3//osgi-resource-locator-1.0.3.jar
 paranamer/2.8//paranamer-2.8.jar
diff --git a/pom.xml b/pom.xml
index 4aab8b5544e..b70ae68b1e4 100644
--- a/pom.xml
+++ b/pom.xml
@@ -133,7 +133,7 @@
     10.14.2.0
     1.12.2
-    1.7.8
+    1.7.10
     9.4.48.v20220622
     4.0.3
     0.10.0
(spark) branch branch-3.5 updated: [SPARK-45883][BUILD] Upgrade ORC to 1.9.2
dongjoon pushed a commit to branch branch-3.5
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.5 by this push:
     new 68b531dd2b4 [SPARK-45883][BUILD] Upgrade ORC to 1.9.2
68b531dd2b4 is described below

commit 68b531dd2b485fa2203d6a2bd2de90afc97a13bb
Author: Dongjoon Hyun
AuthorDate: Fri Nov 10 07:50:17 2023 -0800

    [SPARK-45883][BUILD] Upgrade ORC to 1.9.2

    ### What changes were proposed in this pull request?
    This PR aims to upgrade ORC to 1.9.2 for Apache Spark 4.0.0 and 3.5.1.

    ### Why are the changes needed?
    To bring the latest bug fixes.
    - https://github.com/apache/orc/releases/tag/v1.9.2

    ### Does this PR introduce _any_ user-facing change?
    No.

    ### How was this patch tested?
    Pass the CIs.

    ### Was this patch authored or co-authored using generative AI tooling?
    No.

    Closes #43754 from dongjoon-hyun/SPARK-45883.

    Authored-by: Dongjoon Hyun
    Signed-off-by: Dongjoon Hyun
    (cherry picked from commit 917947e62e1e67f49a83c1ffb0833b61f0c48eb6)
    Signed-off-by: Dongjoon Hyun
---
 dev/deps/spark-deps-hadoop-3-hive-2.3 | 6 +++---
 pom.xml                               | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/dev/deps/spark-deps-hadoop-3-hive-2.3 b/dev/deps/spark-deps-hadoop-3-hive-2.3
index 1d02f8dba56..9ab51dfa011 100644
--- a/dev/deps/spark-deps-hadoop-3-hive-2.3
+++ b/dev/deps/spark-deps-hadoop-3-hive-2.3
@@ -212,9 +212,9 @@ opencsv/2.3//opencsv-2.3.jar
 opentracing-api/0.33.0//opentracing-api-0.33.0.jar
 opentracing-noop/0.33.0//opentracing-noop-0.33.0.jar
 opentracing-util/0.33.0//opentracing-util-0.33.0.jar
-orc-core/1.9.1/shaded-protobuf/orc-core-1.9.1-shaded-protobuf.jar
-orc-mapreduce/1.9.1/shaded-protobuf/orc-mapreduce-1.9.1-shaded-protobuf.jar
-orc-shims/1.9.1//orc-shims-1.9.1.jar
+orc-core/1.9.2/shaded-protobuf/orc-core-1.9.2-shaded-protobuf.jar
+orc-mapreduce/1.9.2/shaded-protobuf/orc-mapreduce-1.9.2-shaded-protobuf.jar
+orc-shims/1.9.2//orc-shims-1.9.2.jar
 oro/2.0.8//oro-2.0.8.jar
 osgi-resource-locator/1.0.3//osgi-resource-locator-1.0.3.jar
 paranamer/2.8//paranamer-2.8.jar
diff --git a/pom.xml b/pom.xml
index be8400c33bf..14e0ab3e0f6 100644
--- a/pom.xml
+++ b/pom.xml
@@ -141,7 +141,7 @@
     10.14.2.0
     1.13.1
-    1.9.1
+    1.9.2
     shaded-protobuf
     9.4.52.v20230823
     4.0.3
(spark) branch master updated: [SPARK-45883][BUILD] Upgrade ORC to 1.9.2
dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 917947e62e1 [SPARK-45883][BUILD] Upgrade ORC to 1.9.2
917947e62e1 is described below

commit 917947e62e1e67f49a83c1ffb0833b61f0c48eb6
Author: Dongjoon Hyun
AuthorDate: Fri Nov 10 07:50:17 2023 -0800

    [SPARK-45883][BUILD] Upgrade ORC to 1.9.2

    ### What changes were proposed in this pull request?
    This PR aims to upgrade ORC to 1.9.2 for Apache Spark 4.0.0 and 3.5.1.

    ### Why are the changes needed?
    To bring the latest bug fixes.
    - https://github.com/apache/orc/releases/tag/v1.9.2

    ### Does this PR introduce _any_ user-facing change?
    No.

    ### How was this patch tested?
    Pass the CIs.

    ### Was this patch authored or co-authored using generative AI tooling?
    No.

    Closes #43754 from dongjoon-hyun/SPARK-45883.

    Authored-by: Dongjoon Hyun
    Signed-off-by: Dongjoon Hyun
---
 dev/deps/spark-deps-hadoop-3-hive-2.3 | 6 +++---
 pom.xml                               | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/dev/deps/spark-deps-hadoop-3-hive-2.3 b/dev/deps/spark-deps-hadoop-3-hive-2.3
index b7d6bdbfd12..0a952aa6ee8 100644
--- a/dev/deps/spark-deps-hadoop-3-hive-2.3
+++ b/dev/deps/spark-deps-hadoop-3-hive-2.3
@@ -218,9 +218,9 @@ opencsv/2.3//opencsv-2.3.jar
 opentracing-api/0.33.0//opentracing-api-0.33.0.jar
 opentracing-noop/0.33.0//opentracing-noop-0.33.0.jar
 opentracing-util/0.33.0//opentracing-util-0.33.0.jar
-orc-core/1.9.1/shaded-protobuf/orc-core-1.9.1-shaded-protobuf.jar
-orc-mapreduce/1.9.1/shaded-protobuf/orc-mapreduce-1.9.1-shaded-protobuf.jar
-orc-shims/1.9.1//orc-shims-1.9.1.jar
+orc-core/1.9.2/shaded-protobuf/orc-core-1.9.2-shaded-protobuf.jar
+orc-mapreduce/1.9.2/shaded-protobuf/orc-mapreduce-1.9.2-shaded-protobuf.jar
+orc-shims/1.9.2//orc-shims-1.9.2.jar
 oro/2.0.8//oro-2.0.8.jar
 osgi-resource-locator/1.0.3//osgi-resource-locator-1.0.3.jar
 paranamer/2.8//paranamer-2.8.jar
diff --git a/pom.xml b/pom.xml
index 71d8ffcc1c9..14754c0bcaa 100644
--- a/pom.xml
+++ b/pom.xml
@@ -141,7 +141,7 @@
     10.14.2.0
     1.13.1
-    1.9.1
+    1.9.2
     shaded-protobuf
     9.4.53.v20231009
     4.0.3
(spark) branch master updated: [SPARK-45687][CORE][SQL][ML][MLLIB][KUBERNETES][EXAMPLES][CONNECT][STRUCTURED STREAMING] Fix `Passing an explicit array value to a Scala varargs method is deprecated`
srowen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 605aa0c299c [SPARK-45687][CORE][SQL][ML][MLLIB][KUBERNETES][EXAMPLES][CONNECT][STRUCTURED STREAMING] Fix `Passing an explicit array value to a Scala varargs method is deprecated`
605aa0c299c is described below

commit 605aa0c299c1d88f8a31ba888ac8e6b6203be6c5
Author: Tengfei Huang
AuthorDate: Fri Nov 10 08:10:20 2023 -0600

    [SPARK-45687][CORE][SQL][ML][MLLIB][KUBERNETES][EXAMPLES][CONNECT][STRUCTURED STREAMING] Fix `Passing an explicit array value to a Scala varargs method is deprecated`

    ### What changes were proposed in this pull request?
    Fix the deprecated behavior below:
    `Passing an explicit array value to a Scala varargs method is deprecated (since 2.13.0) and will result in a defensive copy; Use the more efficient non-copying ArraySeq.unsafeWrapArray or an explicit toIndexedSeq call`

    For all the use cases, we don't need to make a copy of the array. Explicitly use `ArraySeq.unsafeWrapArray` to do the conversion.

    ### Why are the changes needed?
    Eliminate compile warnings and no longer use deprecated scala APIs.

    ### Does this PR introduce _any_ user-facing change?
    No

    ### How was this patch tested?
    Pass GA. Fixed all the warning with build:
    `mvn clean package -DskipTests -Pspark-ganglia-lgpl -Pkinesis-asl -Pdocker-integration-tests -Pyarn -Pkubernetes -Pkubernetes-integration-tests -Phive-thriftserver -Phadoop-cloud`

    ### Was this patch authored or co-authored using generative AI tooling?
    No.

    Closes #43642 from ivoson/SPARK-45687.

    Authored-by: Tengfei Huang
    Signed-off-by: Sean Owen
---
 .../scala/org/apache/spark/sql/KeyValueGroupedDataset.scala  | 9 ++---
 .../test/scala/org/apache/spark/sql/ColumnTestSuite.scala    | 3 ++-
 .../apache/spark/sql/UserDefinedFunctionE2ETestSuite.scala   | 5 -
 .../spark/sql/connect/planner/SparkConnectPlanner.scala      | 3 ++-
 .../main/scala/org/apache/spark/api/python/PythonRDD.scala   | 3 ++-
 core/src/main/scala/org/apache/spark/executor/Executor.scala | 3 ++-
 core/src/main/scala/org/apache/spark/rdd/RDD.scala           | 3 ++-
 .../scala/org/apache/spark/examples/graphx/Analytics.scala   | 4 ++--
 .../scala/org/apache/spark/ml/classification/OneVsRest.scala | 3 ++-
 .../scala/org/apache/spark/ml/feature/FeatureHasher.scala    | 4 +++-
 .../src/main/scala/org/apache/spark/ml/feature/Imputer.scala | 8 +---
 .../main/scala/org/apache/spark/ml/feature/Interaction.scala | 4 +++-
 .../main/scala/org/apache/spark/ml/feature/RFormula.scala    | 6 --
 .../scala/org/apache/spark/ml/feature/VectorAssembler.scala  | 5 +++--
 mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala  | 3 ++-
 .../src/main/scala/org/apache/spark/ml/fpm/PrefixSpan.scala  | 3 ++-
 .../src/main/scala/org/apache/spark/ml/r/KSTestWrapper.scala | 3 ++-
 .../apache/spark/ml/regression/DecisionTreeRegressor.scala   | 3 ++-
 .../src/main/scala/org/apache/spark/ml/tree/treeModels.scala | 3 ++-
 .../src/main/scala/org/apache/spark/mllib/util/MLUtils.scala | 12
 .../scala/org/apache/spark/ml/feature/ImputerSuite.scala     | 12
 .../apache/spark/ml/source/image/ImageFileFormatSuite.scala  | 3 ++-
 .../apache/spark/ml/stat/KolmogorovSmirnovTestSuite.scala    | 3 ++-
 mllib/src/test/scala/org/apache/spark/ml/util/MLTest.scala   | 6 --
 .../deploy/k8s/features/DriverCommandFeatureStepSuite.scala  | 2 +-
 .../apache/spark/sql/catalyst/expressions/generators.scala   | 8 ++--
 .../sql/catalyst/expressions/UnsafeRowConverterSuite.scala   | 4 +++-
 .../scala/org/apache/spark/sql/DataFrameStatFunctions.scala  | 3 ++-
 .../scala/org/apache/spark/sql/KeyValueGroupedDataset.scala  | 8 ++--
 .../spark/sql/execution/datasources/jdbc/JDBCRDD.scala       | 2 +-
 .../org/apache/spark/sql/execution/stat/StatFunctions.scala  | 3 ++-
 .../apache/spark/sql/execution/streaming/OffsetSeqLog.scala  | 3 ++-
 .../streaming/continuous/ContinuousRateStreamSource.scala    | 3 ++-
 .../src/test/scala/org/apache/spark/sql/DataFrameSuite.scala | 3 ++-
 .../src/test/scala/org/apache/spark/sql/DatasetSuite.scala   | 6 --
 .../src/test/scala/org/apache/spark/sql/GenTPCDSData.scala   | 3 ++-
 .../test/scala/org/apache/spark/sql/ParametersSuite.scala    | 9 +
 .../spark/sql/connector/SimpleWritableDataSource.scala       | 4 +++-
 .../sql/execution/datasources/FileMetadataStructSuite.scala  | 3 ++-
 .../spark/sql/execution/datasources/csv/CSVBenchmark.scala   | 7 ---
 .../scala/org/apache/spark/sql/streaming/StreamSuite.scala   | 2 +-
 .../org/apache/spark/sql/streaming/StreamingQuerySuite.scala | 3 ++-
 .../org/apache/spark/sql/hive/thriftserver/CliSuite.scala    | 3 ++-
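The deprecation this commit fixes exists because handing a raw array to a Scala varargs parameter now triggers a defensive copy, whereas `ArraySeq.unsafeWrapArray` wraps the array with no copy — cheaper, but later mutations of the backing array remain visible through the wrapper (which is why it is safe here only because the arrays are not mutated afterwards). The trade-off in a Python analogue (illustration only, not the Scala API):

```python
import copy

backing = [1, 2, 3]

# "unsafe wrap": keep a reference to the caller's array -- zero-copy,
# but mutations of the backing array show through.
wrapped = backing

# defensive copy: what the deprecated implicit conversion would do.
defensive = copy.copy(backing)

backing[0] = 99
assert wrapped[0] == 99   # shares the mutation
assert defensive[0] == 1  # isolated, at the cost of a copy
```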
(spark) branch master updated: [SPARK-45752][SQL] Simplify the code for check unreferenced CTE relations
beliefer pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 6851cb96ec6 [SPARK-45752][SQL] Simplify the code for check unreferenced CTE relations
6851cb96ec6 is described below

commit 6851cb96ec651b25a8103f7681e8528ff7d625ff
Author: Jiaan Geng
AuthorDate: Fri Nov 10 22:00:51 2023 +0800

    [SPARK-45752][SQL] Simplify the code for check unreferenced CTE relations

    ### What changes were proposed in this pull request?
    https://github.com/apache/spark/pull/43614 let unreferenced `CTE` checked by `CheckAnalysis0`. This PR follows up https://github.com/apache/spark/pull/43614 to simplify the code for check unreferenced CTE relations.

    ### Why are the changes needed?
    Simplify the code for check unreferenced CTE relations

    ### Does this PR introduce _any_ user-facing change?
    'No'.

    ### How was this patch tested?
    Exists test cases.

    ### Was this patch authored or co-authored using generative AI tooling?
    'No'.

    Closes #43727 from beliefer/SPARK-45752_followup.

    Authored-by: Jiaan Geng
    Signed-off-by: Jiaan Geng
---
 .../spark/sql/catalyst/analysis/CheckAnalysis.scala | 12
 .../scala/org/apache/spark/sql/CTEInlineSuite.scala | 18 --
 2 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
index 29d60ae0f41..f9010d47508 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
@@ -167,25 +167,21 @@ trait CheckAnalysis extends PredicateHelper with LookupCatalog with QueryErrorsB
     val inlineCTE = InlineCTE(alwaysInline = true)
     val cteMap = mutable.HashMap.empty[Long, (CTERelationDef, Int, mutable.Map[Long, Int])]
     inlineCTE.buildCTEMap(plan, cteMap)
-    cteMap.values.foreach { case (relation, _, _) =>
+    val visited: mutable.Map[Long, Boolean] = mutable.Map.empty.withDefaultValue(false)
+    cteMap.foreach { case (cteId, (relation, refCount, _)) =>
       // If a CTE relation is never used, it will disappear after inline. Here we explicitly check
       // analysis for it, to make sure the entire query plan is valid.
       try {
         // If a CTE relation ref count is 0, the other CTE relations that reference it
        // should also be checked by checkAnalysis0. This code will also guarantee the leaf
        // relations that do not reference any others are checked first.
-        val visited: mutable.Map[Long, Boolean] = mutable.Map.empty.withDefaultValue(false)
-        cteMap.foreach { case (cteId, _) =>
-          val (_, refCount, _) = cteMap(cteId)
-          if (refCount == 0) {
-            checkUnreferencedCTERelations(cteMap, visited, cteId)
-          }
+        if (refCount == 0) {
+          checkUnreferencedCTERelations(cteMap, visited, cteId)
         }
       } catch {
        case e: AnalysisException =>
          throw new ExtendedAnalysisException(e, relation.child)
       }
-    }
     }

     // Inline all CTEs in the plan to help check query plan structures in subqueries.
     var inlinedPlan: Option[LogicalPlan] = None
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/CTEInlineSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/CTEInlineSuite.scala
index 055c04992c0..a06b50d175f 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/CTEInlineSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/CTEInlineSuite.scala
@@ -683,11 +683,25 @@ abstract class CTEInlineSuiteBase
     val e = intercept[AnalysisException](sql(
       s"""
          |with
-         |a as (select * from non_exist),
+         |a as (select * from tab_non_exists),
          |b as (select * from a)
          |select 2
          |""".stripMargin))
-    checkErrorTableNotFound(e, "`non_exist`", ExpectedContext("non_exist", 26, 34))
+    checkErrorTableNotFound(e, "`tab_non_exists`", ExpectedContext("tab_non_exists", 26, 39))
+
+    withTable("tab_exists") {
+      spark.sql("CREATE TABLE tab_exists(id INT) using parquet")
+      val e = intercept[AnalysisException](sql(
+        s"""
+          |with
+          |a as (select * from tab_exists),
+          |b as (select * from a),
+          |c as (select * from tab_non_exists),
+          |d as (select * from c)
+          |select 2
+          |""".stripMargin))
+      checkErrorTableNotFound(e, "`tab_non_exists`", ExpectedContext("tab_non_exists", 83, 96))
+    }
   }
 }
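The simplified check walks the CTE map once: any relation whose reference count is zero would disappear after inlining, so it is analyzed explicitly, and checking it recursively covers the relations it references. A toy Python model of that bookkeeping (function and argument names are hypothetical, not Spark's API):

```python
def check_unreferenced(cte_defs: dict, referenced_by_query: set) -> list:
    """Toy model of the unreferenced-CTE pass in CheckAnalysis.

    cte_defs maps a CTE name to the set of CTE names its body references.
    A refCount of 0 means nothing uses the CTE, so it must be checked
    explicitly -- together with everything it references, leaves first
    in Spark, breadth here for simplicity.
    """
    ref_count = {name: 0 for name in cte_defs}
    for deps in cte_defs.values():
        for d in deps:
            ref_count[d] += 1
    for name in referenced_by_query:
        ref_count[name] += 1

    checked, visited = [], set()
    stack = [n for n, c in ref_count.items() if c == 0]
    while stack:
        name = stack.pop()
        if name in visited:
            continue
        visited.add(name)
        checked.append(name)
        stack.extend(cte_defs[name])  # also check what this CTE references

    return checked

# `d` is unreferenced; checking it pulls in `c`, whose body may be invalid
# (e.g. it selects from a non-existent table, as in the new test case).
print(check_unreferenced({"c": set(), "d": {"c"}}, referenced_by_query=set()))
```

This mirrors why the new `CTEInlineSuite` case fails analysis: `c` and `d` are never referenced by the final `select 2`, yet `c`'s missing table is still reported.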
(spark) branch branch-3.4 updated: [SPARK-45878][SQL][TESTS] Fix ConcurrentModificationException in CliSuite
yao pushed a commit to branch branch-3.4
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.4 by this push:
     new 61868525785 [SPARK-45878][SQL][TESTS] Fix ConcurrentModificationException in CliSuite
61868525785 is described below

commit 618685257859d5a34b50e22905c73f639bd2cc30
Author: Kent Yao
AuthorDate: Fri Nov 10 21:09:43 2023 +0800

    [SPARK-45878][SQL][TESTS] Fix ConcurrentModificationException in CliSuite

    ### What changes were proposed in this pull request?
    This PR changes the ArrayBuffer for logs to immutable for reading to prevent ConcurrentModificationException which hides the actual cause of failure

    ### Why are the changes needed?
    ```scala
    [info] - SPARK-29022 Commands using SerDe provided in ADD JAR sql *** FAILED *** (11 seconds, 105 milliseconds)
    [info]   java.util.ConcurrentModificationException: mutation occurred during iteration
    [info]   at scala.collection.mutable.MutationTracker$.checkMutations(MutationTracker.scala:43)
    [info]   at scala.collection.mutable.CheckedIndexedSeqView$CheckedIterator.hasNext(CheckedIndexedSeqView.scala:47)
    [info]   at scala.collection.IterableOnceOps.addString(IterableOnce.scala:1247)
    [info]   at scala.collection.IterableOnceOps.addString$(IterableOnce.scala:1241)
    [info]   at scala.collection.AbstractIterable.addString(Iterable.scala:933)
    [info]   at org.apache.spark.sql.hive.thriftserver.CliSuite.runCliWithin(CliSuite.scala:205)
    [info]   at org.apache.spark.sql.hive.thriftserver.CliSuite.$anonfun$new$20(CliSuite.scala:501)
    ```

    ### Does this PR introduce _any_ user-facing change?
    no

    ### How was this patch tested?
    existing tests

    ### Was this patch authored or co-authored using generative AI tooling?
    no

    Closes #43749 from yaooqinn/SPARK-45878.

    Authored-by: Kent Yao
    Signed-off-by: Kent Yao
    (cherry picked from commit b347237735094e9092f4100583ed1d6f3eacf1f6)
    Signed-off-by: Kent Yao
---
 .../test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala b/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala
index 1b91442c228..d3721cf68ab 100644
--- a/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala
+++ b/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala
@@ -198,7 +198,7 @@ class CliSuite extends SparkFunSuite {
       ThreadUtils.awaitResult(foundAllExpectedAnswers.future, timeoutForQuery)
       log.info("Found all expected output.")
     } catch { case cause: Throwable =>
-      val message =
+      val message = lock.synchronized {
         s"""
            |===
            |CliSuite failure output
            |===
@@ -212,6 +212,7 @@ class CliSuite extends SparkFunSuite {
            |End CliSuite failure output
            |===
          """.stripMargin
+      }
       logError(message, cause)
       fail(message, cause)
     } finally {
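The root cause is one thread formatting the shared log buffer while the CLI thread is still appending to it; the fix takes the snapshot inside the same lock the writer holds. A Python analogue — dicts, like Scala 2.13's checked collections, fail fast on mutation during iteration (names here are illustrative, not the suite's actual fields):

```python
import threading

log_lines = {}               # stands in for CliSuite's mutable log buffer
lock = threading.Lock()      # stands in for the suite's `lock`

# Mutating a collection while iterating it raises, much like Scala 2.13's
# checked collections throw ConcurrentModificationException.
log_lines["first"] = 1
try:
    for _key in log_lines:
        log_lines["second"] = 2  # mutation during iteration
except RuntimeError:
    pass                         # "dictionary changed size during iteration"

# The fix: take a consistent snapshot under the writer's lock, then format.
with lock:
    snapshot = list(log_lines.items())
message = "\n".join(f"{k}={v}" for k, v in snapshot)
```

Without the lock-protected snapshot, the formatting failure masks the real test failure — which is exactly the symptom described in the commit message.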
(spark) branch master updated (49ca6aa6cb7 -> b3472377350)
yao pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

    from 49ca6aa6cb7 [MINOR][SQL] Pass `cause` in `CannotReplaceMissingTableException` constructor
     add b3472377350 [SPARK-45878][SQL][TESTS] Fix ConcurrentModificationException in CliSuite

No new revisions were added by this update.

Summary of changes:
 .../test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
(spark) branch branch-3.5 updated: [SPARK-45878][SQL][TESTS] Fix ConcurrentModificationException in CliSuite
This is an automated email from the ASF dual-hosted git repository. yao pushed a commit to branch branch-3.5 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.5 by this push: new 0b68e1700f6 [SPARK-45878][SQL][TESTS] Fix ConcurrentModificationException in CliSuite 0b68e1700f6 is described below commit 0b68e1700f60ad1a32f066c10a0f76bea893b7ce Author: Kent Yao AuthorDate: Fri Nov 10 21:09:43 2023 +0800 [SPARK-45878][SQL][TESTS] Fix ConcurrentModificationException in CliSuite ### What changes were proposed in this pull request? This PR changes the ArrayBuffer for logs to immutable for reading to prevent ConcurrentModificationException which hides the actual cause of failure ### Why are the changes needed? ```scala [info] - SPARK-29022 Commands using SerDe provided in ADD JAR sql *** FAILED *** (11 seconds, 105 milliseconds) [info] java.util.ConcurrentModificationException: mutation occurred during iteration [info] at scala.collection.mutable.MutationTracker$.checkMutations(MutationTracker.scala:43) [info] at scala.collection.mutable.CheckedIndexedSeqView$CheckedIterator.hasNext(CheckedIndexedSeqView.scala:47) [info] at scala.collection.IterableOnceOps.addString(IterableOnce.scala:1247) [info] at scala.collection.IterableOnceOps.addString$(IterableOnce.scala:1241) [info] at scala.collection.AbstractIterable.addString(Iterable.scala:933) [info] at org.apache.spark.sql.hive.thriftserver.CliSuite.runCliWithin(CliSuite.scala:205) [info] at org.apache.spark.sql.hive.thriftserver.CliSuite.$anonfun$new$20(CliSuite.scala:501) ``` ### Does this PR introduce _any_ user-facing change? no ### How was this patch tested? existing tests ### Was this patch authored or co-authored using generative AI tooling? no Closes #43749 from yaooqinn/SPARK-45878. 
Authored-by: Kent Yao
Signed-off-by: Kent Yao
(cherry picked from commit b347237735094e9092f4100583ed1d6f3eacf1f6)
Signed-off-by: Kent Yao
---
 .../test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala b/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala
index 343b32e6227..38dcd1d8b00 100644
--- a/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala
+++ b/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala
@@ -193,7 +193,7 @@ class CliSuite extends SparkFunSuite {
       ThreadUtils.awaitResult(foundAllExpectedAnswers.future, timeoutForQuery)
       log.info("Found all expected output.")
     } catch { case cause: Throwable =>
-      val message =
+      val message = lock.synchronized {
         s"""
            |===
            |CliSuite failure output
@@ -207,6 +207,7 @@ class CliSuite extends SparkFunSuite {
            |End CliSuite failure output
            |===
          """.stripMargin
+      }
       logError(message, cause)
       fail(message, cause)
     } finally {
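The pattern behind the fix is general: when one thread appends to a shared mutable buffer while another iterates it to build a report, snapshot the buffer under its lock before iterating. A minimal Python sketch of the same idea (class and method names are hypothetical, not Spark APIs):

```python
import threading


class LogCollector:
    """Collects log lines from writer threads; readers join over a frozen snapshot."""

    def __init__(self):
        self._lock = threading.Lock()
        self._lines = []

    def append(self, line):
        with self._lock:
            self._lines.append(line)

    def snapshot_report(self):
        # Copy under the lock so a concurrent append cannot mutate the
        # sequence while we iterate over it to build the report string.
        with self._lock:
            frozen = list(self._lines)
        return "\n".join(frozen)


collector = LogCollector()
for i in range(3):
    collector.append(f"line {i}")
print(collector.snapshot_report())
```

Joining `self._lines` directly, outside the lock, is exactly the kind of read-during-mutation that Scala's checked collections report as "mutation occurred during iteration".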
(spark) branch master updated: [MINOR][SQL] Pass `cause` in `CannotReplaceMissingTableException` constructor
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 49ca6aa6cb7 [MINOR][SQL] Pass `cause` in `CannotReplaceMissingTableException` constructor

49ca6aa6cb7 is described below

commit 49ca6aa6cb75b931d1c38dcffb4cd3dd63b0a2f3
Author: Max Gekk
AuthorDate: Fri Nov 10 12:17:09 2023 +0300

    [MINOR][SQL] Pass `cause` in `CannotReplaceMissingTableException` constructor

    ### What changes were proposed in this pull request?
    In the PR, I propose to use the `cause` argument in the `CannotReplaceMissingTableException` constructor.

    ### Why are the changes needed?
    To improve the user experience with Spark SQL while troubleshooting issues. Currently, users don't see where the exception comes from.

    ### Does this PR introduce _any_ user-facing change?
    No.

    ### How was this patch tested?
    Manually.

    ### Was this patch authored or co-authored using generative AI tooling?
    No.

    Closes #43738 from MaxGekk/fix-missed-cause.
Authored-by: Max Gekk
Signed-off-by: Max Gekk
---
 .../sql/catalyst/analysis/CannotReplaceMissingTableException.scala | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CannotReplaceMissingTableException.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CannotReplaceMissingTableException.scala
index 910bb9d3749..032cdca12c0 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CannotReplaceMissingTableException.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CannotReplaceMissingTableException.scala
@@ -28,4 +28,5 @@ class CannotReplaceMissingTableException(
   extends AnalysisException(
     errorClass = "TABLE_OR_VIEW_NOT_FOUND",
-    messageParameters = Map("relationName" -> quoteNameParts(tableIdentifier.namespace :+ tableIdentifier.name)))
+    messageParameters = Map("relationName" -> quoteNameParts(tableIdentifier.namespace :+ tableIdentifier.name)),
+    cause = cause)
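The change is standard exception chaining: the wrapping exception forwards its `cause` so the root failure stays visible to whoever catches the wrapper. A rough Python analogue of the same idea (the class and relation names here are hypothetical, not Spark's Python API):

```python
class CannotReplaceMissingTableError(Exception):
    """Hypothetical analogue of an analysis error that keeps its underlying cause."""

    def __init__(self, relation_name, cause=None):
        super().__init__(f"Table or view not found: {relation_name}")
        # Chain the original failure so troubleshooting shows where it came from.
        self.__cause__ = cause


def replace_table(relation_name):
    try:
        # Simulate a low-level failure during the replace operation.
        raise FileNotFoundError("table metadata is missing")
    except FileNotFoundError as e:
        raise CannotReplaceMissingTableError(relation_name, cause=e)


caught = None
try:
    replace_table("db.tbl")
except CannotReplaceMissingTableError as err:
    caught = err

print(type(caught.__cause__).__name__)
```

Without the chained cause, the caller sees only "table or view not found" and has no pointer to the metadata error that actually triggered it.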
(spark) branch master updated (f5b1b8306cf -> bd526986a5f)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

 from f5b1b8306cf [SPARK-45562][SQL] XML: Add SQL error class for missing rowTag option
  add bd526986a5f [SPARK-45837][CONNECT] Improve logging information in handling retries

No new revisions were added by this update.

Summary of changes:
 .../sql/connect/client/ExecutePlanResponseReattachableIterator.scala | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
(spark) branch master updated: [SPARK-45562][SQL] XML: Add SQL error class for missing rowTag option
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new f5b1b8306cf [SPARK-45562][SQL] XML: Add SQL error class for missing rowTag option

f5b1b8306cf is described below

commit f5b1b8306cf13218f5ff79944aaa9c0b4e74fda4
Author: Sandip Agarwala <131817656+sandip...@users.noreply.github.com>
AuthorDate: Fri Nov 10 17:44:39 2023 +0900

    [SPARK-45562][SQL] XML: Add SQL error class for missing rowTag option

    ### What changes were proposed in this pull request?
    The rowTag option is required for reading XML files. This PR adds a SQL error class for a missing rowTag option.

    ### Why are the changes needed?
    The rowTag option is required for reading XML files. This PR adds a SQL error class for a missing rowTag option.

    ### Does this PR introduce _any_ user-facing change?
    No.

    ### How was this patch tested?
    Updated the unit test to check for the error message.

    ### Was this patch authored or co-authored using generative AI tooling?
    No.

    Closes #43710 from sandip-db/xml-rowTagRequiredError.
Authored-by: Sandip Agarwala <131817656+sandip...@users.noreply.github.com>
Signed-off-by: Hyukjin Kwon
---
 common/utils/src/main/resources/error/error-classes.json      |  6 ++++++
 docs/sql-error-conditions.md                                  |  6 ++++++
 .../scala/org/apache/spark/sql/catalyst/xml/XmlOptions.scala  |  8 ++++++--
 .../org/apache/spark/sql/errors/QueryCompilationErrors.scala  |  7 +++++++
 .../apache/spark/sql/execution/datasources/xml/XmlSuite.scala | 11 ++++++++---
 5 files changed, 33 insertions(+), 5 deletions(-)

diff --git a/common/utils/src/main/resources/error/error-classes.json b/common/utils/src/main/resources/error/error-classes.json
index 26f6c0240af..3b7a3a6006e 100644
--- a/common/utils/src/main/resources/error/error-classes.json
+++ b/common/utils/src/main/resources/error/error-classes.json
@@ -3911,6 +3911,12 @@
     },
     "sqlState" : "42605"
   },
+  "XML_ROW_TAG_MISSING" : {
+    "message" : [
+      "<rowTag> option is required for reading files in XML format."
+    ],
+    "sqlState" : "42000"
+  },
   "_LEGACY_ERROR_TEMP_0001" : {
     "message" : [
       "Invalid InsertIntoContext."
diff --git a/docs/sql-error-conditions.md b/docs/sql-error-conditions.md
index 2cb433b19fa..a811019e0a5 100644
--- a/docs/sql-error-conditions.md
+++ b/docs/sql-error-conditions.md
@@ -2369,3 +2369,9 @@
 The operation `` requires a ``. But `` is a ``.
 The `` requires `` parameters but the actual number is ``.
 For more details see [WRONG_NUM_ARGS](sql-error-conditions-wrong-num-args-error-class.html)
+
+### XML_ROW_TAG_MISSING
+
+[SQLSTATE: 42000](sql-error-conditions-sqlstates.html#class-42-syntax-error-or-access-rule-violation)
+
+`<rowTag>` option is required for reading files in XML format.
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/xml/XmlOptions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/xml/XmlOptions.scala
index aac6eec21c6..8f6cdbf360e 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/xml/XmlOptions.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/xml/XmlOptions.scala
@@ -24,7 +24,7 @@ import javax.xml.stream.XMLInputFactory
 import org.apache.spark.internal.Logging
 import org.apache.spark.sql.catalyst.{DataSourceOptions, FileSourceOptions}
 import org.apache.spark.sql.catalyst.util.{CaseInsensitiveMap, CompressionCodecs, DateFormatter, DateTimeUtils, ParseMode, PermissiveMode}
-import org.apache.spark.sql.errors.QueryExecutionErrors
+import org.apache.spark.sql.errors.{QueryCompilationErrors, QueryExecutionErrors}
 import org.apache.spark.sql.internal.{LegacyBehaviorPolicy, SQLConf}

 /**
@@ -66,7 +66,11 @@ private[sql] class XmlOptions(
   val compressionCodec = parameters.get(COMPRESSION).map(CompressionCodecs.getCodecClassName)
   val rowTagOpt = parameters.get(XmlOptions.ROW_TAG).map(_.trim)
-  require(!rowTagRequired || rowTagOpt.isDefined, s"'${XmlOptions.ROW_TAG}' option is required.")
+
+  if (rowTagRequired && rowTagOpt.isEmpty) {
+    throw QueryCompilationErrors.xmlRowTagRequiredError(XmlOptions.ROW_TAG)
+  }
+
   val rowTag = rowTagOpt.getOrElse(XmlOptions.DEFAULT_ROW_TAG)
   require(rowTag.nonEmpty, s"'$ROW_TAG' option should not be an empty string.")
   require(!rowTag.startsWith("<") && !rowTag.endsWith(">"),
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
index 0c5dcb1ead0..e772b3497ac 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
@@ -3817,4 +3817,11
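The Scala change replaces a bare `require(...)` with a structured error carrying an error class and SQLSTATE. A small Python sketch of that validation pattern, under the assumption of hypothetical class and function names (only the error class `XML_ROW_TAG_MISSING` and SQLSTATE `42000` come from the commit):

```python
class MissingRequiredOptionError(ValueError):
    """Structured error with an error class and SQLSTATE, in the spirit of Spark's error framework."""

    def __init__(self, option):
        super().__init__(f"<{option}> option is required for reading files in XML format.")
        self.error_class = "XML_ROW_TAG_MISSING"
        self.sql_state = "42000"


def parse_xml_options(options, row_tag_required=True):
    # Mirror the check: a required rowTag that is absent or blank is an error.
    row_tag = (options.get("rowTag") or "").strip()
    if row_tag_required and not row_tag:
        raise MissingRequiredOptionError("rowTag")
    return row_tag


caught = None
try:
    parse_xml_options({"path": "/data/books.xml"})
except MissingRequiredOptionError as err:
    caught = err
print(caught.error_class, caught.sql_state)
```

The payoff over a plain assertion is that callers (and docs generators) can key off `error_class` and `sql_state` instead of parsing a free-form message.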
(spark) branch master updated: [SPARK-45852][CONNECT][PYTHON] Gracefully deal with recursion error during logging
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 9bac48d4bd6 [SPARK-45852][CONNECT][PYTHON] Gracefully deal with recursion error during logging

9bac48d4bd6 is described below

commit 9bac48d4bd68d4f0d54c53c29a27b1f6e02c5f61
Author: Martin Grund
AuthorDate: Fri Nov 10 17:12:25 2023 +0900

    [SPARK-45852][CONNECT][PYTHON] Gracefully deal with recursion error during logging

    ### What changes were proposed in this pull request?
    The Python client for Spark Connect logs the text representation of the proto message. However, for deeply nested objects this can lead to a Python recursion error even before the maximum nested recursion limit of the GRPC message is reached. This patch fixes the issue by explicitly catching the recursion error during text conversion.

    ### Why are the changes needed?
    Stability.

    ### Does this PR introduce _any_ user-facing change?
    No.

    ### How was this patch tested?
    UT.

    ### Was this patch authored or co-authored using generative AI tooling?
    No.

    Closes #43732 from grundprinzip/SPARK-45852.

Authored-by: Martin Grund
Signed-off-by: Hyukjin Kwon
---
 python/pyspark/sql/connect/client/core.py              |  5 ++++-
 python/pyspark/sql/tests/connect/test_connect_basic.py | 13 +++++++++++++
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/python/pyspark/sql/connect/client/core.py b/python/pyspark/sql/connect/client/core.py
index 965c4107cac..7eafcc501f5 100644
--- a/python/pyspark/sql/connect/client/core.py
+++ b/python/pyspark/sql/connect/client/core.py
@@ -935,7 +935,10 @@ class SparkConnectClient(object):
         -------
         Single line string of the serialized proto message.
""" -return text_format.MessageToString(p, as_one_line=True) +try: +return text_format.MessageToString(p, as_one_line=True) +except RecursionError: +return "" def schema(self, plan: pb2.Plan) -> StructType: """ diff --git a/python/pyspark/sql/tests/connect/test_connect_basic.py b/python/pyspark/sql/tests/connect/test_connect_basic.py index daf6772e52b..7a224d68219 100755 --- a/python/pyspark/sql/tests/connect/test_connect_basic.py +++ b/python/pyspark/sql/tests/connect/test_connect_basic.py @@ -159,6 +159,19 @@ class SparkConnectSQLTestCase(ReusedConnectTestCase, SQLTestUtils, PandasOnSpark class SparkConnectBasicTests(SparkConnectSQLTestCase): +def test_recursion_handling_for_plan_logging(self): +"""SPARK-45852 - Test that we can handle recursion in plan logging.""" +cdf = self.connect.range(1) +for x in range(400): +cdf = cdf.withColumn(f"col_{x}", CF.lit(x)) + +# Calling schema will trigger logging the message that will in turn trigger the message +# conversion into protobuf that will then trigger the recursion error. +self.assertIsNotNone(cdf.schema) + +result = self.connect._client._proto_to_string(cdf._plan.to_proto(self.connect._client)) +self.assertIn("recursion", result) + def test_df_getattr_behavior(self): cdf = self.connect.range(10) sdf = self.spark.range(10) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org