(spark) branch branch-3.3 updated: [MINOR][DOCS] Fix the example value in the docs

2023-11-10 Thread yao
This is an automated email from the ASF dual-hosted git repository.

yao pushed a commit to branch branch-3.3
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.3 by this push:
 new 662edf53d39 [MINOR][DOCS] Fix the example value in the docs
662edf53d39 is described below

commit 662edf53d394541e9bfd6153576ceed0fed50cfa
Author: longfei.jiang 
AuthorDate: Sat Nov 11 13:49:18 2023 +0800

[MINOR][DOCS] Fix the example value in the docs

### What changes were proposed in this pull request?

Fix the example value for the `k` (clock-hour-of-day) pattern letter in the datetime pattern docs.

### Why are the changes needed?

Documentation correctness: the previous example value was outside the pattern's valid range.

### Does this PR introduce _any_ user-facing change?

Yes

### How was this patch tested?

Just example value in the docs, no need to test.

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #43750 from jlfsdtc/fix_typo_in_doc.

Authored-by: longfei.jiang 
Signed-off-by: Kent Yao 
(cherry picked from commit b501a223bfcf4ddbcb0b2447aa06c549051630b0)
Signed-off-by: Kent Yao 
---
 docs/sql-ref-datetime-pattern.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/sql-ref-datetime-pattern.md b/docs/sql-ref-datetime-pattern.md
index 4b02cdad361..6a4a1b67348 100644
--- a/docs/sql-ref-datetime-pattern.md
+++ b/docs/sql-ref-datetime-pattern.md
@@ -41,7 +41,7 @@ Spark uses pattern letters in the following table for date and timestamp parsing
 |**a**|am-pm-of-day|am-pm|PM|
 |**h**|clock-hour-of-am-pm (1-12)|number(2)|12|
 |**K**|hour-of-am-pm (0-11)|number(2)|0|
-|**k**|clock-hour-of-day (1-24)|number(2)|0|
+|**k**|clock-hour-of-day (1-24)|number(2)|1|
 |**H**|hour-of-day (0-23)|number(2)|0|
 |**m**|minute-of-hour|number(2)|30|
 |**s**|second-of-minute|number(2)|55|
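For context, Spark 3.x's pattern letters follow `java.time.format.DateTimeFormatter` semantics, under which `k` (clock-hour-of-day) runs 1-24 and can never render `0`. A minimal Scala sketch (an illustration, not part of the commit) shows why the old example value was impossible:

```scala
import java.time.LocalTime
import java.time.format.DateTimeFormatter

val midnight = LocalTime.of(0, 0)
// 'H' (hour-of-day, 0-23) and 'k' (clock-hour-of-day, 1-24) diverge at midnight:
midnight.format(DateTimeFormatter.ofPattern("H"))  // "0"
midnight.format(DateTimeFormatter.ofPattern("k"))  // "24"
// 'k' never produces 0, so 1 is a valid example value while 0 was not.
LocalTime.of(1, 30).format(DateTimeFormatter.ofPattern("k"))  // "1"
```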


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



(spark) branch branch-3.4 updated: [MINOR][DOCS] Fix the example value in the docs

2023-11-10 Thread yao
This is an automated email from the ASF dual-hosted git repository.

yao pushed a commit to branch branch-3.4
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.4 by this push:
 new 92bea64b507 [MINOR][DOCS] Fix the example value in the docs
92bea64b507 is described below

commit 92bea64b507f2801759d52ade4cdbf6c930124c5
Author: longfei.jiang 
AuthorDate: Sat Nov 11 13:49:18 2023 +0800

[MINOR][DOCS] Fix the example value in the docs

### What changes were proposed in this pull request?

Fix the example value for the `k` (clock-hour-of-day) pattern letter in the datetime pattern docs.

### Why are the changes needed?

Documentation correctness: the previous example value was outside the pattern's valid range.

### Does this PR introduce _any_ user-facing change?

Yes

### How was this patch tested?

Just example value in the docs, no need to test.

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #43750 from jlfsdtc/fix_typo_in_doc.

Authored-by: longfei.jiang 
Signed-off-by: Kent Yao 
(cherry picked from commit b501a223bfcf4ddbcb0b2447aa06c549051630b0)
Signed-off-by: Kent Yao 
---
 docs/sql-ref-datetime-pattern.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/sql-ref-datetime-pattern.md b/docs/sql-ref-datetime-pattern.md
index 5e28a18acef..e5d5388f262 100644
--- a/docs/sql-ref-datetime-pattern.md
+++ b/docs/sql-ref-datetime-pattern.md
@@ -41,7 +41,7 @@ Spark uses pattern letters in the following table for date and timestamp parsing
 |**a**|am-pm-of-day|am-pm|PM|
 |**h**|clock-hour-of-am-pm (1-12)|number(2)|12|
 |**K**|hour-of-am-pm (0-11)|number(2)|0|
-|**k**|clock-hour-of-day (1-24)|number(2)|0|
+|**k**|clock-hour-of-day (1-24)|number(2)|1|
 |**H**|hour-of-day (0-23)|number(2)|0|
 |**m**|minute-of-hour|number(2)|30|
 |**s**|second-of-minute|number(2)|55|





(spark) branch branch-3.5 updated: [MINOR][DOCS] Fix the example value in the docs

2023-11-10 Thread yao
This is an automated email from the ASF dual-hosted git repository.

yao pushed a commit to branch branch-3.5
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.5 by this push:
 new 19d225bf3f5 [MINOR][DOCS] Fix the example value in the docs
19d225bf3f5 is described below

commit 19d225bf3f56d392ebb4e7727bd30109b1b75bf5
Author: longfei.jiang 
AuthorDate: Sat Nov 11 13:49:18 2023 +0800

[MINOR][DOCS] Fix the example value in the docs

### What changes were proposed in this pull request?

Fix the example value for the `k` (clock-hour-of-day) pattern letter in the datetime pattern docs.

### Why are the changes needed?

Documentation correctness: the previous example value was outside the pattern's valid range.

### Does this PR introduce _any_ user-facing change?

Yes

### How was this patch tested?

Just example value in the docs, no need to test.

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #43750 from jlfsdtc/fix_typo_in_doc.

Authored-by: longfei.jiang 
Signed-off-by: Kent Yao 
(cherry picked from commit b501a223bfcf4ddbcb0b2447aa06c549051630b0)
Signed-off-by: Kent Yao 
---
 docs/sql-ref-datetime-pattern.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/sql-ref-datetime-pattern.md b/docs/sql-ref-datetime-pattern.md
index 5e28a18acef..e5d5388f262 100644
--- a/docs/sql-ref-datetime-pattern.md
+++ b/docs/sql-ref-datetime-pattern.md
@@ -41,7 +41,7 @@ Spark uses pattern letters in the following table for date and timestamp parsing
 |**a**|am-pm-of-day|am-pm|PM|
 |**h**|clock-hour-of-am-pm (1-12)|number(2)|12|
 |**K**|hour-of-am-pm (0-11)|number(2)|0|
-|**k**|clock-hour-of-day (1-24)|number(2)|0|
+|**k**|clock-hour-of-day (1-24)|number(2)|1|
 |**H**|hour-of-day (0-23)|number(2)|0|
 |**m**|minute-of-hour|number(2)|30|
 |**s**|second-of-minute|number(2)|55|





(spark) branch master updated (0a791993be7 -> b501a223bfc)

2023-11-10 Thread yao
This is an automated email from the ASF dual-hosted git repository.

yao pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


 from 0a791993be7 [SPARK-45686][INFRA][CORE][SQL][SS][CONNECT][MLLIB][DSTREAM][AVRO][ML][K8S][YARN][PYTHON][R][UI][GRAPHX][PROTOBUF][TESTS][EXAMPLES] Explicitly convert `Array` to `Seq` when function input is defined as `Seq` to avoid compilation warnings related to `class LowPriorityImplicits2 is deprecated`
 add b501a223bfc [MINOR][DOCS] Fix the example value in the docs

No new revisions were added by this update.

Summary of changes:
 docs/sql-ref-datetime-pattern.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)





(spark) branch branch-3.4 updated: [SPARK-45884][BUILD][3.4] Upgrade ORC to 1.8.6

2023-11-10 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.4
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.4 by this push:
 new 3978bf4528c [SPARK-45884][BUILD][3.4] Upgrade ORC to 1.8.6
3978bf4528c is described below

commit 3978bf4528c6ae58944d9fb3f8776ab570eeb7c8
Author: Dongjoon Hyun 
AuthorDate: Fri Nov 10 08:56:02 2023 -0800

[SPARK-45884][BUILD][3.4] Upgrade ORC to 1.8.6

### What changes were proposed in this pull request?

This PR aims to upgrade ORC to 1.8.6 for Apache Spark 3.4.2.

### Why are the changes needed?

To bring the latest maintenance releases as a part of Apache Spark 3.4.2 release
- https://github.com/apache/orc/releases/tag/v1.8.6

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the CIs.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #43755 from dongjoon-hyun/SPARK-45884.

Authored-by: Dongjoon Hyun 
Signed-off-by: Dongjoon Hyun 
---
 dev/deps/spark-deps-hadoop-2-hive-2.3 | 6 +++---
 dev/deps/spark-deps-hadoop-3-hive-2.3 | 6 +++---
 pom.xml   | 2 +-
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/dev/deps/spark-deps-hadoop-2-hive-2.3 b/dev/deps/spark-deps-hadoop-2-hive-2.3
index c562b0b7e16..691c83632b3 100644
--- a/dev/deps/spark-deps-hadoop-2-hive-2.3
+++ b/dev/deps/spark-deps-hadoop-2-hive-2.3
@@ -222,9 +222,9 @@ objenesis/3.2//objenesis-3.2.jar
 okhttp/3.12.12//okhttp-3.12.12.jar
 okio/1.15.0//okio-1.15.0.jar
 opencsv/2.3//opencsv-2.3.jar
-orc-core/1.8.5/shaded-protobuf/orc-core-1.8.5-shaded-protobuf.jar
-orc-mapreduce/1.8.5/shaded-protobuf/orc-mapreduce-1.8.5-shaded-protobuf.jar
-orc-shims/1.8.5//orc-shims-1.8.5.jar
+orc-core/1.8.6/shaded-protobuf/orc-core-1.8.6-shaded-protobuf.jar
+orc-mapreduce/1.8.6/shaded-protobuf/orc-mapreduce-1.8.6-shaded-protobuf.jar
+orc-shims/1.8.6//orc-shims-1.8.6.jar
 oro/2.0.8//oro-2.0.8.jar
 osgi-resource-locator/1.0.3//osgi-resource-locator-1.0.3.jar
 paranamer/2.8//paranamer-2.8.jar
diff --git a/dev/deps/spark-deps-hadoop-3-hive-2.3 b/dev/deps/spark-deps-hadoop-3-hive-2.3
index bcfc8c92b10..4d94cb5c699 100644
--- a/dev/deps/spark-deps-hadoop-3-hive-2.3
+++ b/dev/deps/spark-deps-hadoop-3-hive-2.3
@@ -209,9 +209,9 @@ opencsv/2.3//opencsv-2.3.jar
 opentracing-api/0.33.0//opentracing-api-0.33.0.jar
 opentracing-noop/0.33.0//opentracing-noop-0.33.0.jar
 opentracing-util/0.33.0//opentracing-util-0.33.0.jar
-orc-core/1.8.5/shaded-protobuf/orc-core-1.8.5-shaded-protobuf.jar
-orc-mapreduce/1.8.5/shaded-protobuf/orc-mapreduce-1.8.5-shaded-protobuf.jar
-orc-shims/1.8.5//orc-shims-1.8.5.jar
+orc-core/1.8.6/shaded-protobuf/orc-core-1.8.6-shaded-protobuf.jar
+orc-mapreduce/1.8.6/shaded-protobuf/orc-mapreduce-1.8.6-shaded-protobuf.jar
+orc-shims/1.8.6//orc-shims-1.8.6.jar
 oro/2.0.8//oro-2.0.8.jar
 osgi-resource-locator/1.0.3//osgi-resource-locator-1.0.3.jar
 paranamer/2.8//paranamer-2.8.jar
diff --git a/pom.xml b/pom.xml
index 6fa84df3d25..706a11d43b3 100644
--- a/pom.xml
+++ b/pom.xml
@@ -141,7 +141,7 @@
 
 10.14.2.0
 1.12.3
-1.8.5
+1.8.6
 shaded-protobuf
 9.4.50.v20221201
 4.0.3





(spark) branch branch-3.3 updated: [SPARK-45885][BUILD][3.3] Upgrade ORC to 1.7.10

2023-11-10 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.3
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.3 by this push:
 new 6780c7857dc [SPARK-45885][BUILD][3.3] Upgrade ORC to 1.7.10
6780c7857dc is described below

commit 6780c7857dc9a0333fc12e8a25acff84bfacf2af
Author: Dongjoon Hyun 
AuthorDate: Fri Nov 10 08:54:38 2023 -0800

[SPARK-45885][BUILD][3.3] Upgrade ORC to 1.7.10

### What changes were proposed in this pull request?

This PR aims to upgrade ORC to 1.7.10 for Apache Spark 3.3.4

### Why are the changes needed?

To bring the latest bug fixes.
- https://github.com/apache/orc/releases/tag/v1.7.9
- https://github.com/apache/orc/releases/tag/v1.7.10

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the CIs.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #43756 from dongjoon-hyun/SPARK-45885.

Authored-by: Dongjoon Hyun 
Signed-off-by: Dongjoon Hyun 
---
 dev/deps/spark-deps-hadoop-2-hive-2.3 | 6 +++---
 dev/deps/spark-deps-hadoop-3-hive-2.3 | 6 +++---
 pom.xml   | 2 +-
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/dev/deps/spark-deps-hadoop-2-hive-2.3 b/dev/deps/spark-deps-hadoop-2-hive-2.3
index 16ac8dc27c5..e5523d3350b 100644
--- a/dev/deps/spark-deps-hadoop-2-hive-2.3
+++ b/dev/deps/spark-deps-hadoop-2-hive-2.3
@@ -219,9 +219,9 @@ objenesis/3.2//objenesis-3.2.jar
 okhttp/3.12.12//okhttp-3.12.12.jar
 okio/1.14.0//okio-1.14.0.jar
 opencsv/2.3//opencsv-2.3.jar
-orc-core/1.7.8//orc-core-1.7.8.jar
-orc-mapreduce/1.7.8//orc-mapreduce-1.7.8.jar
-orc-shims/1.7.8//orc-shims-1.7.8.jar
+orc-core/1.7.10//orc-core-1.7.10.jar
+orc-mapreduce/1.7.10//orc-mapreduce-1.7.10.jar
+orc-shims/1.7.10//orc-shims-1.7.10.jar
 oro/2.0.8//oro-2.0.8.jar
 osgi-resource-locator/1.0.3//osgi-resource-locator-1.0.3.jar
 paranamer/2.8//paranamer-2.8.jar
diff --git a/dev/deps/spark-deps-hadoop-3-hive-2.3 b/dev/deps/spark-deps-hadoop-3-hive-2.3
index 7c2dfb3faa0..86145c3051d 100644
--- a/dev/deps/spark-deps-hadoop-3-hive-2.3
+++ b/dev/deps/spark-deps-hadoop-3-hive-2.3
@@ -208,9 +208,9 @@ opencsv/2.3//opencsv-2.3.jar
 opentracing-api/0.33.0//opentracing-api-0.33.0.jar
 opentracing-noop/0.33.0//opentracing-noop-0.33.0.jar
 opentracing-util/0.33.0//opentracing-util-0.33.0.jar
-orc-core/1.7.8//orc-core-1.7.8.jar
-orc-mapreduce/1.7.8//orc-mapreduce-1.7.8.jar
-orc-shims/1.7.8//orc-shims-1.7.8.jar
+orc-core/1.7.10//orc-core-1.7.10.jar
+orc-mapreduce/1.7.10//orc-mapreduce-1.7.10.jar
+orc-shims/1.7.10//orc-shims-1.7.10.jar
 oro/2.0.8//oro-2.0.8.jar
 osgi-resource-locator/1.0.3//osgi-resource-locator-1.0.3.jar
 paranamer/2.8//paranamer-2.8.jar
diff --git a/pom.xml b/pom.xml
index 4aab8b5544e..b70ae68b1e4 100644
--- a/pom.xml
+++ b/pom.xml
@@ -133,7 +133,7 @@
 
 10.14.2.0
 1.12.2
-1.7.8
+1.7.10
 9.4.48.v20220622
 4.0.3
 0.10.0





(spark) branch branch-3.5 updated: [SPARK-45883][BUILD] Upgrade ORC to 1.9.2

2023-11-10 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.5
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.5 by this push:
 new 68b531dd2b4 [SPARK-45883][BUILD] Upgrade ORC to 1.9.2
68b531dd2b4 is described below

commit 68b531dd2b485fa2203d6a2bd2de90afc97a13bb
Author: Dongjoon Hyun 
AuthorDate: Fri Nov 10 07:50:17 2023 -0800

[SPARK-45883][BUILD] Upgrade ORC to 1.9.2

### What changes were proposed in this pull request?

This PR aims to upgrade ORC to 1.9.2 for Apache Spark 4.0.0 and 3.5.1.

### Why are the changes needed?

To bring the latest bug fixes.
- https://github.com/apache/orc/releases/tag/v1.9.2

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the CIs.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #43754 from dongjoon-hyun/SPARK-45883.

Authored-by: Dongjoon Hyun 
Signed-off-by: Dongjoon Hyun 
(cherry picked from commit 917947e62e1e67f49a83c1ffb0833b61f0c48eb6)
Signed-off-by: Dongjoon Hyun 
---
 dev/deps/spark-deps-hadoop-3-hive-2.3 | 6 +++---
 pom.xml   | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/dev/deps/spark-deps-hadoop-3-hive-2.3 b/dev/deps/spark-deps-hadoop-3-hive-2.3
index 1d02f8dba56..9ab51dfa011 100644
--- a/dev/deps/spark-deps-hadoop-3-hive-2.3
+++ b/dev/deps/spark-deps-hadoop-3-hive-2.3
@@ -212,9 +212,9 @@ opencsv/2.3//opencsv-2.3.jar
 opentracing-api/0.33.0//opentracing-api-0.33.0.jar
 opentracing-noop/0.33.0//opentracing-noop-0.33.0.jar
 opentracing-util/0.33.0//opentracing-util-0.33.0.jar
-orc-core/1.9.1/shaded-protobuf/orc-core-1.9.1-shaded-protobuf.jar
-orc-mapreduce/1.9.1/shaded-protobuf/orc-mapreduce-1.9.1-shaded-protobuf.jar
-orc-shims/1.9.1//orc-shims-1.9.1.jar
+orc-core/1.9.2/shaded-protobuf/orc-core-1.9.2-shaded-protobuf.jar
+orc-mapreduce/1.9.2/shaded-protobuf/orc-mapreduce-1.9.2-shaded-protobuf.jar
+orc-shims/1.9.2//orc-shims-1.9.2.jar
 oro/2.0.8//oro-2.0.8.jar
 osgi-resource-locator/1.0.3//osgi-resource-locator-1.0.3.jar
 paranamer/2.8//paranamer-2.8.jar
diff --git a/pom.xml b/pom.xml
index be8400c33bf..14e0ab3e0f6 100644
--- a/pom.xml
+++ b/pom.xml
@@ -141,7 +141,7 @@
 
 10.14.2.0
 1.13.1
-1.9.1
+1.9.2
 shaded-protobuf
 9.4.52.v20230823
 4.0.3





(spark) branch master updated: [SPARK-45883][BUILD] Upgrade ORC to 1.9.2

2023-11-10 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 917947e62e1 [SPARK-45883][BUILD] Upgrade ORC to 1.9.2
917947e62e1 is described below

commit 917947e62e1e67f49a83c1ffb0833b61f0c48eb6
Author: Dongjoon Hyun 
AuthorDate: Fri Nov 10 07:50:17 2023 -0800

[SPARK-45883][BUILD] Upgrade ORC to 1.9.2

### What changes were proposed in this pull request?

This PR aims to upgrade ORC to 1.9.2 for Apache Spark 4.0.0 and 3.5.1.

### Why are the changes needed?

To bring the latest bug fixes.
- https://github.com/apache/orc/releases/tag/v1.9.2

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the CIs.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #43754 from dongjoon-hyun/SPARK-45883.

Authored-by: Dongjoon Hyun 
Signed-off-by: Dongjoon Hyun 
---
 dev/deps/spark-deps-hadoop-3-hive-2.3 | 6 +++---
 pom.xml   | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/dev/deps/spark-deps-hadoop-3-hive-2.3 b/dev/deps/spark-deps-hadoop-3-hive-2.3
index b7d6bdbfd12..0a952aa6ee8 100644
--- a/dev/deps/spark-deps-hadoop-3-hive-2.3
+++ b/dev/deps/spark-deps-hadoop-3-hive-2.3
@@ -218,9 +218,9 @@ opencsv/2.3//opencsv-2.3.jar
 opentracing-api/0.33.0//opentracing-api-0.33.0.jar
 opentracing-noop/0.33.0//opentracing-noop-0.33.0.jar
 opentracing-util/0.33.0//opentracing-util-0.33.0.jar
-orc-core/1.9.1/shaded-protobuf/orc-core-1.9.1-shaded-protobuf.jar
-orc-mapreduce/1.9.1/shaded-protobuf/orc-mapreduce-1.9.1-shaded-protobuf.jar
-orc-shims/1.9.1//orc-shims-1.9.1.jar
+orc-core/1.9.2/shaded-protobuf/orc-core-1.9.2-shaded-protobuf.jar
+orc-mapreduce/1.9.2/shaded-protobuf/orc-mapreduce-1.9.2-shaded-protobuf.jar
+orc-shims/1.9.2//orc-shims-1.9.2.jar
 oro/2.0.8//oro-2.0.8.jar
 osgi-resource-locator/1.0.3//osgi-resource-locator-1.0.3.jar
 paranamer/2.8//paranamer-2.8.jar
diff --git a/pom.xml b/pom.xml
index 71d8ffcc1c9..14754c0bcaa 100644
--- a/pom.xml
+++ b/pom.xml
@@ -141,7 +141,7 @@
 
 10.14.2.0
 1.13.1
-1.9.1
+1.9.2
 shaded-protobuf
 9.4.53.v20231009
 4.0.3





(spark) branch master updated: [SPARK-45687][CORE][SQL][ML][MLLIB][KUBERNETES][EXAMPLES][CONNECT][STRUCTURED STREAMING] Fix `Passing an explicit array value to a Scala varargs method is deprecated`

2023-11-10 Thread srowen
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 605aa0c299c [SPARK-45687][CORE][SQL][ML][MLLIB][KUBERNETES][EXAMPLES][CONNECT][STRUCTURED STREAMING] Fix `Passing an explicit array value to a Scala varargs method is deprecated`
605aa0c299c is described below

commit 605aa0c299c1d88f8a31ba888ac8e6b6203be6c5
Author: Tengfei Huang 
AuthorDate: Fri Nov 10 08:10:20 2023 -0600


[SPARK-45687][CORE][SQL][ML][MLLIB][KUBERNETES][EXAMPLES][CONNECT][STRUCTURED STREAMING] Fix `Passing an explicit array value to a Scala varargs method is deprecated`

### What changes were proposed in this pull request?
Fix the deprecated behavior below:
`Passing an explicit array value to a Scala varargs method is deprecated (since 2.13.0) and will result in a defensive copy; Use the more efficient non-copying ArraySeq.unsafeWrapArray or an explicit toIndexedSeq call`

For all the use cases, we don't need to make a copy of the array. Explicitly use `ArraySeq.unsafeWrapArray` to do the conversion.
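The pattern applied throughout the patch can be sketched on a toy varargs method (a hedged illustration, not code from the commit itself):

```scala
import scala.collection.immutable.ArraySeq

// A varargs parameter is typed as Seq under the hood.
def sum(xs: Int*): Int = xs.sum

val arr = Array(1, 2, 3)
// sum(arr: _*)  // deprecated since Scala 2.13: implicitly copies arr into an immutable Seq
// Wrapping instead of copying keeps the call warning-free and allocation-light:
sum(ArraySeq.unsafeWrapArray(arr): _*)  // 6
// Caveat: the wrapper shares arr's storage, so arr must not be mutated afterwards.
```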

### Why are the changes needed?
Eliminate compile warnings and no longer use deprecated scala APIs.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Pass GA.
Fixed all the warnings with the build: `mvn clean package -DskipTests -Pspark-ganglia-lgpl -Pkinesis-asl -Pdocker-integration-tests -Pyarn -Pkubernetes -Pkubernetes-integration-tests -Phive-thriftserver -Phadoop-cloud`

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #43642 from ivoson/SPARK-45687.

Authored-by: Tengfei Huang 
Signed-off-by: Sean Owen 
---
 .../scala/org/apache/spark/sql/KeyValueGroupedDataset.scala  |  9 ++---
 .../test/scala/org/apache/spark/sql/ColumnTestSuite.scala|  3 ++-
 .../apache/spark/sql/UserDefinedFunctionE2ETestSuite.scala   |  5 -
 .../spark/sql/connect/planner/SparkConnectPlanner.scala  |  3 ++-
 .../main/scala/org/apache/spark/api/python/PythonRDD.scala   |  3 ++-
 core/src/main/scala/org/apache/spark/executor/Executor.scala |  3 ++-
 core/src/main/scala/org/apache/spark/rdd/RDD.scala   |  3 ++-
 .../scala/org/apache/spark/examples/graphx/Analytics.scala   |  4 ++--
 .../scala/org/apache/spark/ml/classification/OneVsRest.scala |  3 ++-
 .../scala/org/apache/spark/ml/feature/FeatureHasher.scala|  4 +++-
 .../src/main/scala/org/apache/spark/ml/feature/Imputer.scala |  8 +---
 .../main/scala/org/apache/spark/ml/feature/Interaction.scala |  4 +++-
 .../main/scala/org/apache/spark/ml/feature/RFormula.scala|  6 --
 .../scala/org/apache/spark/ml/feature/VectorAssembler.scala  |  5 +++--
 mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala  |  3 ++-
 .../src/main/scala/org/apache/spark/ml/fpm/PrefixSpan.scala  |  3 ++-
 .../src/main/scala/org/apache/spark/ml/r/KSTestWrapper.scala |  3 ++-
 .../apache/spark/ml/regression/DecisionTreeRegressor.scala   |  3 ++-
 .../src/main/scala/org/apache/spark/ml/tree/treeModels.scala |  3 ++-
 .../src/main/scala/org/apache/spark/mllib/util/MLUtils.scala | 12 
 .../scala/org/apache/spark/ml/feature/ImputerSuite.scala | 12 
 .../apache/spark/ml/source/image/ImageFileFormatSuite.scala  |  3 ++-
 .../apache/spark/ml/stat/KolmogorovSmirnovTestSuite.scala|  3 ++-
 mllib/src/test/scala/org/apache/spark/ml/util/MLTest.scala   |  6 --
 .../deploy/k8s/features/DriverCommandFeatureStepSuite.scala  |  2 +-
 .../apache/spark/sql/catalyst/expressions/generators.scala   |  8 ++--
 .../sql/catalyst/expressions/UnsafeRowConverterSuite.scala   |  4 +++-
 .../scala/org/apache/spark/sql/DataFrameStatFunctions.scala  |  3 ++-
 .../scala/org/apache/spark/sql/KeyValueGroupedDataset.scala  |  8 ++--
 .../spark/sql/execution/datasources/jdbc/JDBCRDD.scala   |  2 +-
 .../org/apache/spark/sql/execution/stat/StatFunctions.scala  |  3 ++-
 .../apache/spark/sql/execution/streaming/OffsetSeqLog.scala  |  3 ++-
 .../streaming/continuous/ContinuousRateStreamSource.scala|  3 ++-
 .../src/test/scala/org/apache/spark/sql/DataFrameSuite.scala |  3 ++-
 .../src/test/scala/org/apache/spark/sql/DatasetSuite.scala   |  6 --
 .../src/test/scala/org/apache/spark/sql/GenTPCDSData.scala   |  3 ++-
 .../test/scala/org/apache/spark/sql/ParametersSuite.scala|  9 +
 .../spark/sql/connector/SimpleWritableDataSource.scala   |  4 +++-
 .../sql/execution/datasources/FileMetadataStructSuite.scala  |  3 ++-
 .../spark/sql/execution/datasources/csv/CSVBenchmark.scala   |  7 ---
 .../scala/org/apache/spark/sql/streaming/StreamSuite.scala   |  2 +-
 .../org/apache/spark/sql/streaming/StreamingQuerySuite.scala |  3 ++-
 .../org/apache/spark/sql/hive/thriftserver/CliSuite.scala|  3 ++-

(spark) branch master updated: [SPARK-45752][SQL] Simplify the code for check unreferenced CTE relations

2023-11-10 Thread beliefer
This is an automated email from the ASF dual-hosted git repository.

beliefer pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 6851cb96ec6 [SPARK-45752][SQL] Simplify the code for check unreferenced CTE relations
6851cb96ec6 is described below

commit 6851cb96ec651b25a8103f7681e8528ff7d625ff
Author: Jiaan Geng 
AuthorDate: Fri Nov 10 22:00:51 2023 +0800

[SPARK-45752][SQL] Simplify the code for check unreferenced CTE relations

### What changes were proposed in this pull request?
https://github.com/apache/spark/pull/43614 made unreferenced `CTE` relations checked by `CheckAnalysis0`.
This PR follows up https://github.com/apache/spark/pull/43614 to simplify the code for checking unreferenced CTE relations.
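The invariant the check preserves can be sketched as a query (a hedged illustration; `tab_non_exists` is deliberately absent, and `spark` stands for an active SparkSession):

```scala
// Even though CTEs c and d are never referenced by the final SELECT,
// analysis must still visit them and report TABLE_OR_VIEW_NOT_FOUND,
// checking leaf relations before the CTEs that reference them.
val query =
  """|WITH
     |a AS (SELECT 1 AS id),
     |b AS (SELECT * FROM a),
     |c AS (SELECT * FROM tab_non_exists),
     |d AS (SELECT * FROM c)
     |SELECT 2""".stripMargin
// spark.sql(query)  // throws AnalysisException: tab_non_exists not found
```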

### Why are the changes needed?
Simplify the code for checking unreferenced CTE relations.

### Does this PR introduce _any_ user-facing change?
'No'.

### How was this patch tested?
Existing test cases.

### Was this patch authored or co-authored using generative AI tooling?
'No'.

Closes #43727 from beliefer/SPARK-45752_followup.

Authored-by: Jiaan Geng 
Signed-off-by: Jiaan Geng 
---
 .../spark/sql/catalyst/analysis/CheckAnalysis.scala| 12 
 .../scala/org/apache/spark/sql/CTEInlineSuite.scala| 18 --
 2 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
index 29d60ae0f41..f9010d47508 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
@@ -167,25 +167,21 @@ trait CheckAnalysis extends PredicateHelper with LookupCatalog with QueryErrorsB
 val inlineCTE = InlineCTE(alwaysInline = true)
 val cteMap = mutable.HashMap.empty[Long, (CTERelationDef, Int, mutable.Map[Long, Int])]
 inlineCTE.buildCTEMap(plan, cteMap)
-cteMap.values.foreach { case (relation, _, _) =>
+val visited: mutable.Map[Long, Boolean] = mutable.Map.empty.withDefaultValue(false)
+cteMap.foreach { case (cteId, (relation, refCount, _)) =>
   // If a CTE relation is never used, it will disappear after inline. Here we explicitly check
   // analysis for it, to make sure the entire query plan is valid.
   try {
// If a CTE relation ref count is 0, the other CTE relations that reference it
// should also be checked by checkAnalysis0. This code will also guarantee the leaf
// relations that do not reference any others are checked first.
-val visited: mutable.Map[Long, Boolean] = mutable.Map.empty.withDefaultValue(false)
-cteMap.foreach { case (cteId, _) =>
-  val (_, refCount, _) = cteMap(cteId)
-  if (refCount == 0) {
-checkUnreferencedCTERelations(cteMap, visited, cteId)
-  }
+if (refCount == 0) {
+  checkUnreferencedCTERelations(cteMap, visited, cteId)
 }
   } catch {
 case e: AnalysisException =>
   throw new ExtendedAnalysisException(e, relation.child)
   }
-
 }
 // Inline all CTEs in the plan to help check query plan structures in subqueries.
 var inlinedPlan: Option[LogicalPlan] = None
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/CTEInlineSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/CTEInlineSuite.scala
index 055c04992c0..a06b50d175f 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/CTEInlineSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/CTEInlineSuite.scala
@@ -683,11 +683,25 @@ abstract class CTEInlineSuiteBase
 val e = intercept[AnalysisException](sql(
   s"""
 |with
-|a as (select * from non_exist),
+|a as (select * from tab_non_exists),
 |b as (select * from a)
 |select 2
 |""".stripMargin))
-checkErrorTableNotFound(e, "`non_exist`", ExpectedContext("non_exist", 26, 34))
+checkErrorTableNotFound(e, "`tab_non_exists`", ExpectedContext("tab_non_exists", 26, 39))
+
+withTable("tab_exists") {
+  spark.sql("CREATE TABLE tab_exists(id INT) using parquet")
+  val e = intercept[AnalysisException](sql(
+s"""
+   |with
+   |a as (select * from tab_exists),
+   |b as (select * from a),
+   |c as (select * from tab_non_exists),
+   |d as (select * from c)
+   |select 2
+   |""".stripMargin))
+  checkErrorTableNotFound(e, "`tab_non_exists`", ExpectedContext("tab_non_exists", 83, 96))
+}
   }
 }
 



(spark) branch branch-3.4 updated: [SPARK-45878][SQL][TESTS] Fix ConcurrentModificationException in CliSuite

2023-11-10 Thread yao
This is an automated email from the ASF dual-hosted git repository.

yao pushed a commit to branch branch-3.4
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.4 by this push:
 new 61868525785 [SPARK-45878][SQL][TESTS] Fix ConcurrentModificationException in CliSuite
61868525785 is described below

commit 618685257859d5a34b50e22905c73f639bd2cc30
Author: Kent Yao 
AuthorDate: Fri Nov 10 21:09:43 2023 +0800

[SPARK-45878][SQL][TESTS] Fix ConcurrentModificationException in CliSuite

### What changes were proposed in this pull request?

This PR changes the ArrayBuffer of logs to an immutable sequence for reading, to prevent the ConcurrentModificationException that hides the actual cause of the failure.

### Why are the changes needed?

```scala
[info] - SPARK-29022 Commands using SerDe provided in ADD JAR sql *** FAILED *** (11 seconds, 105 milliseconds)
[info]   java.util.ConcurrentModificationException: mutation occurred during iteration
[info]   at scala.collection.mutable.MutationTracker$.checkMutations(MutationTracker.scala:43)
[info]   at scala.collection.mutable.CheckedIndexedSeqView$CheckedIterator.hasNext(CheckedIndexedSeqView.scala:47)
[info]   at scala.collection.IterableOnceOps.addString(IterableOnce.scala:1247)
[info]   at scala.collection.IterableOnceOps.addString$(IterableOnce.scala:1241)
[info]   at scala.collection.AbstractIterable.addString(Iterable.scala:933)
[info]   at org.apache.spark.sql.hive.thriftserver.CliSuite.runCliWithin(CliSuite.scala:205)
[info]   at org.apache.spark.sql.hive.thriftserver.CliSuite.$anonfun$new$20(CliSuite.scala:501)
```
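The root cause is ordinary mutation-during-iteration: Scala 2.13 mutable collections fail fast when mutated while an iterator is live. A minimal Scala sketch (hedged; the names are illustrative, not the suite's actual fields) of the failure mode and the snapshot-style fix:

```scala
import scala.collection.mutable.ArrayBuffer

val logBuffer = ArrayBuffer("line1", "line2")

// Failure mode: another thread appends while mkString iterates the live buffer.
val it = logBuffer.iterator
logBuffer += "line3"
// it.next()  // throws java.util.ConcurrentModificationException in Scala 2.13

// Fix, in the spirit of the patch: materialize a stable immutable copy under
// the writer's lock, then build the failure message from that snapshot.
val lock = new Object
val message = lock.synchronized(logBuffer.toIndexedSeq).mkString("\n")
```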

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

existing tests

### Was this patch authored or co-authored using generative AI tooling?

no

Closes #43749 from yaooqinn/SPARK-45878.

Authored-by: Kent Yao 
Signed-off-by: Kent Yao 
(cherry picked from commit b347237735094e9092f4100583ed1d6f3eacf1f6)
Signed-off-by: Kent Yao 
---
 .../test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala   | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala b/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala
index 1b91442c228..d3721cf68ab 100644
--- a/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala
+++ b/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala
@@ -198,7 +198,7 @@ class CliSuite extends SparkFunSuite {
   ThreadUtils.awaitResult(foundAllExpectedAnswers.future, timeoutForQuery)
   log.info("Found all expected output.")
 } catch { case cause: Throwable =>
-  val message =
+  val message = lock.synchronized {
 s"""
|===
|CliSuite failure output
@@ -212,6 +212,7 @@ class CliSuite extends SparkFunSuite {
|End CliSuite failure output
|===
  """.stripMargin
+  }
   logError(message, cause)
   fail(message, cause)
 } finally {





(spark) branch master updated (49ca6aa6cb7 -> b3472377350)

2023-11-10 Thread yao
This is an automated email from the ASF dual-hosted git repository.

yao pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


 from 49ca6aa6cb7 [MINOR][SQL] Pass `cause` in `CannotReplaceMissingTableException` constructor
 add b3472377350 [SPARK-45878][SQL][TESTS] Fix ConcurrentModificationException in CliSuite

No new revisions were added by this update.

Summary of changes:
 .../test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala   | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)





(spark) branch branch-3.5 updated: [SPARK-45878][SQL][TESTS] Fix ConcurrentModificationException in CliSuite

2023-11-10 Thread yao
This is an automated email from the ASF dual-hosted git repository.

yao pushed a commit to branch branch-3.5
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.5 by this push:
 new 0b68e1700f6 [SPARK-45878][SQL][TESTS] Fix ConcurrentModificationException in CliSuite
0b68e1700f6 is described below

commit 0b68e1700f60ad1a32f066c10a0f76bea893b7ce
Author: Kent Yao 
AuthorDate: Fri Nov 10 21:09:43 2023 +0800

[SPARK-45878][SQL][TESTS] Fix ConcurrentModificationException in CliSuite

### What changes were proposed in this pull request?

This PR changes the ArrayBuffer for logs to immutable for reading, to prevent a ConcurrentModificationException which hides the actual cause of failure.

### Why are the changes needed?

```scala
[info] - SPARK-29022 Commands using SerDe provided in ADD JAR sql *** FAILED *** (11 seconds, 105 milliseconds)
[info]   java.util.ConcurrentModificationException: mutation occurred during iteration
[info]   at scala.collection.mutable.MutationTracker$.checkMutations(MutationTracker.scala:43)
[info]   at scala.collection.mutable.CheckedIndexedSeqView$CheckedIterator.hasNext(CheckedIndexedSeqView.scala:47)
[info]   at scala.collection.IterableOnceOps.addString(IterableOnce.scala:1247)
[info]   at scala.collection.IterableOnceOps.addString$(IterableOnce.scala:1241)
[info]   at scala.collection.AbstractIterable.addString(Iterable.scala:933)
[info]   at org.apache.spark.sql.hive.thriftserver.CliSuite.runCliWithin(CliSuite.scala:205)
[info]   at org.apache.spark.sql.hive.thriftserver.CliSuite.$anonfun$new$20(CliSuite.scala:501)
```

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

existing tests

### Was this patch authored or co-authored using generative AI tooling?

no

Closes #43749 from yaooqinn/SPARK-45878.

Authored-by: Kent Yao 
Signed-off-by: Kent Yao 
(cherry picked from commit b347237735094e9092f4100583ed1d6f3eacf1f6)
Signed-off-by: Kent Yao 
---
 .../test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala   | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala b/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala
index 343b32e6227..38dcd1d8b00 100644
--- a/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala
+++ b/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala
@@ -193,7 +193,7 @@ class CliSuite extends SparkFunSuite {
   ThreadUtils.awaitResult(foundAllExpectedAnswers.future, timeoutForQuery)
   log.info("Found all expected output.")
 } catch { case cause: Throwable =>
-  val message =
+  val message = lock.synchronized {
 s"""
|===
|CliSuite failure output
@@ -207,6 +207,7 @@ class CliSuite extends SparkFunSuite {
|End CliSuite failure output
|===
  """.stripMargin
+  }
   logError(message, cause)
   fail(message, cause)
 } finally {
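The fix above wraps the read of the shared log buffer in `lock.synchronized` so the failure message is built from a consistent view. The same snapshot-under-lock pattern can be sketched in Python (hypothetical names, not Spark code — a minimal analogue, assuming one writer thread appends while the failure handler reads):

```python
import threading

lock = threading.Lock()
log_lines = []  # shared buffer, appended to by a background capture thread


def append_line(line: str) -> None:
    # Writer side: every mutation happens under the lock.
    with lock:
        log_lines.append(line)


def failure_message() -> str:
    # Reader side: snapshot the buffer under the lock before formatting,
    # so a concurrent append cannot invalidate the iteration (the CliSuite
    # failure mode fixed above).
    with lock:
        snapshot = tuple(log_lines)
    return "\n".join(snapshot)


append_line("spark-sql> select 1;")
append_line("1")
print(failure_message())
```

The key point is that the join/format step operates on the immutable snapshot, never on the live buffer.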





(spark) branch master updated: [MINOR][SQL] Pass `cause` in `CannotReplaceMissingTableException` constructor

2023-11-10 Thread maxgekk
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 49ca6aa6cb7 [MINOR][SQL] Pass `cause` in `CannotReplaceMissingTableException` constructor
49ca6aa6cb7 is described below

commit 49ca6aa6cb75b931d1c38dcffb4cd3dd63b0a2f3
Author: Max Gekk 
AuthorDate: Fri Nov 10 12:17:09 2023 +0300

[MINOR][SQL] Pass `cause` in `CannotReplaceMissingTableException` constructor

### What changes were proposed in this pull request?
In the PR, I propose to use the `cause` argument in the `CannotReplaceMissingTableException` constructor.

### Why are the changes needed?
To improve user experience with Spark SQL while troubleshooting issues. Currently, users don't see where the exception comes from.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Manually.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #43738 from MaxGekk/fix-missed-cause.

Authored-by: Max Gekk 
Signed-off-by: Max Gekk 
---
 .../sql/catalyst/analysis/CannotReplaceMissingTableException.scala | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CannotReplaceMissingTableException.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CannotReplaceMissingTableException.scala
index 910bb9d3749..032cdca12c0 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CannotReplaceMissingTableException.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CannotReplaceMissingTableException.scala
@@ -28,4 +28,5 @@ class CannotReplaceMissingTableException(
   extends AnalysisException(
   errorClass = "TABLE_OR_VIEW_NOT_FOUND",
   messageParameters = Map("relationName"
--> quoteNameParts(tableIdentifier.namespace :+ tableIdentifier.name)))
+-> quoteNameParts(tableIdentifier.namespace :+ tableIdentifier.name)),
+  cause = cause)





(spark) branch master updated (f5b1b8306cf -> bd526986a5f)

2023-11-10 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from f5b1b8306cf [SPARK-45562][SQL] XML: Add SQL error class for missing rowTag option
 add bd526986a5f [SPARK-45837][CONNECT] Improve logging information in handling retries

No new revisions were added by this update.

Summary of changes:
 .../sql/connect/client/ExecutePlanResponseReattachableIterator.scala  | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)





(spark) branch master updated: [SPARK-45562][SQL] XML: Add SQL error class for missing rowTag option

2023-11-10 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new f5b1b8306cf [SPARK-45562][SQL] XML: Add SQL error class for missing rowTag option
f5b1b8306cf is described below

commit f5b1b8306cf13218f5ff79944aaa9c0b4e74fda4
Author: Sandip Agarwala <131817656+sandip...@users.noreply.github.com>
AuthorDate: Fri Nov 10 17:44:39 2023 +0900

[SPARK-45562][SQL] XML: Add SQL error class for missing rowTag option

### What changes were proposed in this pull request?
rowTag option is required for reading XML files. This PR adds a SQL error class for missing rowTag option.

### Why are the changes needed?
rowTag option is required for reading XML files. This PR adds a SQL error class for missing rowTag option.
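The validation replaces a generic `require(...)` failure with a structured error carrying an error class and SQLSTATE. The shape of that pattern can be sketched as follows (hypothetical names, not Spark's actual option parser):

```python
class MissingOptionError(ValueError):
    """Sketch of a structured error: names the option, error class, SQLSTATE."""

    def __init__(self, option: str, error_class: str, sql_state: str):
        self.error_class = error_class
        self.sql_state = sql_state
        super().__init__(
            f"[{error_class}] '{option}' option is required for reading "
            f"files in XML format. SQLSTATE: {sql_state}"
        )


def parse_xml_options(options: dict, row_tag_required: bool = True) -> dict:
    # Mirror of the XmlOptions check: a required-but-missing rowTag raises
    # the structured error instead of a bare assertion failure.
    row_tag = (options.get("rowTag") or "").strip()
    if row_tag_required and not row_tag:
        raise MissingOptionError("rowTag", "XML_ROW_TAG_MISSING", "42000")
    return {"rowTag": row_tag}
```

A structured error class (rather than `require`) lets tooling match on `error_class` instead of parsing message text, which is the motivation for the `XML_ROW_TAG_MISSING` entry in `error-classes.json`.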

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Updated the unit test to check for error message.

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #43710 from sandip-db/xml-rowTagRequiredError.

Authored-by: Sandip Agarwala <131817656+sandip...@users.noreply.github.com>
Signed-off-by: Hyukjin Kwon 
---
 common/utils/src/main/resources/error/error-classes.json  |  6 ++
 docs/sql-error-conditions.md  |  6 ++
 .../scala/org/apache/spark/sql/catalyst/xml/XmlOptions.scala  |  8 ++--
 .../org/apache/spark/sql/errors/QueryCompilationErrors.scala  |  7 +++
 .../apache/spark/sql/execution/datasources/xml/XmlSuite.scala | 11 ---
 5 files changed, 33 insertions(+), 5 deletions(-)

diff --git a/common/utils/src/main/resources/error/error-classes.json b/common/utils/src/main/resources/error/error-classes.json
index 26f6c0240af..3b7a3a6006e 100644
--- a/common/utils/src/main/resources/error/error-classes.json
+++ b/common/utils/src/main/resources/error/error-classes.json
@@ -3911,6 +3911,12 @@
 },
 "sqlState" : "42605"
   },
+  "XML_ROW_TAG_MISSING" : {
+"message" : [
+  " option is required for reading files in XML format."
+],
+"sqlState" : "42000"
+  },
   "_LEGACY_ERROR_TEMP_0001" : {
 "message" : [
   "Invalid InsertIntoContext."
diff --git a/docs/sql-error-conditions.md b/docs/sql-error-conditions.md
index 2cb433b19fa..a811019e0a5 100644
--- a/docs/sql-error-conditions.md
+++ b/docs/sql-error-conditions.md
@@ -2369,3 +2369,9 @@ The operation `` requires a ``. But `` is a
 The `` requires `` parameters but the actual number is ``.
 
 For more details see [WRONG_NUM_ARGS](sql-error-conditions-wrong-num-args-error-class.html)
+
+### XML_ROW_TAG_MISSING
+
+[SQLSTATE: 
42000](sql-error-conditions-sqlstates.html#class-42-syntax-error-or-access-rule-violation)
+
+`` option is required for reading files in XML format.
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/xml/XmlOptions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/xml/XmlOptions.scala
index aac6eec21c6..8f6cdbf360e 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/xml/XmlOptions.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/xml/XmlOptions.scala
@@ -24,7 +24,7 @@ import javax.xml.stream.XMLInputFactory
 import org.apache.spark.internal.Logging
 import org.apache.spark.sql.catalyst.{DataSourceOptions, FileSourceOptions}
 import org.apache.spark.sql.catalyst.util.{CaseInsensitiveMap, CompressionCodecs, DateFormatter, DateTimeUtils, ParseMode, PermissiveMode}
-import org.apache.spark.sql.errors.QueryExecutionErrors
+import org.apache.spark.sql.errors.{QueryCompilationErrors, QueryExecutionErrors}
 import org.apache.spark.sql.internal.{LegacyBehaviorPolicy, SQLConf}
 
 /**
@@ -66,7 +66,11 @@ private[sql] class XmlOptions(
 
   val compressionCodec = parameters.get(COMPRESSION).map(CompressionCodecs.getCodecClassName)
   val rowTagOpt = parameters.get(XmlOptions.ROW_TAG).map(_.trim)
-  require(!rowTagRequired || rowTagOpt.isDefined, s"'${XmlOptions.ROW_TAG}' option is required.")
+
+  if (rowTagRequired && rowTagOpt.isEmpty) {
+throw QueryCompilationErrors.xmlRowTagRequiredError(XmlOptions.ROW_TAG)
+  }
+
   val rowTag = rowTagOpt.getOrElse(XmlOptions.DEFAULT_ROW_TAG)
   require(rowTag.nonEmpty, s"'$ROW_TAG' option should not be an empty string.")
   require(!rowTag.startsWith("<") && !rowTag.endsWith(">"),
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
index 0c5dcb1ead0..e772b3497ac 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
@@ -3817,4 +3817,11 

(spark) branch master updated: [SPARK-45852][CONNECT][PYTHON] Gracefully deal with recursion error during logging

2023-11-10 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 9bac48d4bd6 [SPARK-45852][CONNECT][PYTHON] Gracefully deal with recursion error during logging
9bac48d4bd6 is described below

commit 9bac48d4bd68d4f0d54c53c29a27b1f6e02c5f61
Author: Martin Grund 
AuthorDate: Fri Nov 10 17:12:25 2023 +0900

[SPARK-45852][CONNECT][PYTHON] Gracefully deal with recursion error during logging

### What changes were proposed in this pull request?
The Python client for Spark Connect logs the text representation of the proto message. However, for deeply nested objects this can lead to a Python recursion error even before the maximum nested recursion limit of the gRPC message is reached.

This patch fixes the issue by explicitly catching the recursion error during text conversion.
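The underlying failure mode is easy to reproduce without Spark or protobuf: CPython's `repr` of a deeply nested container raises `RecursionError` once the structure exceeds the interpreter's recursion limit. A minimal sketch of the fix's fallback pattern (hypothetical helper name, plain lists standing in for proto messages):

```python
def safe_repr(obj) -> str:
    # Mirror of the patch: fall back to a placeholder instead of crashing
    # when the object is too deeply nested for text conversion.
    try:
        return repr(obj)
    except RecursionError:
        return "<message too deeply nested for text conversion>"


# Build a 10_000-deep nested list *iteratively* (no recursion while building);
# only repr() recurses and trips the default recursion limit.
nested = []
for _ in range(10_000):
    nested = [nested]

print(safe_repr([1, 2, 3]))  # shallow objects render normally
print(safe_repr(nested))     # deep nesting hits RecursionError inside repr()
```

Catching only `RecursionError` (rather than a bare `except`) keeps other serialization bugs visible while making the logging path itself safe.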

### Why are the changes needed?
Stability

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
UT

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #43732 from grundprinzip/SPARK-45852.

Authored-by: Martin Grund 
Signed-off-by: Hyukjin Kwon 
---
 python/pyspark/sql/connect/client/core.py  |  5 -
 python/pyspark/sql/tests/connect/test_connect_basic.py | 13 +
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/python/pyspark/sql/connect/client/core.py b/python/pyspark/sql/connect/client/core.py
index 965c4107cac..7eafcc501f5 100644
--- a/python/pyspark/sql/connect/client/core.py
+++ b/python/pyspark/sql/connect/client/core.py
@@ -935,7 +935,10 @@ class SparkConnectClient(object):
 ---
 Single line string of the serialized proto message.
 """
-return text_format.MessageToString(p, as_one_line=True)
+try:
+return text_format.MessageToString(p, as_one_line=True)
+except RecursionError:
+return ""
 
 def schema(self, plan: pb2.Plan) -> StructType:
 """
diff --git a/python/pyspark/sql/tests/connect/test_connect_basic.py b/python/pyspark/sql/tests/connect/test_connect_basic.py
index daf6772e52b..7a224d68219 100755
--- a/python/pyspark/sql/tests/connect/test_connect_basic.py
+++ b/python/pyspark/sql/tests/connect/test_connect_basic.py
@@ -159,6 +159,19 @@ class SparkConnectSQLTestCase(ReusedConnectTestCase, SQLTestUtils, PandasOnSpark
 
 
 class SparkConnectBasicTests(SparkConnectSQLTestCase):
+def test_recursion_handling_for_plan_logging(self):
+"""SPARK-45852 - Test that we can handle recursion in plan logging."""
+cdf = self.connect.range(1)
+for x in range(400):
+cdf = cdf.withColumn(f"col_{x}", CF.lit(x))
+
+# Calling schema will trigger logging the message that will in turn trigger the message
+# conversion into protobuf that will then trigger the recursion error.
+self.assertIsNotNone(cdf.schema)
+
+result = self.connect._client._proto_to_string(cdf._plan.to_proto(self.connect._client))
+self.assertIn("recursion", result)
+
 def test_df_getattr_behavior(self):
 cdf = self.connect.range(10)
 sdf = self.spark.range(10)

