[spark] branch branch-3.5 updated: [SPARK-45291][SQL][REST] Use unknown query execution id instead of no such app when id is invalid
This is an automated email from the ASF dual-hosted git repository.

yao pushed a commit to branch branch-3.5
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.5 by this push:
     new 94661758c30 [SPARK-45291][SQL][REST] Use unknown query execution id instead of no such app when id is invalid
94661758c30 is described below

commit 94661758c3072a279a29d0c493ce419af0414d3a
Author: Kent Yao
AuthorDate: Mon Sep 25 14:23:46 2023 +0800

    [SPARK-45291][SQL][REST] Use unknown query execution id instead of no such app when id is invalid

    ### What changes were proposed in this pull request?

    This PR fixes the `/api/v1/applications/{appId}/sql/{executionId}` API when the executionId is invalid. Before this change, we get `no such app: $appId`; after it, we get `unknown query execution id: $executionId`.

    ### Why are the changes needed?

    Bugfix.

    ### Does this PR introduce _any_ user-facing change?

    No, bugfix.

    ### How was this patch tested?

    New test.

    ### Was this patch authored or co-authored using generative AI tooling?

    No.

    Closes #43073 from yaooqinn/SPARK-45291.

    Authored-by: Kent Yao
    Signed-off-by: Kent Yao
    (cherry picked from commit 5d422155f1dae09f1631375d09e2f3c8dffba9a5)
    Signed-off-by: Kent Yao
---
 .../scala/org/apache/spark/status/api/v1/sql/SqlResource.scala | 3 +--
 .../status/api/v1/sql/SqlResourceWithActualMetricsSuite.scala  | 9 +
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/status/api/v1/sql/SqlResource.scala b/sql/core/src/main/scala/org/apache/spark/status/api/v1/sql/SqlResource.scala
index 3c96f612da6..fa5bea5f9bb 100644
--- a/sql/core/src/main/scala/org/apache/spark/status/api/v1/sql/SqlResource.scala
+++ b/sql/core/src/main/scala/org/apache/spark/status/api/v1/sql/SqlResource.scala
@@ -56,10 +56,9 @@ private[v1] class SqlResource extends BaseAppResource {
       planDescription: Boolean): ExecutionData = {
     withUI { ui =>
       val sqlStore = new SQLAppStatusStore(ui.store.store)
-      val graph = sqlStore.planGraph(execId)
       sqlStore
         .execution(execId)
-        .map(prepareExecutionData(_, graph, details, planDescription))
+        .map(prepareExecutionData(_, sqlStore.planGraph(execId), details, planDescription))
         .getOrElse(throw new NotFoundException("unknown query execution id: " + execId))
     }
   }
diff --git a/sql/core/src/test/scala/org/apache/spark/status/api/v1/sql/SqlResourceWithActualMetricsSuite.scala b/sql/core/src/test/scala/org/apache/spark/status/api/v1/sql/SqlResourceWithActualMetricsSuite.scala
index 658f79fc289..c63c748953f 100644
--- a/sql/core/src/test/scala/org/apache/spark/status/api/v1/sql/SqlResourceWithActualMetricsSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/status/api/v1/sql/SqlResourceWithActualMetricsSuite.scala
@@ -19,6 +19,7 @@ package org.apache.spark.status.api.v1.sql
 
 import java.net.URL
 import java.text.SimpleDateFormat
+import javax.servlet.http.HttpServletResponse
 
 import org.json4s.DefaultFormats
 import org.json4s.jackson.JsonMethods
@@ -148,4 +149,12 @@ class SqlResourceWithActualMetricsSuite
     }
   }
 
+  test("SPARK-45291: Use unknown query execution id instead of no such app when id is invalid") {
+    val url = new URL(spark.sparkContext.ui.get.webUrl +
+      s"/api/v1/applications/${spark.sparkContext.applicationId}/sql/${Long.MaxValue}")
+    val (code, resultOpt, error) = getContentAndCode(url)
+    assert(code === HttpServletResponse.SC_NOT_FOUND)
+    assert(resultOpt.isEmpty)
+    assert(error.get === s"unknown query execution id: ${Long.MaxValue}")
+  }
 }
[spark] branch master updated: [SPARK-45291][SQL][REST] Use unknown query execution id instead of no such app when id is invalid
This is an automated email from the ASF dual-hosted git repository.

yao pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 5d422155f1d [SPARK-45291][SQL][REST] Use unknown query execution id instead of no such app when id is invalid
5d422155f1d is described below

commit 5d422155f1dae09f1631375d09e2f3c8dffba9a5
Author: Kent Yao
AuthorDate: Mon Sep 25 14:23:46 2023 +0800

    [SPARK-45291][SQL][REST] Use unknown query execution id instead of no such app when id is invalid

    ### What changes were proposed in this pull request?

    This PR fixes the `/api/v1/applications/{appId}/sql/{executionId}` API when the executionId is invalid. Before this change, we get `no such app: $appId`; after it, we get `unknown query execution id: $executionId`.

    ### Why are the changes needed?

    Bugfix.

    ### Does this PR introduce _any_ user-facing change?

    No, bugfix.

    ### How was this patch tested?

    New test.

    ### Was this patch authored or co-authored using generative AI tooling?

    No.

    Closes #43073 from yaooqinn/SPARK-45291.

    Authored-by: Kent Yao
    Signed-off-by: Kent Yao
---
 .../scala/org/apache/spark/status/api/v1/sql/SqlResource.scala | 3 +--
 .../status/api/v1/sql/SqlResourceWithActualMetricsSuite.scala  | 9 +
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/status/api/v1/sql/SqlResource.scala b/sql/core/src/main/scala/org/apache/spark/status/api/v1/sql/SqlResource.scala
index 3c96f612da6..fa5bea5f9bb 100644
--- a/sql/core/src/main/scala/org/apache/spark/status/api/v1/sql/SqlResource.scala
+++ b/sql/core/src/main/scala/org/apache/spark/status/api/v1/sql/SqlResource.scala
@@ -56,10 +56,9 @@ private[v1] class SqlResource extends BaseAppResource {
       planDescription: Boolean): ExecutionData = {
     withUI { ui =>
       val sqlStore = new SQLAppStatusStore(ui.store.store)
-      val graph = sqlStore.planGraph(execId)
       sqlStore
         .execution(execId)
-        .map(prepareExecutionData(_, graph, details, planDescription))
+        .map(prepareExecutionData(_, sqlStore.planGraph(execId), details, planDescription))
         .getOrElse(throw new NotFoundException("unknown query execution id: " + execId))
     }
   }
diff --git a/sql/core/src/test/scala/org/apache/spark/status/api/v1/sql/SqlResourceWithActualMetricsSuite.scala b/sql/core/src/test/scala/org/apache/spark/status/api/v1/sql/SqlResourceWithActualMetricsSuite.scala
index 658f79fc289..c63c748953f 100644
--- a/sql/core/src/test/scala/org/apache/spark/status/api/v1/sql/SqlResourceWithActualMetricsSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/status/api/v1/sql/SqlResourceWithActualMetricsSuite.scala
@@ -19,6 +19,7 @@ package org.apache.spark.status.api.v1.sql
 
 import java.net.URL
 import java.text.SimpleDateFormat
+import javax.servlet.http.HttpServletResponse
 
 import org.json4s.DefaultFormats
 import org.json4s.jackson.JsonMethods
@@ -148,4 +149,12 @@ class SqlResourceWithActualMetricsSuite
     }
   }
 
+  test("SPARK-45291: Use unknown query execution id instead of no such app when id is invalid") {
+    val url = new URL(spark.sparkContext.ui.get.webUrl +
+      s"/api/v1/applications/${spark.sparkContext.applicationId}/sql/${Long.MaxValue}")
+    val (code, resultOpt, error) = getContentAndCode(url)
+    assert(code === HttpServletResponse.SC_NOT_FOUND)
+    assert(resultOpt.isEmpty)
+    assert(error.get === s"unknown query execution id: ${Long.MaxValue}")
+  }
 }
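Outside the Scala test above, the corrected endpoint can also be probed by hand. The following is an illustrative sketch only, not part of the commit: it assumes a running Spark application with its UI on the default http://localhost:4040, and the application id below is a placeholder (read the real one from `/api/v1/applications`).

```python
# Sketch: probe the SQL REST API with an execution id that cannot exist and
# confirm the 404 message introduced by SPARK-45291. Standard library only.
import json
import urllib.error
import urllib.request

app_id = "local-1695600000000"  # hypothetical application id
url = f"http://localhost:4040/api/v1/applications/{app_id}/sql/{2**63 - 1}"

try:
    with urllib.request.urlopen(url) as resp:
        print(json.load(resp))
except urllib.error.HTTPError as e:
    # After the fix: 404 with "unknown query execution id: ...";
    # before the fix it incorrectly reported "no such app: ...".
    print(e.code, e.read().decode())
```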
[spark] branch master updated: [SPARK-42617][PS] Support `isocalendar` from the pandas 2.0.0
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new fb2bee37c96 [SPARK-42617][PS] Support `isocalendar` from the pandas 2.0.0
fb2bee37c96 is described below

commit fb2bee37c964bf2164fc89a0a55085dd0c840b56
Author: zhyhimont
AuthorDate: Mon Sep 25 15:22:32 2023 +0900

    [SPARK-42617][PS] Support `isocalendar` from the pandas 2.0.0

    ### What changes were proposed in this pull request?

    Support `isocalendar` from the pandas 2.0.0.

    ### Why are the changes needed?

    When pandas 2.0.0 is released, we should match the behavior in pandas API on Spark.

    ### Does this PR introduce _any_ user-facing change?

    Added the new method `DatetimeIndex.isocalendar` and removed the two deprecated properties `DatetimeIndex.week` and `DatetimeIndex.weekofyear`:

    ```
    dfs = ps.from_pandas(pd.date_range(start='2019-12-29', freq='D', periods=4).to_series())
    dfs.dt.isocalendar()
                year  week  day
    2019-12-29  2019    52    7
    2019-12-30  2020     1    1
    2019-12-31  2020     1    2
    2020-01-01  2020     1    3

    dfs.dt.isocalendar().week
    2019-12-29    52
    2019-12-30     1
    2019-12-31     1
    2020-01-01     1
    ```

    ### How was this patch tested?

    UT was updated.

    Closes #40420 from dzhigimont/SPARK-42617_ZH.

    Lead-authored-by: zhyhimont
    Co-authored-by: Zhyhimont Dmitry
    Co-authored-by: Dmitry Zhyhimont
    Co-authored-by: Zhyhimont Dmitry
    Signed-off-by: Hyukjin Kwon
---
 .../source/reference/pyspark.pandas/indexing.rst    |  3 +-
 .../source/reference/pyspark.pandas/series.rst      |  3 +-
 python/pyspark/pandas/datetimes.py                  | 70 --
 python/pyspark/pandas/indexes/base.py               |  4 +-
 python/pyspark/pandas/indexes/datetimes.py          | 49 +--
 python/pyspark/pandas/namespace.py                  |  3 +-
 .../pyspark/pandas/tests/indexes/test_datetime.py   | 28 ++---
 .../pandas/tests/indexes/test_datetime_property.py  | 19 +-
 .../pyspark/pandas/tests/test_series_datetime.py    | 17 +-
 9 files changed, 100 insertions(+), 96 deletions(-)

diff --git a/python/docs/source/reference/pyspark.pandas/indexing.rst b/python/docs/source/reference/pyspark.pandas/indexing.rst
index 70d463c052a..d6be57ee9c8 100644
--- a/python/docs/source/reference/pyspark.pandas/indexing.rst
+++ b/python/docs/source/reference/pyspark.pandas/indexing.rst
@@ -338,8 +338,7 @@ Time/date components
    DatetimeIndex.minute
    DatetimeIndex.second
    DatetimeIndex.microsecond
-   DatetimeIndex.week
-   DatetimeIndex.weekofyear
+   DatetimeIndex.isocalendar
    DatetimeIndex.dayofweek
    DatetimeIndex.day_of_week
    DatetimeIndex.weekday
diff --git a/python/docs/source/reference/pyspark.pandas/series.rst b/python/docs/source/reference/pyspark.pandas/series.rst
index 552acec096f..7b658d45d4b 100644
--- a/python/docs/source/reference/pyspark.pandas/series.rst
+++ b/python/docs/source/reference/pyspark.pandas/series.rst
@@ -313,8 +313,7 @@ Datetime Properties
    Series.dt.minute
    Series.dt.second
    Series.dt.microsecond
-   Series.dt.week
-   Series.dt.weekofyear
+   Series.dt.isocalendar
    Series.dt.dayofweek
    Series.dt.weekday
    Series.dt.dayofyear
diff --git a/python/pyspark/pandas/datetimes.py b/python/pyspark/pandas/datetimes.py
index b0649cf5761..4b6e23fae7a 100644
--- a/python/pyspark/pandas/datetimes.py
+++ b/python/pyspark/pandas/datetimes.py
@@ -18,7 +18,6 @@
 """
 Date/Time related functions on pandas-on-Spark Series
 """
-import warnings
 from typing import Any, Optional, Union, no_type_check
 
 import numpy as np
@@ -27,7 +26,9 @@ from pandas.tseries.offsets import DateOffset
 
 import pyspark.pandas as ps
 import pyspark.sql.functions as F
-from pyspark.sql.types import DateType, TimestampType, TimestampNTZType, LongType, IntegerType
+from pyspark.sql.types import DateType, TimestampType, TimestampNTZType, IntegerType
+from pyspark.pandas import DataFrame
+from pyspark.pandas.config import option_context
 
 
 class DatetimeMethods:
@@ -116,26 +117,59 @@ class DatetimeMethods:
     def nanosecond(self) -> "ps.Series":
         raise NotImplementedError()
 
-    # TODO(SPARK-42617): Support isocalendar.week and replace it.
-    # See also https://github.com/pandas-dev/pandas/pull/33595.
-    @property
-    def week(self) -> "ps.Series":
+    def isocalendar(self) -> "ps.DataFrame":
         """
-        The week ordinal of the year.
+        Calculate year, week, and day according to the ISO 8601 standard.
 
-        .. deprecated:: 3.4.0
-        """
-        warnings.warn(
-            "weekofyear and week have been deprecated.",
-            FutureWarning,
-        )
-
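The archived diff above is cut off at this point. As a usage note on migrating away from the removed properties, here is a minimal sketch (assuming pandas and pyspark.pandas are installed and a build containing this change; the data mirrors the example in the commit message):

```python
# Minimal usage sketch: replacing the removed `dt.week` with the new
# DataFrame-returning `dt.isocalendar()` accessor in pandas API on Spark.
import pandas as pd
import pyspark.pandas as ps

s = ps.from_pandas(pd.date_range(start="2019-12-29", freq="D", periods=4).to_series())

# Before this change: s.dt.week (removed).
# After: take the `week` column of the isocalendar frame.
week = s.dt.isocalendar().week
print(week.to_pandas())
```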
[GitHub] [spark-website] panbingkun commented on pull request #474: [SPARK-44820][DOCS] Switch languages consistently across docs for all code snippets
panbingkun commented on PR #474:
URL: https://github.com/apache/spark-website/pull/474#issuecomment-1732813770

   > @panbingkun yes let's update the spark website (this repo) to fix this UI issue for published docs.

   Okay, let me fix it.
[spark] branch master updated (f81f51467b8 -> bb0d287114f)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

    from f81f51467b8 [SPARK-45257][CORE][FOLLOWUP] Correct the from version in migration guide
     add bb0d287114f [SPARK-45294][PYTHON][DOCS] Use JDK 17 in Binder integration for PySpark live notebooks

No new revisions were added by this update.

Summary of changes:
 binder/apt.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
[spark] branch master updated: [SPARK-45257][CORE][FOLLOWUP] Correct the from version in migration guide
This is an automated email from the ASF dual-hosted git repository.

yao pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new f81f51467b8 [SPARK-45257][CORE][FOLLOWUP] Correct the from version in migration guide
f81f51467b8 is described below

commit f81f51467b85779086873860d5bac0d5429c9a29
Author: Cheng Pan
AuthorDate: Mon Sep 25 09:37:01 2023 +0800

    [SPARK-45257][CORE][FOLLOWUP] Correct the from version in migration guide

    ### What changes were proposed in this pull request?

    Correct the "Upgrading from" version in the core migration guide.

    ### Why are the changes needed?

    Address comments on https://github.com/apache/spark/commit/8d599972872225e336467700715b1d4771624efe#r128053622

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Review.

    ### Was this patch authored or co-authored using generative AI tooling?

    No.

    Closes #43072 from pan3793/SPARK-45257-followup.

    Authored-by: Cheng Pan
    Signed-off-by: Kent Yao
---
 docs/core-migration-guide.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/core-migration-guide.md b/docs/core-migration-guide.md
index 765c3494f66..2464d774240 100644
--- a/docs/core-migration-guide.md
+++ b/docs/core-migration-guide.md
@@ -22,7 +22,7 @@ license: |
 * Table of contents
 {:toc}
 
-## Upgrading from Core 3.4 to 4.0
+## Upgrading from Core 3.5 to 4.0
 
 - Since Spark 4.0, Spark will compress event logs. To restore the behavior before Spark 4.0, you can set `spark.eventLog.compress` to `false`.
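For readers applying the guide entry shown in the diff, opting out of event-log compression is a one-line configuration. A minimal PySpark sketch (the app name and log directory below are placeholders):

```python
# Sketch: restore the pre-4.0 behavior of uncompressed event logs,
# per the migration-guide entry above.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("event-log-demo")                      # hypothetical app name
    .config("spark.eventLog.enabled", "true")
    .config("spark.eventLog.dir", "/tmp/spark-events")  # directory must exist
    .config("spark.eventLog.compress", "false")     # restore pre-4.0 behavior
    .getOrCreate()
)
```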
[spark] branch master updated: [SPARK-45240][SQL][CONNECT] Implement Error Enrichment for Python Client
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 913991046c6 [SPARK-45240][SQL][CONNECT] Implement Error Enrichment for Python Client
913991046c6 is described below

commit 913991046c6d2b707eab64bd8ca874f9b9bb6581
Author: Yihong He
AuthorDate: Mon Sep 25 09:35:06 2023 +0900

    [SPARK-45240][SQL][CONNECT] Implement Error Enrichment for Python Client

    ### What changes were proposed in this pull request?

    - Implemented the reconstruction of the exception with un-truncated error messages and full server-side stacktrace (includes cause exceptions) based on the responses of FetchErrorDetails RPC.

    Examples: `./bin/pyspark --remote local`

    ```python
    >>> spark.sql("""select from_json('{"d": "02-29"}', 'd date', map('dateFormat', 'MM-dd'))""").collect()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/Users/yihonghe/Workspace/spark/python/pyspark/sql/connect/session.py", line 556, in sql
        data, properties = self.client.execute_command(cmd.command(self._client))
      File "/Users/yihonghe/Workspace/spark/python/pyspark/sql/connect/client/core.py", line 958, in execute_command
        data, _, _, _, properties = self._execute_and_fetch(req)
      File "/Users/yihonghe/Workspace/spark/python/pyspark/sql/connect/client/core.py", line 1259, in _execute_and_fetch
        for response in self._execute_and_fetch_as_iterator(req):
      File "/Users/yihonghe/Workspace/spark/python/pyspark/sql/connect/client/core.py", line 1240, in _execute_and_fetch_as_iterator
        self._handle_error(error)
      File "/Users/yihonghe/Workspace/spark/python/pyspark/sql/connect/client/core.py", line 1479, in _handle_error
        self._handle_rpc_error(error)
      File "/Users/yihonghe/Workspace/spark/python/pyspark/sql/connect/client/core.py", line 1533, in _handle_rpc_error
        raise convert_exception(
    pyspark.errors.exceptions.connect.SparkUpgradeException: [INCONSISTENT_BEHAVIOR_CROSS_VERSION.PARSE_DATETIME_BY_NEW_PARSER] You may get a different result due to the upgrading to Spark >= 3.0: Fail to parse '02-29' in the new parser. You can set "spark.sql.legacy.timeParserPolicy" to "LEGACY" to restore the behavior before Spark 3.0, or set to "CORRECTED" and treat it as an invalid datetime string.

    JVM stacktrace:
    org.apache.spark.SparkUpgradeException: [INCONSISTENT_BEHAVIOR_CROSS_VERSION.PARSE_DATETIME_BY_NEW_PARSER] You may get a different result due to the upgrading to Spark >= 3.0: Fail to parse '02-29' in the new parser. You can set "spark.sql.legacy.timeParserPolicy" to "LEGACY" to restore the behavior before Spark 3.0, or set to "CORRECTED" and treat it as an invalid datetime string.
        at org.apache.spark.sql.errors.ExecutionErrors.failToParseDateTimeInNewParserError(ExecutionErrors.scala:54)
        at org.apache.spark.sql.errors.ExecutionErrors.failToParseDateTimeInNewParserError$(ExecutionErrors.scala:48)
        at org.apache.spark.sql.errors.ExecutionErrors$.failToParseDateTimeInNewParserError(ExecutionErrors.scala:218)
        at org.apache.spark.sql.catalyst.util.DateTimeFormatterHelper$$anonfun$checkParsedDiff$1.applyOrElse(DateTimeFormatterHelper.scala:142)
        at org.apache.spark.sql.catalyst.util.DateTimeFormatterHelper$$anonfun$checkParsedDiff$1.applyOrElse(DateTimeFormatterHelper.scala:135)
        at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:35)
        at org.apache.spark.sql.catalyst.util.Iso8601DateFormatter.parse(DateFormatter.scala:59)
        at org.apache.spark.sql.catalyst.json.JacksonParser$$anonfun$$nestedInanonfun$makeConverter$11$1.applyOrElse(JacksonParser.scala:302)
        at org.apache.spark.sql.catalyst.json.JacksonParser$$anonfun$$nestedInanonfun$makeConverter$11$1.applyOrElse(JacksonParser.scala:299)
        at org.apache.spark.sql.catalyst.json.JacksonParser.parseJsonToken(JacksonParser.scala:404)
        at org.apache.spark.sql.catalyst.json.JacksonParser.$anonfun$makeConverter$11(JacksonParser.scala:299)
        at org.apache.spark.sql.catalyst.json.JacksonParser.org$apache$spark$sql$catalyst$json$JacksonParser$$convertObject(JacksonParser.scala:457)
        at org.apache.spark.sql.catalyst.json.JacksonParser$$anonfun$$nestedInanonfun$makeStructRootConverter$3$1.applyOrElse(JacksonParser.scala:123)
        at org.apache.spark.sql.catalyst.json.JacksonParser$$anonfun$$nestedInanonfun$makeStructRootConverter$3$1.applyOrElse(JacksonParser.scala:122)
        at org.apache.spark.sql.catalyst.json.JacksonParser.parseJsonToken(JacksonParser.scala:404)
        at org.apache.spark.sql.catalyst.json.JacksonParser.$anonfun$makeStructRootConverter$3(Jackson
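The archived message is truncated above. Independent of that, a minimal sketch of what this enrichment means for client code, assuming a Spark Connect session named `spark` (for example from `./bin/pyspark --remote local`) and `spark.sql.connect.enrichError.enabled` left enabled on the server:

```python
# Sketch: client-side handling of an enriched Spark Connect error.
from pyspark.errors.exceptions.connect import SparkUpgradeException

try:
    spark.sql(
        """select from_json('{"d": "02-29"}', 'd date', map('dateFormat', 'MM-dd'))"""
    ).collect()
except SparkUpgradeException as e:
    # With enrichment enabled, the message is the full, un-truncated
    # server-side message fetched via the FetchErrorDetails RPC.
    print(str(e).splitlines()[0])
```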
[spark] branch master updated: [SPARK-45207][SQL][CONNECT] Implement Error Enrichment for Scala Client
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 4863be5632f [SPARK-45207][SQL][CONNECT] Implement Error Enrichment for Scala Client
4863be5632f is described below

commit 4863be5632f3165a5699a525235ea118c1e1f7eb
Author: Yihong He
AuthorDate: Mon Sep 25 09:35:33 2023 +0900

    [SPARK-45207][SQL][CONNECT] Implement Error Enrichment for Scala Client

    ### What changes were proposed in this pull request?

    - Implemented the reconstruction of the complete exception (un-truncated error messages, cause exceptions, server-side stacktrace) based on the responses of FetchErrorDetails RPC.

    ### Why are the changes needed?

    - Cause exceptions play an important role in the current control flow, such as in StreamingQueryException. They are also valuable for debugging.
    - Un-truncated error messages are useful for debugging.
    - Providing server-side stack traces aids in effectively diagnosing server-related issues.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    - `build/sbt "connect-client-jvm/testOnly *ClientE2ETestSuite"`
    - `build/sbt "connect-client-jvm/testOnly *ClientStreamingQuerySuite"`

    ### Was this patch authored or co-authored using generative AI tooling?

    No.

    Closes #42987 from heyihong/SPARK-45207.

    Authored-by: Yihong He
    Signed-off-by: Hyukjin Kwon
---
 .../org/apache/spark/sql/ClientE2ETestSuite.scala  |  59 ++-
 .../sql/streaming/ClientStreamingQuerySuite.scala  |  41 -
 .../client/CustomSparkConnectBlockingStub.scala    |  44 -
 .../connect/client/GrpcExceptionConverter.scala    | 192 +
 4 files changed, 292 insertions(+), 44 deletions(-)

diff --git a/connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/ClientE2ETestSuite.scala b/connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/ClientE2ETestSuite.scala
index 21892542eab..ec9b1698a4e 100644
--- a/connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/ClientE2ETestSuite.scala
+++ b/connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/ClientE2ETestSuite.scala
@@ -18,6 +18,7 @@ package org.apache.spark.sql
 
 import java.io.{ByteArrayOutputStream, PrintStream}
 import java.nio.file.Files
+import java.time.DateTimeException
 import java.util.Properties
 
 import scala.collection.JavaConverters._
@@ -29,7 +30,7 @@ import org.apache.commons.lang3.{JavaVersion, SystemUtils}
 import org.scalactic.TolerantNumerics
 import org.scalatest.PrivateMethodTester
 
-import org.apache.spark.{SparkArithmeticException, SparkException}
+import org.apache.spark.{SparkArithmeticException, SparkException, SparkUpgradeException}
 import org.apache.spark.SparkBuildInfo.{spark_version => SPARK_VERSION}
 import org.apache.spark.sql.catalyst.analysis.{NamespaceAlreadyExistsException, NoSuchDatabaseException, NoSuchTableException, TableAlreadyExistsException, TempTableAlreadyExistsException}
 import org.apache.spark.sql.catalyst.encoders.AgnosticEncoders.StringEncoder
@@ -44,6 +45,62 @@ import org.apache.spark.sql.types._
 
 class ClientE2ETestSuite extends RemoteSparkSession with SQLHelper with PrivateMethodTester {
 
+  for (enrichErrorEnabled <- Seq(false, true)) {
+    test(s"cause exception - ${enrichErrorEnabled}") {
+      withSQLConf("spark.sql.connect.enrichError.enabled" -> enrichErrorEnabled.toString) {
+        val ex = intercept[SparkUpgradeException] {
+          spark
+            .sql("""
+              |select from_json(
+              |  '{"d": "02-29"}',
+              |  'd date',
+              |  map('dateFormat', 'MM-dd'))
+              |""".stripMargin)
+            .collect()
+        }
+        if (enrichErrorEnabled) {
+          assert(ex.getCause.isInstanceOf[DateTimeException])
+        } else {
+          assert(ex.getCause == null)
+        }
+      }
+    }
+  }
+
+  test(s"throw SparkException with large cause exception") {
+    withSQLConf("spark.sql.connect.enrichError.enabled" -> "true") {
+      val session = spark
+      import session.implicits._
+
+      val throwException =
+        udf((_: String) => throw new SparkException("test" * 1))
+
+      val ex = intercept[SparkException] {
+        Seq("1").toDS.withColumn("udf_val", throwException($"value")).collect()
+      }
+
+      assert(ex.getCause.isInstanceOf[SparkException])
+      assert(ex.getCause.getMessage.contains("test" * 1))
+    }
+  }
+
+  for (isServerStackTraceEnabled <- Seq(false, true)) {
+    test(s"server-side stack trace is set in exceptions - ${isServerStackTraceEnabled}") {
+      withSQLConf(
+        "spark.sql.connect.serverStacktrace.enabled" -> isServerStackTraceEnabled.toString,
+        "spark.s
[spark] branch master updated: [SPARK-45279][PYTHON][CONNECT] Attach plan_id for all logical plans
This is an automated email from the ASF dual-hosted git repository.

ruifengz pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 609552e19cf [SPARK-45279][PYTHON][CONNECT] Attach plan_id for all logical plans
609552e19cf is described below

commit 609552e19cfe75109b1b4641baadd79360e75443
Author: Ruifeng Zheng
AuthorDate: Mon Sep 25 08:17:08 2023 +0800

    [SPARK-45279][PYTHON][CONNECT] Attach plan_id for all logical plans

    ### What changes were proposed in this pull request?

    Attach plan_id for all logical plans, except `CachedRelation`.

    ### Why are the changes needed?

    1. All logical plans should contain their plan id in protos.
    2. Catalog plans also contain the plan id in the Scala client, e.g. https://github.com/apache/spark/blob/05f5dccbd34218c7d399228529853bdb1595f3a2/connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala#L63-L67 (the `newDataset` method sets the plan id).

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    CI.

    ### Was this patch authored or co-authored using generative AI tooling?

    No.

    Closes #43055 from zhengruifeng/connect_plan_id.

    Authored-by: Ruifeng Zheng
    Signed-off-by: Ruifeng Zheng
---
 python/pyspark/sql/connect/plan.py | 79 +++---
 1 file changed, 40 insertions(+), 39 deletions(-)

diff --git a/python/pyspark/sql/connect/plan.py b/python/pyspark/sql/connect/plan.py
index 219545cf646..6758b3673f3 100644
--- a/python/pyspark/sql/connect/plan.py
+++ b/python/pyspark/sql/connect/plan.py
@@ -1190,9 +1190,7 @@ class CollectMetrics(LogicalPlan):
 
     def plan(self, session: "SparkConnectClient") -> proto.Relation:
         assert self._child is not None
-
-        plan = proto.Relation()
-        plan.common.plan_id = self._child._plan_id
+        plan = self._create_proto_relation()
         plan.collect_metrics.input.CopyFrom(self._child.plan(session))
         plan.collect_metrics.name = self._name
         plan.collect_metrics.metrics.extend([self.col_to_expr(x, session) for x in self._exprs])
@@ -1689,7 +1687,9 @@ class CurrentDatabase(LogicalPlan):
         super().__init__(None)
 
     def plan(self, session: "SparkConnectClient") -> proto.Relation:
-        return proto.Relation(catalog=proto.Catalog(current_database=proto.CurrentDatabase()))
+        plan = self._create_proto_relation()
+        plan.catalog.current_database.SetInParent()
+        return plan
 
 
 class SetCurrentDatabase(LogicalPlan):
@@ -1698,7 +1698,7 @@ class SetCurrentDatabase(LogicalPlan):
         self._db_name = db_name
 
     def plan(self, session: "SparkConnectClient") -> proto.Relation:
-        plan = proto.Relation()
+        plan = self._create_proto_relation()
         plan.catalog.set_current_database.db_name = self._db_name
         return plan
 
@@ -1709,7 +1709,8 @@ class ListDatabases(LogicalPlan):
         self._pattern = pattern
 
     def plan(self, session: "SparkConnectClient") -> proto.Relation:
-        plan = proto.Relation(catalog=proto.Catalog(list_databases=proto.ListDatabases()))
+        plan = self._create_proto_relation()
+        plan.catalog.list_databases.SetInParent()
         if self._pattern is not None:
             plan.catalog.list_databases.pattern = self._pattern
         return plan
@@ -1722,7 +1723,8 @@ class ListTables(LogicalPlan):
         self._pattern = pattern
 
     def plan(self, session: "SparkConnectClient") -> proto.Relation:
-        plan = proto.Relation(catalog=proto.Catalog(list_tables=proto.ListTables()))
+        plan = self._create_proto_relation()
+        plan.catalog.list_tables.SetInParent()
         if self._db_name is not None:
             plan.catalog.list_tables.db_name = self._db_name
         if self._pattern is not None:
@@ -1737,7 +1739,8 @@ class ListFunctions(LogicalPlan):
         self._pattern = pattern
 
     def plan(self, session: "SparkConnectClient") -> proto.Relation:
-        plan = proto.Relation(catalog=proto.Catalog(list_functions=proto.ListFunctions()))
+        plan = self._create_proto_relation()
+        plan.catalog.list_functions.SetInParent()
         if self._db_name is not None:
             plan.catalog.list_functions.db_name = self._db_name
         if self._pattern is not None:
@@ -1752,7 +1755,7 @@ class ListColumns(LogicalPlan):
         self._db_name = db_name
 
     def plan(self, session: "SparkConnectClient") -> proto.Relation:
-        plan = proto.Relation(catalog=proto.Catalog(list_columns=proto.ListColumns()))
+        plan = self._create_proto_relation()
        plan.catalog.list_columns.table_name = self._table_name
         if self._db_name is not None:
             plan.catalog.list_columns.db_name = self._db_name
@@ -1765,7 +1768,7 @@ class GetDatabase(Logica
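The archived diff is truncated above. The mechanical core of the refactor is that every subclass now builds its relation proto through one base-class helper, which stamps the plan id as a side effect, so no subclass can forget it. A simplified, self-contained sketch of that pattern follows; the classes are illustrative stand-ins, not the real `pyspark.sql.connect.plan` or proto types:

```python
# Sketch of the "one factory stamps the plan id" pattern applied here.
import itertools
from dataclasses import dataclass, field


@dataclass
class Relation:                  # stand-in for proto.Relation
    plan_id: int = -1
    fields: dict = field(default_factory=dict)


class LogicalPlan:
    _id_counter = itertools.count()

    def __init__(self) -> None:
        self._plan_id = next(LogicalPlan._id_counter)

    def _create_proto_relation(self) -> Relation:
        # Every plan builds its proto through this helper, mirroring
        # `plan = self._create_proto_relation()` in the commit.
        return Relation(plan_id=self._plan_id)


class ListDatabases(LogicalPlan):
    def __init__(self, pattern=None):
        super().__init__()
        self._pattern = pattern

    def plan(self) -> Relation:
        rel = self._create_proto_relation()  # plan_id attached here
        rel.fields["list_databases"] = {"pattern": self._pattern}
        return rel


print(ListDatabases("db*").plan())  # Relation(plan_id=0, fields={...})
```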
[spark] branch branch-3.3 updated: [SPARK-45286][DOCS] Add back Matomo analytics
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch branch-3.3
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.3 by this push:
     new 9a28200f6e4 [SPARK-45286][DOCS] Add back Matomo analytics
9a28200f6e4 is described below

commit 9a28200f6e461c4929dd6e05b6dd55fe984c0924
Author: Sean Owen
AuthorDate: Sun Sep 24 14:17:55 2023 -0500

    [SPARK-45286][DOCS] Add back Matomo analytics

    ### What changes were proposed in this pull request?

    Add analytics to doc pages using the ASF's Matomo service.

    ### Why are the changes needed?

    We had previously removed Google Analytics from the website and release docs, per ASF policy: https://github.com/apache/spark/pull/36310

    We just restored analytics using the ASF-hosted Matomo service on the website: https://github.com/apache/spark-website/commit/a1548627b48a62c2e51870d1488ca3e09397bd30

    This change would put the same new tracking code back into the release docs. It would let us see what docs and resources are most used, I suppose.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    N/A

    ### Was this patch authored or co-authored using generative AI tooling?

    No.

    Closes #43063 from srowen/SPARK-45286.

    Authored-by: Sean Owen
    Signed-off-by: Sean Owen
    (cherry picked from commit a881438114ea3e8e918d981ef89ed1ab956d6fca)
    Signed-off-by: Sean Owen
---
 docs/_layouts/global.html | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/docs/_layouts/global.html b/docs/_layouts/global.html
index d4463922766..2d139f5e0fb 100755
--- a/docs/_layouts/global.html
+++ b/docs/_layouts/global.html
@@ -33,6 +33,25 @@
 <link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/docsearch.js@2/dist/cdn/docsearch.min.css" />
 
+{% production %}
+<!-- Matomo -->
+<script type="text/javascript">
+var _paq = window._paq = window._paq || [];
+/* tracker methods like "setCustomDimension" should be called before "trackPageView" */
+_paq.push(["disableCookies"]);
+_paq.push(['trackPageView']);
+_paq.push(['enableLinkTracking']);
+(function() {
+  var u="https://analytics.apache.org/";
+  _paq.push(['setTrackerUrl', u+'matomo.php']);
+  _paq.push(['setSiteId', '40']);
+  var d=document, g=d.createElement('script'), s=d.getElementsByTagName('script')[0];
+  g.async=true; g.src=u+'matomo.js'; s.parentNode.insertBefore(g,s);
+})();
+</script>
+<!-- /Matomo -->
+{% endproduction %}
+
[spark] branch branch-3.4 updated: [SPARK-45286][DOCS] Add back Matomo analytics
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch branch-3.4
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.4 by this push:
     new 20924aa581a [SPARK-45286][DOCS] Add back Matomo analytics
20924aa581a is described below

commit 20924aa581a2c5c49ec700689f1888dd7db79e6b
Author: Sean Owen
AuthorDate: Sun Sep 24 14:17:55 2023 -0500

    [SPARK-45286][DOCS] Add back Matomo analytics

    ### What changes were proposed in this pull request?

    Add analytics to doc pages using the ASF's Matomo service.

    ### Why are the changes needed?

    We had previously removed Google Analytics from the website and release docs, per ASF policy: https://github.com/apache/spark/pull/36310

    We just restored analytics using the ASF-hosted Matomo service on the website: https://github.com/apache/spark-website/commit/a1548627b48a62c2e51870d1488ca3e09397bd30

    This change would put the same new tracking code back into the release docs. It would let us see what docs and resources are most used, I suppose.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    N/A

    ### Was this patch authored or co-authored using generative AI tooling?

    No.

    Closes #43063 from srowen/SPARK-45286.

    Authored-by: Sean Owen
    Signed-off-by: Sean Owen
    (cherry picked from commit a881438114ea3e8e918d981ef89ed1ab956d6fca)
    Signed-off-by: Sean Owen
---
 docs/_layouts/global.html | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/docs/_layouts/global.html b/docs/_layouts/global.html
index d4463922766..2d139f5e0fb 100755
--- a/docs/_layouts/global.html
+++ b/docs/_layouts/global.html
@@ -33,6 +33,25 @@
 <link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/docsearch.js@2/dist/cdn/docsearch.min.css" />
 
+{% production %}
+<!-- Matomo -->
+<script type="text/javascript">
+var _paq = window._paq = window._paq || [];
+/* tracker methods like "setCustomDimension" should be called before "trackPageView" */
+_paq.push(["disableCookies"]);
+_paq.push(['trackPageView']);
+_paq.push(['enableLinkTracking']);
+(function() {
+  var u="https://analytics.apache.org/";
+  _paq.push(['setTrackerUrl', u+'matomo.php']);
+  _paq.push(['setSiteId', '40']);
+  var d=document, g=d.createElement('script'), s=d.getElementsByTagName('script')[0];
+  g.async=true; g.src=u+'matomo.js'; s.parentNode.insertBefore(g,s);
+})();
+</script>
+<!-- /Matomo -->
+{% endproduction %}
+
[spark] branch branch-3.5 updated: [SPARK-45286][DOCS] Add back Matomo analytics
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch branch-3.5
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.5 by this push:
     new 609306ff5da [SPARK-45286][DOCS] Add back Matomo analytics
609306ff5da is described below

commit 609306ff5daa8ff7c2212088d33c0911ad0f4989
Author: Sean Owen
AuthorDate: Sun Sep 24 14:17:55 2023 -0500

    [SPARK-45286][DOCS] Add back Matomo analytics

    ### What changes were proposed in this pull request?

    Add analytics to doc pages using the ASF's Matomo service.

    ### Why are the changes needed?

    We had previously removed Google Analytics from the website and release docs, per ASF policy: https://github.com/apache/spark/pull/36310

    We just restored analytics using the ASF-hosted Matomo service on the website: https://github.com/apache/spark-website/commit/a1548627b48a62c2e51870d1488ca3e09397bd30

    This change would put the same new tracking code back into the release docs. It would let us see what docs and resources are most used, I suppose.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    N/A

    ### Was this patch authored or co-authored using generative AI tooling?

    No.

    Closes #43063 from srowen/SPARK-45286.

    Authored-by: Sean Owen
    Signed-off-by: Sean Owen
    (cherry picked from commit a881438114ea3e8e918d981ef89ed1ab956d6fca)
    Signed-off-by: Sean Owen
---
 docs/_layouts/global.html | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/docs/_layouts/global.html b/docs/_layouts/global.html
index 9b7c4692461..8c4435fdf31 100755
--- a/docs/_layouts/global.html
+++ b/docs/_layouts/global.html
@@ -32,6 +32,25 @@
 <link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/docsearch.js@2/dist/cdn/docsearch.min.css" />
 
+{% production %}
+<!-- Matomo -->
+<script type="text/javascript">
+var _paq = window._paq = window._paq || [];
+/* tracker methods like "setCustomDimension" should be called before "trackPageView" */
+_paq.push(["disableCookies"]);
+_paq.push(['trackPageView']);
+_paq.push(['enableLinkTracking']);
+(function() {
+  var u="https://analytics.apache.org/";
+  _paq.push(['setTrackerUrl', u+'matomo.php']);
+  _paq.push(['setSiteId', '40']);
+  var d=document, g=d.createElement('script'), s=d.getElementsByTagName('script')[0];
+  g.async=true; g.src=u+'matomo.js'; s.parentNode.insertBefore(g,s);
+})();
+</script>
+<!-- /Matomo -->
+{% endproduction %}
+
[spark] branch master updated: [SPARK-45286][DOCS] Add back Matomo analytics
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new a881438114e [SPARK-45286][DOCS] Add back Matomo analytics
a881438114e is described below

commit a881438114ea3e8e918d981ef89ed1ab956d6fca
Author: Sean Owen
AuthorDate: Sun Sep 24 14:17:55 2023 -0500

    [SPARK-45286][DOCS] Add back Matomo analytics

    ### What changes were proposed in this pull request?

    Add analytics to doc pages using the ASF's Matomo service.

    ### Why are the changes needed?

    We had previously removed Google Analytics from the website and release docs, per ASF policy: https://github.com/apache/spark/pull/36310

    We just restored analytics using the ASF-hosted Matomo service on the website: https://github.com/apache/spark-website/commit/a1548627b48a62c2e51870d1488ca3e09397bd30

    This change would put the same new tracking code back into the release docs. It would let us see what docs and resources are most used, I suppose.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    N/A

    ### Was this patch authored or co-authored using generative AI tooling?

    No.

    Closes #43063 from srowen/SPARK-45286.

    Authored-by: Sean Owen
    Signed-off-by: Sean Owen
---
 docs/_layouts/global.html | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/docs/_layouts/global.html b/docs/_layouts/global.html
index e857efad6f0..c2f05cfd6bb 100755
--- a/docs/_layouts/global.html
+++ b/docs/_layouts/global.html
@@ -32,6 +32,25 @@
 <link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/docsearch.js@2/dist/cdn/docsearch.min.css" />
 
+{% production %}
+<!-- Matomo -->
+<script type="text/javascript">
+var _paq = window._paq = window._paq || [];
+/* tracker methods like "setCustomDimension" should be called before "trackPageView" */
+_paq.push(["disableCookies"]);
+_paq.push(['trackPageView']);
+_paq.push(['enableLinkTracking']);
+(function() {
+  var u="https://analytics.apache.org/";
+  _paq.push(['setTrackerUrl', u+'matomo.php']);
+  _paq.push(['setSiteId', '40']);
+  var d=document, g=d.createElement('script'), s=d.getElementsByTagName('script')[0];
+  g.async=true; g.src=u+'matomo.js'; s.parentNode.insertBefore(g,s);
+})();
+</script>
+<!-- /Matomo -->
+{% endproduction %}
+