[spark] branch master updated (5ec13854620 -> 46949e692e8)

2023-05-24 Thread ruifengz
This is an automated email from the ASF dual-hosted git repository.

ruifengz pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from 5ec13854620 [SPARK-43334][UI] Fix error while serializing ExecutorPeakMetricsDistributions into API response
 add 46949e692e8 [SPARK-43545][SQL][PYTHON] Support nested timestamp type

No new revisions were added by this update.

Summary of changes:
 python/pyspark/pandas/typedef/typehints.py          |   2 +-
 python/pyspark/sql/connect/client/core.py           |   2 +-
 python/pyspark/sql/connect/dataframe.py             |   4 +-
 python/pyspark/sql/connect/types.py                 |  21 ++-
 python/pyspark/sql/pandas/conversion.py             | 113 --
 python/pyspark/sql/pandas/types.py                  | 132 ++---
 .../pyspark/sql/tests/connect/test_parity_arrow.py  |   6 +
 .../connect/test_parity_pandas_udf_grouped_agg.py   |   7 +-
 .../tests/connect/test_parity_pandas_udf_scalar.py  |   5 -
 .../tests/connect/test_parity_pandas_udf_window.py  |   2 +-
 .../sql/tests/pandas/test_pandas_cogrouped_map.py   |  16 +-
 .../sql/tests/pandas/test_pandas_grouped_map.py     |  56 ---
 .../tests/pandas/test_pandas_udf_grouped_agg.py     |  15 +-
 .../sql/tests/pandas/test_pandas_udf_scalar.py      |  38 -
 .../sql/tests/pandas/test_pandas_udf_window.py      |  13 +-
 python/pyspark/sql/tests/test_arrow.py              | 164 ++---
 16 files changed, 455 insertions(+), 141 deletions(-)





[spark] branch master updated: [SPARK-43334][UI] Fix error while serializing ExecutorPeakMetricsDistributions into API response

2023-05-24 Thread srowen
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 5ec13854620 [SPARK-43334][UI] Fix error while serializing ExecutorPeakMetricsDistributions into API response
5ec13854620 is described below

commit 5ec138546205ba4248cc9ec72c3b7baf60f2fede
Author: Thejdeep Gudivada 
AuthorDate: Wed May 24 18:25:36 2023 -0500

[SPARK-43334][UI] Fix error while serializing ExecutorPeakMetricsDistributions into API response

### What changes were proposed in this pull request?
When we calculate the quantile information from the peak executor metrics values for the distribution, there is a possibility of running into an `ArrayIndexOutOfBoundsException` when the metric values are empty. This PR addresses that by returning a zero-filled distribution when the values are empty.
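
To make the failure concrete, here is a minimal standalone sketch of the before/after behavior (illustrative values, mirroring the patch below):

```
val quantiles = Array(0.0, 0.25, 0.5, 0.75, 1.0)
val values: IndexedSeq[Double] = IndexedSeq.empty  // no executor metrics reported

// Old logic: count == 0, so every index becomes math.min(0L, -1L) == -1
// and values(-1) throws an index-out-of-bounds exception.

// New logic (AppStatusUtils.getQuantilesValue): guard the empty case.
val count = values.size
val result: IndexedSeq[Double] =
  if (count > 0) {
    val indices = quantiles.map { q => math.min((q * count).toLong, count - 1) }
    indices.map(i => values(i.toInt)).toIndexedSeq
  } else {
    IndexedSeq.fill(quantiles.length)(0.0)  // zero-filled distribution
  }
// result == IndexedSeq(0.0, 0.0, 0.0, 0.0, 0.0)
```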

 ### Why are the changes needed?
 Without these changes, when the withDetails query parameter is used to 
query the stages REST API, we encounter a partial JSON response since the peak 
executor metrics distribution cannot be serialized due to the above index error.

 ### Does this PR introduce _any_ user-facing change?
 No

 ### How was this patch tested?
 Added a unit test covering this behavior.
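
A hedged sketch of what such a test can look like (the actual AppStatusUtilsSuite addition is not shown in this mail; the assertions follow the patched behavior below):

```
import org.apache.spark.status.AppStatusUtils

val quantiles = Array(0.0, 0.25, 0.5, 0.75, 1.0)

// Empty metric values must yield a zero-filled distribution, not an exception.
assert(AppStatusUtils.getQuantilesValue(IndexedSeq.empty[Double], quantiles) ==
  IndexedSeq(0.0, 0.0, 0.0, 0.0, 0.0))

// Non-empty values keep the existing nearest-rank quantile lookup.
assert(AppStatusUtils.getQuantilesValue(IndexedSeq(1.0, 2.0, 3.0, 4.0), quantiles) ==
  IndexedSeq(1.0, 2.0, 3.0, 4.0, 4.0))
```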

Closes #41017 from thejdeep/SPARK-43334.

Authored-by: Thejdeep Gudivada 
Signed-off-by: Sean Owen 
---
 .../main/scala/org/apache/spark/status/AppStatusStore.scala  |  9 +
 .../main/scala/org/apache/spark/status/AppStatusUtils.scala  | 12 
 core/src/main/scala/org/apache/spark/status/api/v1/api.scala |  7 +++
 .../scala/org/apache/spark/status/AppStatusUtilsSuite.scala  | 11 +++
 4 files changed, 27 insertions(+), 12 deletions(-)

diff --git a/core/src/main/scala/org/apache/spark/status/AppStatusStore.scala b/core/src/main/scala/org/apache/spark/status/AppStatusStore.scala
index d02d4b2507a..eaa7b7b9873 100644
--- a/core/src/main/scala/org/apache/spark/status/AppStatusStore.scala
+++ b/core/src/main/scala/org/apache/spark/status/AppStatusStore.scala
@@ -27,6 +27,7 @@ import scala.collection.mutable.HashMap
 import org.apache.spark.{JobExecutionStatus, SparkConf, SparkContext}
 import org.apache.spark.internal.Logging
 import org.apache.spark.internal.config.Status.LIVE_UI_LOCAL_STORE_DIR
+import org.apache.spark.status.AppStatusUtils.getQuantilesValue
 import org.apache.spark.status.api.v1
 import org.apache.spark.storage.FallbackStorage.FALLBACK_BLOCK_MANAGER_ID
 import org.apache.spark.ui.scope._
@@ -770,14 +771,6 @@ private[spark] class AppStatusStore(
 }
   }
 
-  def getQuantilesValue(
-      values: IndexedSeq[Double],
-      quantiles: Array[Double]): IndexedSeq[Double] = {
-    val count = values.size
-    val indices = quantiles.map { q => math.min((q * count).toLong, count - 1) }
-    indices.map(i => values(i.toInt)).toIndexedSeq
-  }
-
   def rdd(rddId: Int): v1.RDDStorageInfo = {
 store.read(classOf[RDDStorageInfoWrapper], rddId).info
   }
diff --git a/core/src/main/scala/org/apache/spark/status/AppStatusUtils.scala b/core/src/main/scala/org/apache/spark/status/AppStatusUtils.scala
index 87f434daf48..04918ccbd57 100644
--- a/core/src/main/scala/org/apache/spark/status/AppStatusUtils.scala
+++ b/core/src/main/scala/org/apache/spark/status/AppStatusUtils.scala
@@ -72,4 +72,16 @@ private[spark] object AppStatusUtils {
   -1
 }
   }
+
+  def getQuantilesValue(
+      values: IndexedSeq[Double],
+      quantiles: Array[Double]): IndexedSeq[Double] = {
+    val count = values.size
+    if (count > 0) {
+      val indices = quantiles.map { q => math.min((q * count).toLong, count - 1) }
+      indices.map(i => values(i.toInt)).toIndexedSeq
+    } else {
+      IndexedSeq.fill(quantiles.length)(0.0)
+    }
+  }
 }
diff --git a/core/src/main/scala/org/apache/spark/status/api/v1/api.scala b/core/src/main/scala/org/apache/spark/status/api/v1/api.scala
index e272cf04dc7..f436d16ca47 100644
--- a/core/src/main/scala/org/apache/spark/status/api/v1/api.scala
+++ b/core/src/main/scala/org/apache/spark/status/api/v1/api.scala
@@ -31,6 +31,7 @@ import org.apache.spark.JobExecutionStatus
 import org.apache.spark.executor.ExecutorMetrics
 import org.apache.spark.metrics.ExecutorMetricType
 import org.apache.spark.resource.{ExecutorResourceRequest, ResourceInformation, TaskResourceRequest}
+import org.apache.spark.status.AppStatusUtils.getQuantilesValue
 
 case class ApplicationInfo private[spark](
 id: String,
@@ -454,13 +455,11 @@ class ExecutorMetricsDistributions private[spark](
 class ExecutorPeakMetricsDistributions private[spark](
   val quantiles: IndexedSeq[Double],
   val executorMetrics: IndexedSeq[ExecutorMetrics]) {
-  private lazy val count = executorMetrics.length
-  private lazy val indices = quantiles.map { q => math.min((q * count).toLong, count - 1) }
 
   /** Returns the distributions f

[spark] branch master updated (1c6b5382051 -> f2b4ff2769b)

2023-05-24 Thread srowen
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from 1c6b5382051 [SPARK-43771][BUILD][CONNECT] Upgrade mima-core from 1.1.0 to 1.1.2
 add f2b4ff2769b [SPARK-43573][BUILD] Make SparkBuilder could config the heap size of test JVM

No new revisions were added by this update.

Summary of changes:
 project/SparkBuild.scala | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)





[spark] branch master updated: [SPARK-43771][BUILD][CONNECT] Upgrade mima-core from 1.1.0 to 1.1.2

2023-05-24 Thread yangjie01
This is an automated email from the ASF dual-hosted git repository.

yangjie01 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 1c6b5382051 [SPARK-43771][BUILD][CONNECT] Upgrade mima-core from 1.1.0 to 1.1.2
1c6b5382051 is described below

commit 1c6b538205143223d25cf44f3d8c483ae8161587
Author: panbingkun 
AuthorDate: Wed May 24 23:01:18 2023 +0800

[SPARK-43771][BUILD][CONNECT] Upgrade mima-core from 1.1.0 to 1.1.2

### What changes were proposed in this pull request?
The pr aims to upgrade mima-core from 1.1.0 to 1.1.2.

### Why are the changes needed?
- The new version includes bug fixes, e.g.:
1. Handle POM-only modules by creating empty Definitions, by rossabaker in https://github.com/lightbend/mima/pull/743

- Release notes:
1. https://github.com/lightbend/mima/releases/tag/1.1.2
2. https://github.com/lightbend/mima/releases/tag/1.1.1

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Pass GA.

Closes #41294 from panbingkun/SPARK-43771.

Authored-by: panbingkun 
Signed-off-by: yangjie01 
---
 connector/connect/client/jvm/pom.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/connector/connect/client/jvm/pom.xml b/connector/connect/client/jvm/pom.xml
index 413764d0ea2..4d0a4379329 100644
--- a/connector/connect/client/jvm/pom.xml
+++ b/connector/connect/client/jvm/pom.xml
@@ -34,7 +34,7 @@
 connect-client-jvm
 31.0.1-jre
 1.0.1
-1.1.0
+1.1.2
   
 
   





[spark] branch master updated: [SPARK-43739][BUILD] Upgrade commons-io to 2.12.0

2023-05-24 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new ee27e85d6ab [SPARK-43739][BUILD] Upgrade commons-io to 2.12.0
ee27e85d6ab is described below

commit ee27e85d6abdf2fbf97d5a419286c8cdf8177604
Author: panbingkun 
AuthorDate: Wed May 24 08:00:13 2023 -0700

[SPARK-43739][BUILD] Upgrade commons-io to 2.12.0

### What changes were proposed in this pull request?
The PR aims to upgrade commons-io from 2.11.0 to 2.12.0.

### Why are the changes needed?
The new commons-io version includes improvements & bug fixes, e.g.:
- https://github.com/apache/commons-io/pull/450
- https://github.com/apache/commons-io/pull/368
- [Add PathUtils.touch(Path)](https://github.com/apache/commons-io/commit/fd7c8182d2117d01f43ccc9fe939105f834ba672)
  The exception thrown by the FileUtils.touch method has changed from `java.io.FileNotFoundException` to `java.nio.file.NoSuchFileException` (see the sketch after this list).
- commons-io 2.11.0 vs 2.12.0:
  https://github.com/apache/commons-io/compare/rel/commons-io-2.11.0...rel/commons-io-2.12.0
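
A hedged sketch of what the FileUtils.touch change can mean for callers (illustrative code, not from this PR): `java.nio.file.NoSuchFileException` does not extend `java.io.FileNotFoundException`, so catch blocks written against 2.11.0 may need to handle both:

```
import java.io.{File, FileNotFoundException}
import java.nio.file.NoSuchFileException
import org.apache.commons.io.FileUtils

def touchQuietly(f: File): Boolean =
  try { FileUtils.touch(f); true }
  catch {
    case _: NoSuchFileException => false   // thrown by commons-io 2.12.0
    case _: FileNotFoundException => false // thrown by commons-io <= 2.11.0
  }
```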

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Pass GA.

Closes #41271 from panbingkun/SPARK-43739.

Authored-by: panbingkun 
Signed-off-by: Dongjoon Hyun 
---
 dev/deps/spark-deps-hadoop-3-hive-2.3 | 2 +-
 pom.xml   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/dev/deps/spark-deps-hadoop-3-hive-2.3 b/dev/deps/spark-deps-hadoop-3-hive-2.3
index 553776acfb1..fa870c7240f 100644
--- a/dev/deps/spark-deps-hadoop-3-hive-2.3
+++ b/dev/deps/spark-deps-hadoop-3-hive-2.3
@@ -43,7 +43,7 @@ commons-compiler/3.1.9//commons-compiler-3.1.9.jar
 commons-compress/1.23.0//commons-compress-1.23.0.jar
 commons-crypto/1.1.0//commons-crypto-1.1.0.jar
 commons-dbcp/1.4//commons-dbcp-1.4.jar
-commons-io/2.11.0//commons-io-2.11.0.jar
+commons-io/2.12.0//commons-io-2.12.0.jar
 commons-lang/2.6//commons-lang-2.6.jar
 commons-lang3/3.12.0//commons-lang3-3.12.0.jar
 commons-logging/1.1.3//commons-logging-1.1.3.jar
diff --git a/pom.xml b/pom.xml
index 7defd251865..6fe9b7b8701 100644
--- a/pom.xml
+++ b/pom.xml
@@ -185,7 +185,7 @@
 3.0.3
 1.15
 1.23.0
-2.11.0
+2.12.0
 
 2.6
 





[spark] branch master updated: [SPARK-38464][CORE] Use error classes in org.apache.spark.io

2023-05-24 Thread maxgekk
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 76f82bd8c54 [SPARK-38464][CORE] Use error classes in org.apache.spark.io
76f82bd8c54 is described below

commit 76f82bd8c54352a0b38c3e1d8de5b24627446b9c
Author: Bo Zhang 
AuthorDate: Wed May 24 14:21:42 2023 +0300

[SPARK-38464][CORE] Use error classes in org.apache.spark.io

### What changes were proposed in this pull request?
This PR aims to change exceptions created in package org.apache.spark.io to use error classes.

This PR also adds `toConf` and `toConfVal` in `SparkCoreErrors`.
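
As a rough sketch of the resulting call shape (mirroring the CompressionCodec diff below; the parameter values here are illustrative, not taken from the patch):

```
import org.apache.spark.SparkIllegalArgumentException
import org.apache.spark.errors.SparkCoreErrors.{toConf, toConfVal}

// "CODEC_NOT_AVAILABLE" is defined in error-classes.json; its message
// placeholders <codecName>, <configKey> and <configVal> are filled from this map.
throw new SparkIllegalArgumentException(
  errorClass = "CODEC_NOT_AVAILABLE",
  messageParameters = Map(
    "codecName" -> "mycodec",  // illustrative codec name
    "configKey" -> toConf("spark.io.compression.codec"),
    "configVal" -> toConfVal("lz4")))
```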

### Why are the changes needed?
This is to move exceptions created in package org.apache.spark.io onto error classes.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Updated existing tests.

Closes #41277 from bozhang2820/spark-38464.

Authored-by: Bo Zhang 
Signed-off-by: Max Gekk 
---
 core/src/main/resources/error/error-classes.json | 10 ++
 .../scala/org/apache/spark/errors/SparkCoreErrors.scala  | 12 
 .../scala/org/apache/spark/io/CompressionCodec.scala | 15 +++
 .../org/apache/spark/io/CompressionCodecSuite.scala  | 16 
 4 files changed, 45 insertions(+), 8 deletions(-)

diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json
index fcb9ec249db..1b75f89cc10 100644
--- a/core/src/main/resources/error/error-classes.json
+++ b/core/src/main/resources/error/error-classes.json
@@ -187,6 +187,16 @@
 ],
 "sqlState" : "22003"
   },
+  "CODEC_NOT_AVAILABLE" : {
+"message" : [
+  "The codec  is not available. Consider to set the config 
 to ."
+]
+  },
+  "CODEC_SHORT_NAME_NOT_FOUND" : {
+"message" : [
+  "Cannot find a short name for the codec ."
+]
+  },
   "COLUMN_ALIASES_IS_NOT_ALLOWED" : {
 "message" : [
   "Columns aliases are not allowed in ."
diff --git a/core/src/main/scala/org/apache/spark/errors/SparkCoreErrors.scala b/core/src/main/scala/org/apache/spark/errors/SparkCoreErrors.scala
index 8abb2564328..f8e7f2db259 100644
--- a/core/src/main/scala/org/apache/spark/errors/SparkCoreErrors.scala
+++ b/core/src/main/scala/org/apache/spark/errors/SparkCoreErrors.scala
@@ -466,4 +466,16 @@ private[spark] object SparkCoreErrors {
 "requestedBytes" -> requestedBytes.toString,
 "receivedBytes" -> receivedBytes.toString).asJava)
   }
+
+  private def quoteByDefault(elem: String): String = {
+    "\"" + elem + "\""
+  }
+
+  def toConf(conf: String): String = {
+    quoteByDefault(conf)
+  }
+
+  def toConfVal(conf: String): String = {
+    quoteByDefault(conf)
+  }
 }
diff --git a/core/src/main/scala/org/apache/spark/io/CompressionCodec.scala b/core/src/main/scala/org/apache/spark/io/CompressionCodec.scala
index eb3dc938d4d..0bb392deb39 100644
--- a/core/src/main/scala/org/apache/spark/io/CompressionCodec.scala
+++ b/core/src/main/scala/org/apache/spark/io/CompressionCodec.scala
@@ -26,8 +26,9 @@ import net.jpountz.lz4.{LZ4BlockInputStream, LZ4BlockOutputStream, LZ4Factory}
 import net.jpountz.xxhash.XXHashFactory
 import org.xerial.snappy.{Snappy, SnappyInputStream, SnappyOutputStream}
 
-import org.apache.spark.SparkConf
+import org.apache.spark.{SparkConf, SparkIllegalArgumentException}
 import org.apache.spark.annotation.DeveloperApi
+import org.apache.spark.errors.SparkCoreErrors.{toConf, toConfVal}
 import org.apache.spark.internal.config._
 import org.apache.spark.util.Utils
 
@@ -88,8 +89,12 @@ private[spark] object CompressionCodec {
 } catch {
   case _: ClassNotFoundException | _: IllegalArgumentException => None
 }
-    codec.getOrElse(throw new IllegalArgumentException(s"Codec [$codecName] is not available. " +
-      s"Consider setting $configKey=$FALLBACK_COMPRESSION_CODEC"))
+    codec.getOrElse(throw new SparkIllegalArgumentException(
+      errorClass = "CODEC_NOT_AVAILABLE",
+      messageParameters = Map(
+        "codecName" -> codecName,
+        "configKey" -> toConf(configKey),
+        "configVal" -> toConfVal(FALLBACK_COMPRESSION_CODEC))))
   }
 
   /**
@@ -102,7 +107,9 @@ private[spark] object CompressionCodec {
 } else {
   shortCompressionCodecNames
 .collectFirst { case (k, v) if v == codecName => k }
-        .getOrElse { throw new IllegalArgumentException(s"No short name for codec $codecName.") }
+        .getOrElse { throw new SparkIllegalArgumentException(
+          errorClass = "CODEC_SHORT_NAME_NOT_FOUND",
+          messageParameters = Map("codecName" -> codecName))}
 }
   }
 
diff --git a/core/src/test/scala/org/apache/spark/io/CompressionCodecSuite.scala b/core/src/test/scala/org/apache/spark/io/CompressionCodecSu

[spark] branch master updated (7c7b9585a2a -> 0e8e4ae47fb)

2023-05-24 Thread weichenxu123
This is an automated email from the ASF dual-hosted git repository.

weichenxu123 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from 7c7b9585a2a [SPARK-43546][PYTHON][CONNECT][TESTS] Complete parity tests of Pandas UDF
 add 0e8e4ae47fb [SPARK-43516][ML][PYTHON][CONNECT] Base interfaces of sparkML for spark3.5: estimator/transformer/model/evaluator

No new revisions were added by this update.

Summary of changes:
 dev/infra/Dockerfile                                |   2 +-
 dev/requirements.txt                                |   2 +
 dev/sparktestsupport/modules.py                     |   6 +
 python/mypy.ini                                     |   3 +
 python/pyspark/mlv2/__init__.py                     |  36
 python/pyspark/mlv2/base.py                         | 240 +
 python/pyspark/mlv2/evaluation.py                   |  88
 python/pyspark/mlv2/feature.py                      | 124 +++
 python/pyspark/mlv2/summarizer.py                   | 118 ++
 .../mlv2/tests/connect/test_parity_evaluation.py    |  47
 .../mlv2/tests/connect/test_parity_feature.py       |  40
 .../mlv2/tests/connect/test_parity_summarizer.py    |  40
 python/pyspark/mlv2/tests/test_evaluation.py        |  88
 python/pyspark/mlv2/tests/test_feature.py           |  98 +
 python/pyspark/mlv2/tests/test_summarizer.py        |  80 +++
 python/pyspark/mlv2/util.py                         | 192 +
 16 files changed, 1203 insertions(+), 1 deletion(-)
 create mode 100644 python/pyspark/mlv2/__init__.py
 create mode 100644 python/pyspark/mlv2/base.py
 create mode 100644 python/pyspark/mlv2/evaluation.py
 create mode 100644 python/pyspark/mlv2/feature.py
 create mode 100644 python/pyspark/mlv2/summarizer.py
 create mode 100644 python/pyspark/mlv2/tests/connect/test_parity_evaluation.py
 create mode 100644 python/pyspark/mlv2/tests/connect/test_parity_feature.py
 create mode 100644 python/pyspark/mlv2/tests/connect/test_parity_summarizer.py
 create mode 100644 python/pyspark/mlv2/tests/test_evaluation.py
 create mode 100644 python/pyspark/mlv2/tests/test_feature.py
 create mode 100644 python/pyspark/mlv2/tests/test_summarizer.py
 create mode 100644 python/pyspark/mlv2/util.py





[spark] branch master updated: [SPARK-43546][PYTHON][CONNECT][TESTS] Complete parity tests of Pandas UDF

2023-05-24 Thread ruifengz
This is an automated email from the ASF dual-hosted git repository.

ruifengz pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 7c7b9585a2a [SPARK-43546][PYTHON][CONNECT][TESTS] Complete parity tests of Pandas UDF
7c7b9585a2a is described below

commit 7c7b9585a2aba7bbd52c197b07ed0181ae049c75
Author: Xinrong Meng 
AuthorDate: Wed May 24 15:54:18 2023 +0800

[SPARK-43546][PYTHON][CONNECT][TESTS] Complete parity tests of Pandas UDF

### What changes were proposed in this pull request?
Complete parity tests of Pandas UDF.

Specifically, parity tests are added referencing

```
test_pandas_udf_grouped_agg.py
test_pandas_udf_scalar.py
test_pandas_udf_window.py
```

### Why are the changes needed?
Parity with vanilla PySpark.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Unit tests.

Closes #41268 from xinrong-meng/more_parity.

Authored-by: Xinrong Meng 
Signed-off-by: Ruifeng Zheng 
---
 dev/sparktestsupport/modules.py                     |   3 +
 .../connect/test_parity_pandas_udf_grouped_agg.py   |  53
 .../tests/connect/test_parity_pandas_udf_scalar.py  |  69 ++
 .../tests/connect/test_parity_pandas_udf_window.py  |  40 ++
 .../tests/pandas/test_pandas_udf_grouped_agg.py     | 126 +-
 .../sql/tests/pandas/test_pandas_udf_scalar.py      | 146 -
 .../sql/tests/pandas/test_pandas_udf_window.py      |   6 +-
 7 files changed, 316 insertions(+), 127 deletions(-)

diff --git a/dev/sparktestsupport/modules.py b/dev/sparktestsupport/modules.py
index e68a83643ff..a95d2425136 100644
--- a/dev/sparktestsupport/modules.py
+++ b/dev/sparktestsupport/modules.py
@@ -782,6 +782,9 @@ pyspark_connect = Module(
 "pyspark.sql.tests.connect.streaming.test_parity_streaming",
 "pyspark.sql.tests.connect.streaming.test_parity_foreach",
 "pyspark.sql.tests.connect.test_parity_pandas_grouped_map_with_state",
+"pyspark.sql.tests.connect.test_parity_pandas_udf_scalar",
+"pyspark.sql.tests.connect.test_parity_pandas_udf_grouped_agg",
+"pyspark.sql.tests.connect.test_parity_pandas_udf_window",
 # ml doctests
 "pyspark.ml.connect.functions",
 # ml unittests
diff --git a/python/pyspark/sql/tests/connect/test_parity_pandas_udf_grouped_agg.py b/python/pyspark/sql/tests/connect/test_parity_pandas_udf_grouped_agg.py
new file mode 100644
index 000..25914a4b5b5
--- /dev/null
+++ b/python/pyspark/sql/tests/connect/test_parity_pandas_udf_grouped_agg.py
@@ -0,0 +1,53 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+import unittest
+
+from pyspark.sql.tests.pandas.test_pandas_udf_grouped_agg import GroupedAggPandasUDFTestsMixin
+from pyspark.testing.connectutils import ReusedConnectTestCase
+
+
+class PandasUDFGroupedAggParityTests(GroupedAggPandasUDFTestsMixin, ReusedConnectTestCase):
+    def test_unsupported_types(self):
+        self.check_unsupported_types()
+
+    def test_invalid_args(self):
+        self.check_invalid_args()
+
+    @unittest.skip("Spark Connect doesn't support RDD but the test depends on it.")
+    def test_grouped_with_empty_partition(self):
+        super().test_grouped_with_empty_partition()
+
+    # TODO(SPARK-43727): Parity returnType check in Spark Connect
+    @unittest.skip("Fails in Spark Connect, should enable.")
+    def check_unsupported_types(self):
+        super().check_unsupported_types()
+
+    @unittest.skip("Spark Connect does not support convert UNPARSED to catalyst types.")
+    def test_manual(self):
+        super().test_manual()
+
+
+if __name__ == "__main__":
+    from pyspark.sql.tests.connect.test_parity_pandas_udf_grouped_agg import *  # noqa: F401
+
+    try:
+        import xmlrunner  # type: ignore[import]
+
+        testRunner = xmlrunner.XMLTestRunner(output="target/test-reports", verbosity=2)
+    except ImportError:
+        testRunner = None
+    unittest.main(testRunner=testRunner, verbosity=2)
diff --git a/python/pyspark/sql/tests/connect/test_

[spark] branch master updated: [SPARK-43741][BUILD] Upgrade maven-checkstyle-plugin from 3.2.2 to 3.3.0

2023-05-24 Thread yangjie01
This is an automated email from the ASF dual-hosted git repository.

yangjie01 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 710b54c0108 [SPARK-43741][BUILD] Upgrade maven-checkstyle-plugin from 3.2.2 to 3.3.0
710b54c0108 is described below

commit 710b54c01089a5e5d6bc395e2660ff45ca7e9851
Author: panbingkun 
AuthorDate: Wed May 24 15:45:58 2023 +0800

[SPARK-43741][BUILD] Upgrade maven-checkstyle-plugin from 3.2.2 to 3.3.0

### What changes were proposed in this pull request?
This PR aims to update maven-checkstyle-plugin from 3.2.2 to 3.3.0.

### Why are the changes needed?
- v3.2.2 VS v3.3.0 
https://github.com/apache/maven-checkstyle-plugin/compare/maven-checkstyle-plugin-3.2.2...maven-checkstyle-plugin-3.3.0
- This version relies on commons-lang3 for compilation, and Spark is 
currently using this version as well

https://github.com/apache/maven-checkstyle-plugin/compare/maven-checkstyle-plugin-3.2.2...maven-checkstyle-plugin-3.3.0#diff-9c5fb3d1b7e3b0f54bc5c4182965c4fe1f9023d449017cece3005d3f90e8e4d8L199-L202

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
- Manual testing by:
./build/mvn -Pkinesis-asl -Pmesos -Pkubernetes -Pyarn -Phive -Phive-thriftserver -am checkstyle:checkstyle
- Pass GA.

Closes #41275 from panbingkun/SPARK-43741.

Authored-by: panbingkun 
Signed-off-by: yangjie01 
---
 pom.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/pom.xml b/pom.xml
index 810cebb7e32..7defd251865 100644
--- a/pom.xml
+++ b/pom.xml
@@ -3287,7 +3287,7 @@
   
 org.apache.maven.plugins
 maven-checkstyle-plugin
-3.2.2
+3.3.0
 
   false
   true

