(spark) branch master updated: [SPARK-46436][INFRA] Clean up compatibility configurations related to branch-3.3 daily testing in `build_and_test.yml`

2023-12-17 Thread ruifengz
This is an automated email from the ASF dual-hosted git repository.

ruifengz pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 78c2c2674b46 [SPARK-46436][INFRA] Clean up compatibility configurations related to branch-3.3 daily testing in `build_and_test.yml`
78c2c2674b46 is described below

commit 78c2c2674b46fcc60e781919550bf1735afc1b85
Author: yangjie01 
AuthorDate: Mon Dec 18 14:07:02 2023 +0800

[SPARK-46436][INFRA] Clean up compatibility configurations related to branch-3.3 daily testing in `build_and_test.yml`

### What changes were proposed in this pull request?
This PR aims to clean up the compatibility configurations related to branch-3.3 daily testing in `build_and_test.yml`. Since Apache Spark 3.3 has reached EOL, daily testing is no longer needed.

### Why are the changes needed?
Apache Spark 3.3 has reached EOL, so daily testing is no longer needed.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Pass GitHub Actions

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #44392 from LuciferYang/3-3-daily.

Authored-by: yangjie01 
Signed-off-by: Ruifeng Zheng 
---
 .github/workflows/build_and_test.yml | 33 ++++++++++-----------------------
 1 file changed, 10 insertions(+), 23 deletions(-)

diff --git a/.github/workflows/build_and_test.yml b/.github/workflows/build_and_test.yml
index 27d3c86686bb..bdcb1dd1ea5c 100644
--- a/.github/workflows/build_and_test.yml
+++ b/.github/workflows/build_and_test.yml
@@ -57,11 +57,7 @@ jobs:
   GITHUB_PREV_SHA: ${{ github.event.before }}
 outputs:
   required: ${{ steps.set-outputs.outputs.required }}
-  image_url: >-
-${{
-  (inputs.branch == 'branch-3.3' && 'dongjoon/apache-spark-github-action-image:20220207')
-  || steps.infra-image-outputs.outputs.image_url
-}}
+  image_url: ${{ steps.infra-image-outputs.outputs.image_url }}
 steps:
 - name: Checkout Spark repository
   uses: actions/checkout@v4
@@ -292,10 +288,9 @@ jobs:
 needs: precondition
 # Currently, enable docker build from cache for `master` and branch (since 3.4) jobs
 if: >-
-  (fromJson(needs.precondition.outputs.required).pyspark == 'true' ||
+  fromJson(needs.precondition.outputs.required).pyspark == 'true' ||
   fromJson(needs.precondition.outputs.required).lint == 'true' ||
-  fromJson(needs.precondition.outputs.required).sparkr == 'true') &&
-  (inputs.branch != 'branch-3.3')
+  fromJson(needs.precondition.outputs.required).sparkr == 'true'
 runs-on: ubuntu-latest
 permissions:
   packages: write
@@ -684,15 +679,7 @@ jobs:
 - name: Java linter
   run: ./dev/lint-java
 - name: Spark connect jvm client mima check
-  if: inputs.branch != 'branch-3.3'
   run: ./dev/connect-jvm-client-mima-check
-- name: Install Python linter dependencies for branch-3.3
-  if: inputs.branch == 'branch-3.3'
-  run: |
-# SPARK-44554: Copy from https://github.com/apache/spark/blob/073d0b60d31bf68ebacdc005f59b928a5902670f/.github/workflows/build_and_test.yml#L501-L508
-# Should delete this section after SPARK 3.3 EOL.
-python3.9 -m pip install 'flake8==3.9.0' pydata_sphinx_theme 'mypy==0.920' 'pytest==7.1.3' 'pytest-mypy-plugins==1.9.3' numpydoc 'jinja2<3.0.0' 'black==21.12b0'
-python3.9 -m pip install 'pandas-stubs==1.2.0.53' ipython
 - name: Install Python linter dependencies for branch-3.4
   if: inputs.branch == 'branch-3.4'
   run: |
@@ -708,7 +695,7 @@ jobs:
 python3.9 -m pip install 'flake8==3.9.0' pydata_sphinx_theme 'mypy==0.982' 'pytest==7.1.3' 'pytest-mypy-plugins==1.9.3' numpydoc 'jinja2<3.0.0' 'black==22.6.0'
 python3.9 -m pip install 'pandas-stubs==1.2.0.53' ipython 'grpcio==1.59.3' 'grpc-stubs==1.24.11' 'googleapis-common-protos-stubs==2.2.0'
 - name: Install Python linter dependencies
-  if: inputs.branch != 'branch-3.3' && inputs.branch != 'branch-3.4' && inputs.branch != 'branch-3.5'
+  if: inputs.branch != 'branch-3.4' && inputs.branch != 'branch-3.5'
   run: |
 python3.9 -m pip install 'flake8==3.9.0' pydata_sphinx_theme 'mypy==0.982' 'pytest==7.1.3' 'pytest-mypy-plugins==1.9.3' numpydoc jinja2 'black==23.9.1'
 python3.9 -m pip install 'pandas-stubs==1.2.0.53' ipython 'grpcio==1.59.3' 'grpc-stubs==1.24.11' 'googleapis-common-protos-stubs==2.2.0'
@@ -729,16 +716,16 @@ jobs:
   if: inputs.branch == 'branch-3.5'
   run: if test -f ./dev/connect-check-protos.py; then PATH=$PATH:$HOME/buf/bin PYTHON_EXECUTABLE=python3.9 ./dev/connect-check-protos.py; fi
 # Should delete this section after SPARK 3.5 EOL.
-- name: Install JavaScript linter dependencies 

(spark) branch master updated: [SPARK-46435][INFRA] Exclude `branch-3.3` from `publish_snapshot.yml`

2023-12-17 Thread yangjie01
This is an automated email from the ASF dual-hosted git repository.

yangjie01 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 51dd7268f895 [SPARK-46435][INFRA] Exclude `branch-3.3` from `publish_snapshot.yml`
51dd7268f895 is described below

commit 51dd7268f895032a986a163318ae4fd31891a8a3
Author: Dongjoon Hyun 
AuthorDate: Mon Dec 18 13:53:10 2023 +0800

[SPARK-46435][INFRA] Exclude `branch-3.3` from `publish_snapshot.yml`

### What changes were proposed in this pull request?

This PR aims to exclude `branch-3.3` from `publish_snapshot.yml`.

### Why are the changes needed?

We don't need to publish the snapshot of branch-3.3 because it reached the End-Of-Life status.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Manual review.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #44390 from dongjoon-hyun/SPARK-46435.

Authored-by: Dongjoon Hyun 
Signed-off-by: yangjie01 
---
 .github/workflows/publish_snapshot.yml | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/.github/workflows/publish_snapshot.yml b/.github/workflows/publish_snapshot.yml
index ff51887e0df2..735c0f3d180b 100644
--- a/.github/workflows/publish_snapshot.yml
+++ b/.github/workflows/publish_snapshot.yml
@@ -28,7 +28,7 @@ on:
 description: 'list of branches to publish (JSON)'
 required: true
 # keep in sync with default value of strategy matrix 'branch'
-default: '["master", "branch-3.5", "branch-3.4", "branch-3.3"]'
+default: '["master", "branch-3.5", "branch-3.4"]'
 
 jobs:
   publish-snapshot:
@@ -38,7 +38,7 @@ jobs:
   fail-fast: false
   matrix:
 # keep in sync with default value of workflow_dispatch input 'branch'
-branch: ${{ fromJSON( inputs.branch || '["master", "branch-3.5", "branch-3.4", "branch-3.3"]' ) }}
+branch: ${{ fromJSON( inputs.branch || '["master", "branch-3.5", "branch-3.4"]' ) }}
 steps:
 - name: Checkout Spark repository
   uses: actions/checkout@v4
@@ -52,13 +52,13 @@ jobs:
 restore-keys: |
   snapshot-maven-
 - name: Install Java 8 for branch-3.x
-  if: matrix.branch == 'branch-3.5' || matrix.branch == 'branch-3.4' || matrix.branch == 'branch-3.3'
+  if: matrix.branch == 'branch-3.5' || matrix.branch == 'branch-3.4'
   uses: actions/setup-java@v4
   with:
 distribution: temurin
 java-version: 8
 - name: Install Java 17
-  if: matrix.branch != 'branch-3.5' && matrix.branch != 'branch-3.4' && matrix.branch != 'branch-3.3'
+  if: matrix.branch != 'branch-3.5' && matrix.branch != 'branch-3.4'
   uses: actions/setup-java@v4
   with:
 distribution: temurin


(spark) branch master updated (5edd0ec696d6 -> 1b413a4bd61e)

2023-12-17 Thread ruifengz
This is an automated email from the ASF dual-hosted git repository.

ruifengz pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from 5edd0ec696d6 [SPARK-46317][INFRA][FOLLOWUP] Install `lxml` in PyPy3.8
 add 1b413a4bd61e [SPARK-46433][PS][TESTS] Reorganize `OpsOnDiffFramesGroupByTests`: Factor out `test_aggregate/apply/cum*`

No new revisions were added by this update.

Summary of changes:
 dev/sparktestsupport/modules.py                    |   6 +
 ...orrwith.py => test_parity_groupby_aggregate.py} |   8 +-
 ..._parity_cov.py => test_parity_groupby_apply.py} |   9 +-
 ...rrwith.py => test_parity_groupby_cumulative.py} |   8 +-
 .../diff_frames_ops/test_groupby_aggregate.py      | 139 ++++++++++++++++++
 .../tests/diff_frames_ops/test_groupby_apply.py    |  82 ++++++++++
 .../diff_frames_ops/test_groupby_cumulative.py     | 182 +++++++++++++++++++++
 .../tests/test_ops_on_diff_frames_groupby.py       | 235 ---------------------
 8 files changed, 421 insertions(+), 248 deletions(-)
 copy python/pyspark/pandas/tests/connect/diff_frames_ops/{test_parity_corrwith.py => test_parity_groupby_aggregate.py} (87%)
 copy python/pyspark/pandas/tests/connect/diff_frames_ops/{test_parity_cov.py => test_parity_groupby_apply.py} (88%)
 copy python/pyspark/pandas/tests/connect/diff_frames_ops/{test_parity_corrwith.py => test_parity_groupby_cumulative.py} (87%)
 create mode 100644 python/pyspark/pandas/tests/diff_frames_ops/test_groupby_aggregate.py
 create mode 100644 python/pyspark/pandas/tests/diff_frames_ops/test_groupby_apply.py
 create mode 100644 python/pyspark/pandas/tests/diff_frames_ops/test_groupby_cumulative.py


(spark) branch master updated: [SPARK-46317][INFRA][FOLLOWUP] Install `lxml` in PyPy3.8

2023-12-17 Thread ruifengz
This is an automated email from the ASF dual-hosted git repository.

ruifengz pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 5edd0ec696d6 [SPARK-46317][INFRA][FOLLOWUP] Install `lxml` in PyPy3.8
5edd0ec696d6 is described below

commit 5edd0ec696d6eb02a4bd8bbce086f0ad2d4b38d6
Author: Dongjoon Hyun 
AuthorDate: Mon Dec 18 10:02:34 2023 +0800

[SPARK-46317][INFRA][FOLLOWUP] Install `lxml` in PyPy3.8

### What changes were proposed in this pull request?

This PR aims to install `lxml` in PyPy3.8.

### Why are the changes needed?

#44247 seems to break the `PyPy3.8` daily CI.
- https://github.com/apache/spark/actions/runs/7239278196/job/19721018105
```
Traceback (most recent call last):
  File "/usr/local/pypy/pypy3.8/lib/pypy3.8/runpy.py", line 197, in 
_run_module_as_main
return _run_code(code, main_globals, None,
  File "/usr/local/pypy/pypy3.8/lib/pypy3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
  File "/__w/spark/spark/python/pyspark/sql/tests/test_session.py", line 
22, in 
from lxml import etree
ModuleNotFoundError: No module named 'lxml'
```

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the CIs.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #44391 from dongjoon-hyun/SPARK-46317.

Authored-by: Dongjoon Hyun 
Signed-off-by: Ruifeng Zheng 
---
 dev/infra/Dockerfile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/dev/infra/Dockerfile b/dev/infra/Dockerfile
index cade845d911b..9e81accbd3b7 100644
--- a/dev/infra/Dockerfile
+++ b/dev/infra/Dockerfile
@@ -92,7 +92,7 @@ RUN mkdir -p /usr/local/pypy/pypy3.8 && \
 ln -sf /usr/local/pypy/pypy3.8/bin/pypy /usr/local/bin/pypy3.8 && \
 ln -sf /usr/local/pypy/pypy3.8/bin/pypy /usr/local/bin/pypy3
 RUN curl -sS https://bootstrap.pypa.io/get-pip.py | pypy3
-RUN pypy3 -m pip install numpy 'six==1.16.0' 'pandas<=2.1.4' scipy coverage matplotlib
+RUN pypy3 -m pip install numpy 'six==1.16.0' 'pandas<=2.1.4' scipy coverage matplotlib lxml
 
 
 ARG BASIC_PIP_PKGS="numpy pyarrow>=14.0.0 six==1.16.0 pandas<=2.1.4 scipy unittest-xml-reporting plotly>=4.8 mlflow>=2.8.1 coverage matplotlib openpyxl memory-profiler>=0.61.0 scikit-learn>=1.3.2"


(spark) branch master updated: [SPARK-46434][INFRA] Remove `build_branch33.yml`

2023-12-17 Thread yao
This is an automated email from the ASF dual-hosted git repository.

yao pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new dff0d10d7acd [SPARK-46434][INFRA] Remove `build_branch33.yml`
dff0d10d7acd is described below

commit dff0d10d7acd7e767c6900499d8342c3d9d2777b
Author: Dongjoon Hyun 
AuthorDate: Mon Dec 18 09:51:14 2023 +0800

[SPARK-46434][INFRA] Remove `build_branch33.yml`

### What changes were proposed in this pull request?

This PR aims to remove the daily CI on `branch-3.3`.

### Why are the changes needed?

`branch-3.3` reached the End-Of-Life.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Manual review.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #44389 from dongjoon-hyun/SPARK-46434.

Authored-by: Dongjoon Hyun 
Signed-off-by: Kent Yao 
---
 .github/workflows/build_branch33.yml | 51 ---------------------------------------------------
 1 file changed, 51 deletions(-)

diff --git a/.github/workflows/build_branch33.yml b/.github/workflows/build_branch33.yml
deleted file mode 100644
index fc6ce7028fc9..000000000000
--- a/.github/workflows/build_branch33.yml
+++ /dev/null
@@ -1,51 +0,0 @@
-#
-# Licensed to the Apache Software Foundation (ASF) under one
-# or more contributor license agreements.  See the NOTICE file
-# distributed with this work for additional information
-# regarding copyright ownership.  The ASF licenses this file
-# to you under the Apache License, Version 2.0 (the
-# "License"); you may not use this file except in compliance
-# with the License.  You may obtain a copy of the License at
-#
-#   http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing,
-# software distributed under the License is distributed on an
-# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-# KIND, either express or implied.  See the License for the
-# specific language governing permissions and limitations
-# under the License.
-#
-
-name: "Build (branch-3.3, Scala 2.13, Hadoop 3, JDK 8)"
-
-on:
-  schedule:
-- cron: '0 7 * * *'
-
-jobs:
-  run-build:
-permissions:
-  packages: write
-name: Run
-uses: ./.github/workflows/build_and_test.yml
-if: github.repository == 'apache/spark'
-with:
-  java: 8
-  branch: branch-3.3
-  hadoop: hadoop3
-  envs: >-
-{
-  "SCALA_PROFILE": "scala2.13",
-  "PYTHON_TO_TEST": "",
-  "ORACLE_DOCKER_IMAGE_NAME": "gvenzl/oracle-xe:21.3.0"
-}
-  jobs: >-
-{
-  "build": "true",
-  "pyspark": "true",
-  "sparkr": "true",
-  "tpcds-1g": "true",
-  "docker-integration-tests": "true",
-  "lint" : "true"
-}


(spark) branch master updated (96bf373e9002 -> d163b319e21a)

2023-12-17 Thread kabhwan
This is an automated email from the ASF dual-hosted git repository.

kabhwan pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from 96bf373e9002 [SPARK-46430][CONNECT][TESTS] Add test for `ProtoUtils.abbreviate`
 add d163b319e21a [SPARK-46431][SS] Convert `IllegalStateException` to `internalError` in session iterators

No new revisions were added by this update.

Summary of changes:
 .../aggregate/MergingSessionsIterator.scala        |  5 +-
 .../aggregate/UpdatingSessionsIterator.scala       |  5 +-
 .../streaming/MergingSessionsIteratorSuite.scala   | 30 ++++++++-----
 .../streaming/UpdatingSessionsIteratorSuite.scala  | 58 +++++++++++-------
 4 files changed, 64 insertions(+), 34 deletions(-)
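
For reference, the conversion applied by d163b319e21a follows the pattern sketched below in Scala — a minimal sketch only; the method name and parameters are illustrative, not the actual iterator code:

```
import org.apache.spark.SparkException

// Before: an invalid iterator state surfaced as a bare IllegalStateException.
// After: it is raised via SparkException.internalError, which attaches the
// INTERNAL_ERROR error class so the failure renders consistently to users.
def assertSortedInput(currentKey: Any, previousKey: Any, isSorted: Boolean): Unit = {
  if (!isSorted) {
    throw SparkException.internalError(
      s"Input iterator is not sorted by session: current key $currentKey, previous key $previousKey")
  }
}
```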


(spark) branch master updated: [SPARK-46430][CONNECT][TESTS] Add test for `ProtoUtils.abbreviate`

2023-12-17 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 96bf373e9002 [SPARK-46430][CONNECT][TESTS] Add test for `ProtoUtils.abbreviate`
96bf373e9002 is described below

commit 96bf373e90026dac8ef5020fe3032107c11df73f
Author: Ruifeng Zheng 
AuthorDate: Sun Dec 17 13:27:09 2023 -0800

[SPARK-46430][CONNECT][TESTS] Add test for `ProtoUtils.abbreviate`

### What changes were proposed in this pull request?
Add test for `ProtoUtils.abbreviate`

### Why are the changes needed?
`ProtoUtils.abbreviate` is not tested. We are going to improve this functionality, so before that we should protect its behavior for better test coverage.
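
As a rough sketch of the behavior being pinned down (distilled from the suite below; not part of the commit itself):

```
import org.apache.spark.connect.proto
import org.apache.spark.sql.connect.common.ProtoUtils

// Build a SQL message whose query string is longer than the threshold.
val msg = proto.SQL.newBuilder().setQuery("x" * 1024).build()

// abbreviate cuts string fields at the threshold and appends a
// "[truncated ...]" marker at the cut point.
val shortened = ProtoUtils.abbreviate(msg, 16).asInstanceOf[proto.SQL]
assert(shortened.getQuery.startsWith("x" * 16))
assert(shortened.getQuery.indexOf("[truncated") == 16)
```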

### Does this PR introduce _any_ user-facing change?
No, test-only.

### How was this patch tested?
Added a unit test.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #44383 from zhengruifeng/proto_utils_test.

Authored-by: Ruifeng Zheng 
Signed-off-by: Dongjoon Hyun 
---
 .../sql/connect/messages/AbbreviateSuite.scala | 121 +++++++++++++++++++++++++
 1 file changed, 121 insertions(+)

diff --git a/connector/connect/server/src/test/scala/org/apache/spark/sql/connect/messages/AbbreviateSuite.scala b/connector/connect/server/src/test/scala/org/apache/spark/sql/connect/messages/AbbreviateSuite.scala
new file mode 100644
index 000000000000..9a712e9b7bf1
--- /dev/null
+++ b/connector/connect/server/src/test/scala/org/apache/spark/sql/connect/messages/AbbreviateSuite.scala
@@ -0,0 +1,121 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql.connect.messages
+
+import scala.jdk.CollectionConverters._
+
+import com.google.protobuf.ByteString
+
+import org.apache.spark.SparkFunSuite
+import org.apache.spark.connect.proto
+import org.apache.spark.sql.connect.common.{ProtoDataTypes, ProtoUtils}
+
+class AbbreviateSuite extends SparkFunSuite {
+
+  test("truncate string: simple SQL text") {
+val message = proto.SQL.newBuilder().setQuery("x" * 1024).build()
+
+Seq(1, 16, 256, 512, 1024, 2048).foreach { threshold =>
+  val truncated = ProtoUtils.abbreviate(message, threshold)
+  assert(truncated.isInstanceOf[proto.SQL])
+  val truncatedSQL = truncated.asInstanceOf[proto.SQL]
+
+  if (threshold < 1024) {
+assert(truncatedSQL.getQuery.indexOf("[truncated") === threshold)
+  } else {
+assert(truncatedSQL.getQuery.indexOf("[truncated") === -1)
+assert(truncatedSQL.getQuery.length === 1024)
+  }
+}
+  }
+
+  test("truncate string: nested message") {
+val sql = proto.Relation
+  .newBuilder()
+  .setSql(
+proto.SQL
+  .newBuilder()
+  .setQuery("x" * 1024)
+  .build())
+  .build()
+val drop = proto.Relation
+  .newBuilder()
+  .setDrop(
+proto.Drop
+  .newBuilder()
+  .setInput(sql)
+  .addAllColumnNames(Seq("a", "b").asJava)
+  .build())
+  .build()
+val limit = proto.Relation
+  .newBuilder()
+  .setLimit(
+proto.Limit
+  .newBuilder()
+  .setInput(drop)
+  .setLimit(100)
+  .build())
+  .build()
+
+Seq(1, 16, 256, 512, 1024, 2048).foreach { threshold =>
+  val truncated = ProtoUtils.abbreviate(limit, threshold)
+  assert(truncated.isInstanceOf[proto.Relation])
+
+  val truncatedLimit = truncated.asInstanceOf[proto.Relation].getLimit
+  assert(truncatedLimit.getLimit === 100)
+
+  val truncatedDrop = truncatedLimit.getInput.getDrop
+  assert(truncatedDrop.getColumnNamesList.asScala.toSeq === Seq("a", "b"))
+
+  val truncatedSQL = truncatedDrop.getInput.getSql
+
+  if (threshold < 1024) {
+assert(truncatedSQL.getQuery.indexOf("[truncated") === threshold)
+  } else {
+assert(truncatedSQL.getQuery.indexOf("[truncated") === -1)
+assert(truncatedSQL.getQuery.length === 1024)
+  }
+}
+  }
+
+  test("truncate bytes: simple python udf") {

(spark) branch master updated (5e1b904ca54f -> 29db77653fe0)

2023-12-17 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from 5e1b904ca54f [SPARK-46389][CORE] Manually close the `RocksDB/LevelDB` instance when `checkVersion` throw Exception
 add 29db77653fe0 [SPARK-46419][PS][TESTS][FOLLOWUP] Reorganize DatetimeIndexTests: Factor out remaining slow tests

No new revisions were added by this update.

Summary of changes:
 dev/sparktestsupport/modules.py                    |  8 ++++++++
 .../connect/indexes/test_parity_datetime_floor.py  | 41 ++++++++++++++
 .../connect/indexes/test_parity_datetime_iso.py    | 41 ++++++++++++++
 .../connect/indexes/test_parity_datetime_map.py    | 41 ++++++++++++++
 .../connect/indexes/test_parity_datetime_round.py  | 41 ++++++++++++++
 .../pyspark/pandas/tests/indexes/test_datetime.py  | 41 --------------
 .../pandas/tests/indexes/test_datetime_floor.py    | 48 ++++++++++++++++
 .../pandas/tests/indexes/test_datetime_iso.py      | 47 ++++++++++++++++
 .../pandas/tests/indexes/test_datetime_map.py      | 65 +++++++++++++++++++++
 .../pandas/tests/indexes/test_datetime_round.py    | 48 ++++++++++++++++
 10 files changed, 380 insertions(+), 41 deletions(-)
 create mode 100644 python/pyspark/pandas/tests/connect/indexes/test_parity_datetime_floor.py
 create mode 100644 python/pyspark/pandas/tests/connect/indexes/test_parity_datetime_iso.py
 create mode 100644 python/pyspark/pandas/tests/connect/indexes/test_parity_datetime_map.py
 create mode 100644 python/pyspark/pandas/tests/connect/indexes/test_parity_datetime_round.py
 create mode 100644 python/pyspark/pandas/tests/indexes/test_datetime_floor.py
 create mode 100644 python/pyspark/pandas/tests/indexes/test_datetime_iso.py
 create mode 100644 python/pyspark/pandas/tests/indexes/test_datetime_map.py
 create mode 100644 python/pyspark/pandas/tests/indexes/test_datetime_round.py


(spark) branch master updated: [SPARK-46389][CORE] Manually close the `RocksDB/LevelDB` instance when `checkVersion` throw Exception

2023-12-17 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 5e1b904ca54f [SPARK-46389][CORE] Manually close the `RocksDB/LevelDB` instance when `checkVersion` throw Exception
5e1b904ca54f is described below

commit 5e1b904ca54f8eddc5315933e43edc8bdd0d2982
Author: yangjie01 
AuthorDate: Sun Dec 17 13:22:13 2023 -0800

[SPARK-46389][CORE] Manually close the `RocksDB/LevelDB` instance when `checkVersion` throw Exception

### What changes were proposed in this pull request?
In the process of initializing the `DB` in `RocksDBProvider`/`LevelDBProvider`, there is a `checkVersion` step that may throw an exception. After the exception is thrown, the upper-level caller cannot hold the already opened `RocksDB/LevelDB` instance, so it cannot perform resource cleanup, which poses a potential risk of handle leakage. So this PR manually closes the `RocksDB/LevelDB` instance when `checkVersion` throws an exception.
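
The fix is the standard close-on-failure pattern, sketched here in Scala for illustration (the actual change is in the Java providers, shown in the diff below; the helper and types here are hypothetical):

```
import java.io.IOException

// Hypothetical handle type standing in for an opened RocksDB/LevelDB instance.
trait NativeDb extends AutoCloseable

def openAndVerify(open: () => NativeDb, checkVersion: NativeDb => Unit): NativeDb = {
  val tmpDb = open()
  try {
    checkVersion(tmpDb) // may throw
  } catch {
    case e: IOException =>
      tmpDb.close() // release the native handle before propagating
      throw e
  }
  tmpDb
}
```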

### Why are the changes needed?
The `RocksDB/LevelDB` instance should be closed when `checkVersion` throws an exception.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Pass GitHub Actions

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #44327 from LuciferYang/SPARK-46389.

Authored-by: yangjie01 
Signed-off-by: Dongjoon Hyun 
---
 .../main/java/org/apache/spark/network/util/LevelDBProvider.java | 7 ++++++-
 .../main/java/org/apache/spark/network/util/RocksDBProvider.java | 4 ++++
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/common/network-common/src/main/java/org/apache/spark/network/util/LevelDBProvider.java b/common/network-common/src/main/java/org/apache/spark/network/util/LevelDBProvider.java
index b27e3beb77ef..aa8be0c663bc 100644
--- a/common/network-common/src/main/java/org/apache/spark/network/util/LevelDBProvider.java
+++ b/common/network-common/src/main/java/org/apache/spark/network/util/LevelDBProvider.java
@@ -80,7 +80,12 @@ public class LevelDBProvider {
 }
   }
   // if there is a version mismatch, we throw an exception, which means the service is unusable
-  checkVersion(tmpDb, version, mapper);
+  try {
+checkVersion(tmpDb, version, mapper);
+  } catch (IOException ioe) {
+tmpDb.close();
+throw ioe;
+  }
 }
 return tmpDb;
   }
diff --git a/common/network-common/src/main/java/org/apache/spark/network/util/RocksDBProvider.java b/common/network-common/src/main/java/org/apache/spark/network/util/RocksDBProvider.java
index f1f702c44245..f3b7b48355a0 100644
--- a/common/network-common/src/main/java/org/apache/spark/network/util/RocksDBProvider.java
+++ b/common/network-common/src/main/java/org/apache/spark/network/util/RocksDBProvider.java
@@ -100,7 +100,11 @@ public class RocksDBProvider {
   // is unusable
   checkVersion(tmpDb, version, mapper);
 } catch (RocksDBException e) {
+  tmpDb.close();
   throw new IOException(e.getMessage(), e);
+} catch (IOException ioe) {
+  tmpDb.close();
+  throw ioe;
 }
   }
   return tmpDb;


(spark) branch master updated: [SPARK-46376][SQL][TESTS] Simplify the code to generate the Spark tarball `filename` in the `HiveExternalCatalogVersionsSuite`

2023-12-17 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 0745bb507f36 [SPARK-46376][SQL][TESTS] Simplify the code to generate the Spark tarball `filename` in the `HiveExternalCatalogVersionsSuite`

commit 0745bb507f36b8201d49d886fc5da436274e8b85
Author: yangjie01 
AuthorDate: Sun Dec 17 13:20:51 2023 -0800

[SPARK-46376][SQL][TESTS] Simplify the code to generate the Spark tarball `filename` in the `HiveExternalCatalogVersionsSuite`

### What changes were proposed in this pull request?
This PR simplifies the code used to generate the Spark tarball `filename` in `HiveExternalCatalogVersionsSuite` because the minimum tested version is Spark 3.4.

### Why are the changes needed?
Simplify the code to generate the Spark tarball `filename`

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Pass GitHub Actions

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #44307 from LuciferYang/SPARK-46376.

Authored-by: yangjie01 
Signed-off-by: Dongjoon Hyun 
---
 .../spark/sql/hive/HiveExternalCatalogVersionsSuite.scala  | 10 ++--------
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala b/sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala
index 52f20595a10a..ee2e64bc1905 100644
--- a/sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala
+++ b/sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala
@@ -40,8 +40,8 @@ import org.apache.spark.sql.catalyst.catalog.CatalogTableType
 import org.apache.spark.sql.internal.StaticSQLConf.WAREHOUSE_PATH
 import org.apache.spark.sql.test.SQLTestUtils
 import org.apache.spark.tags.{ExtendedHiveTest, SlowHiveTest}
-import org.apache.spark.util.{Utils, VersionUtils}
 import org.apache.spark.util.ArrayImplicits._
+import org.apache.spark.util.Utils
 
 /**
  * Test HiveExternalCatalog backward compatibility.
@@ -95,13 +95,7 @@ class HiveExternalCatalogVersionsSuite extends SparkSubmitTestUtils {
   mirrors.distinct :+ "https://archive.apache.org/dist" :+ PROCESS_TABLES.releaseMirror
 logInfo(s"Trying to download Spark $version from $sites")
 for (site <- sites) {
-  val filename = VersionUtils.majorMinorPatchVersion(version) match {
-    case Some((major, _, _)) if major > 3 => s"spark-$version-bin-hadoop3.tgz"
-    case Some((3, minor, _)) if minor >= 3 => s"spark-$version-bin-hadoop3.tgz"
-    case Some((3, minor, _)) if minor < 3 => s"spark-$version-bin-hadoop3.2.tgz"
-    case Some((_, _, _)) => s"spark-$version-bin-hadoop2.7.tgz"
-    case None => s"spark-$version-bin-hadoop2.7.tgz"
-  }
+  val filename = s"spark-$version-bin-hadoop3.tgz"
   val url = s"$site/spark/spark-$version/$filename"
   logInfo(s"Downloading Spark $version from $url")
   try {


Re: [PR] Add news, release note, download link for Apache Spark 3.3.4 [spark-website]

2023-12-17 Thread via GitHub


dongjoon-hyun commented on PR #496:
URL: https://github.com/apache/spark-website/pull/496#issuecomment-1859282216

   Could you update the search engine part, @gengliangwang ?

