(spark) branch master updated: [SPARK-46436][INFRA] Clean up compatibility configurations related to branch-3.3 daily testing in `build_and_test.yml`
This is an automated email from the ASF dual-hosted git repository.

ruifengz pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 78c2c2674b46 [SPARK-46436][INFRA] Clean up compatibility configurations related to branch-3.3 daily testing in `build_and_test.yml`
78c2c2674b46 is described below

commit 78c2c2674b46fcc60e781919550bf1735afc1b85
Author: yangjie01
AuthorDate: Mon Dec 18 14:07:02 2023 +0800

    [SPARK-46436][INFRA] Clean up compatibility configurations related to branch-3.3 daily testing in `build_and_test.yml`

    ### What changes were proposed in this pull request?

    This PR aims to clean up the compatibility configurations related to branch-3.3 daily testing in `build_and_test.yml`. Since Apache Spark 3.3 has reached EOL, there is no need for daily testing anymore.

    ### Why are the changes needed?

    Apache Spark 3.3 has reached EOL, so there is no need for daily testing anymore.

    ### Does this PR introduce _any_ user-facing change?

    No

    ### How was this patch tested?

    Pass GitHub Actions

    ### Was this patch authored or co-authored using generative AI tooling?

    No

    Closes #44392 from LuciferYang/3-3-daily.
Authored-by: yangjie01
Signed-off-by: Ruifeng Zheng
---
 .github/workflows/build_and_test.yml | 33 ++---
 1 file changed, 10 insertions(+), 23 deletions(-)

diff --git a/.github/workflows/build_and_test.yml b/.github/workflows/build_and_test.yml
index 27d3c86686bb..bdcb1dd1ea5c 100644
--- a/.github/workflows/build_and_test.yml
+++ b/.github/workflows/build_and_test.yml
@@ -57,11 +57,7 @@ jobs:
         GITHUB_PREV_SHA: ${{ github.event.before }}
     outputs:
       required: ${{ steps.set-outputs.outputs.required }}
-      image_url: >-
-        ${{
-          (inputs.branch == 'branch-3.3' && 'dongjoon/apache-spark-github-action-image:20220207')
-          || steps.infra-image-outputs.outputs.image_url
-        }}
+      image_url: ${{ steps.infra-image-outputs.outputs.image_url }}
     steps:
     - name: Checkout Spark repository
       uses: actions/checkout@v4
@@ -292,10 +288,9 @@ jobs:
     needs: precondition
     # Currently, enable docker build from cache for `master` and branch (since 3.4) jobs
     if: >-
-      (fromJson(needs.precondition.outputs.required).pyspark == 'true' ||
+      fromJson(needs.precondition.outputs.required).pyspark == 'true' ||
       fromJson(needs.precondition.outputs.required).lint == 'true' ||
-      fromJson(needs.precondition.outputs.required).sparkr == 'true') &&
-      (inputs.branch != 'branch-3.3')
+      fromJson(needs.precondition.outputs.required).sparkr == 'true'
     runs-on: ubuntu-latest
     permissions:
       packages: write
@@ -684,15 +679,7 @@ jobs:
     - name: Java linter
       run: ./dev/lint-java
     - name: Spark connect jvm client mima check
-      if: inputs.branch != 'branch-3.3'
       run: ./dev/connect-jvm-client-mima-check
-    - name: Install Python linter dependencies for branch-3.3
-      if: inputs.branch == 'branch-3.3'
-      run: |
-        # SPARK-44554: Copy from https://github.com/apache/spark/blob/073d0b60d31bf68ebacdc005f59b928a5902670f/.github/workflows/build_and_test.yml#L501-L508
-        # Should delete this section after SPARK 3.3 EOL.
-        python3.9 -m pip install 'flake8==3.9.0' pydata_sphinx_theme 'mypy==0.920' 'pytest==7.1.3' 'pytest-mypy-plugins==1.9.3' numpydoc 'jinja2<3.0.0' 'black==21.12b0'
-        python3.9 -m pip install 'pandas-stubs==1.2.0.53' ipython
     - name: Install Python linter dependencies for branch-3.4
       if: inputs.branch == 'branch-3.4'
       run: |
@@ -708,7 +695,7 @@ jobs:
         python3.9 -m pip install 'flake8==3.9.0' pydata_sphinx_theme 'mypy==0.982' 'pytest==7.1.3' 'pytest-mypy-plugins==1.9.3' numpydoc 'jinja2<3.0.0' 'black==22.6.0'
         python3.9 -m pip install 'pandas-stubs==1.2.0.53' ipython 'grpcio==1.59.3' 'grpc-stubs==1.24.11' 'googleapis-common-protos-stubs==2.2.0'
     - name: Install Python linter dependencies
-      if: inputs.branch != 'branch-3.3' && inputs.branch != 'branch-3.4' && inputs.branch != 'branch-3.5'
+      if: inputs.branch != 'branch-3.4' && inputs.branch != 'branch-3.5'
       run: |
         python3.9 -m pip install 'flake8==3.9.0' pydata_sphinx_theme 'mypy==0.982' 'pytest==7.1.3' 'pytest-mypy-plugins==1.9.3' numpydoc jinja2 'black==23.9.1'
         python3.9 -m pip install 'pandas-stubs==1.2.0.53' ipython 'grpcio==1.59.3' 'grpc-stubs==1.24.11' 'googleapis-common-protos-stubs==2.2.0'
@@ -729,16 +716,16 @@ jobs:
       if: inputs.branch == 'branch-3.5'
       run: if test -f ./dev/connect-check-protos.py; then PATH=$PATH:$HOME/buf/bin PYTHON_EXECUTABLE=python3.9 ./dev/connect-check-protos.py; fi
       # Should delete this section after SPARK 3.5 EOL.
-    - name: Install JavaScript linter dependencies
(spark) branch master updated: [SPARK-46435][INFRA] Exclude `branch-3.3` from `publish_snapshot.yml`
This is an automated email from the ASF dual-hosted git repository.

yangjie01 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 51dd7268f895 [SPARK-46435][INFRA] Exclude `branch-3.3` from `publish_snapshot.yml`
51dd7268f895 is described below

commit 51dd7268f895032a986a163318ae4fd31891a8a3
Author: Dongjoon Hyun
AuthorDate: Mon Dec 18 13:53:10 2023 +0800

    [SPARK-46435][INFRA] Exclude `branch-3.3` from `publish_snapshot.yml`

    ### What changes were proposed in this pull request?

    This PR aims to exclude `branch-3.3` from `publish_snapshot.yml`.

    ### Why are the changes needed?

    We don't need to publish the snapshot of branch-3.3 because it reached the End-Of-Life status.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Manual review.

    ### Was this patch authored or co-authored using generative AI tooling?

    No.

    Closes #44390 from dongjoon-hyun/SPARK-46435.
Authored-by: Dongjoon Hyun
Signed-off-by: yangjie01
---
 .github/workflows/publish_snapshot.yml | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/.github/workflows/publish_snapshot.yml b/.github/workflows/publish_snapshot.yml
index ff51887e0df2..735c0f3d180b 100644
--- a/.github/workflows/publish_snapshot.yml
+++ b/.github/workflows/publish_snapshot.yml
@@ -28,7 +28,7 @@ on:
         description: 'list of branches to publish (JSON)'
         required: true
         # keep in sync with default value of strategy matrix 'branch'
-        default: '["master", "branch-3.5", "branch-3.4", "branch-3.3"]'
+        default: '["master", "branch-3.5", "branch-3.4"]'

 jobs:
   publish-snapshot:
@@ -38,7 +38,7 @@ jobs:
       fail-fast: false
       matrix:
         # keep in sync with default value of workflow_dispatch input 'branch'
-        branch: ${{ fromJSON( inputs.branch || '["master", "branch-3.5", "branch-3.4", "branch-3.3"]' ) }}
+        branch: ${{ fromJSON( inputs.branch || '["master", "branch-3.5", "branch-3.4"]' ) }}
     steps:
     - name: Checkout Spark repository
       uses: actions/checkout@v4
@@ -52,13 +52,13 @@ jobs:
           restore-keys: |
             snapshot-maven-
     - name: Install Java 8 for branch-3.x
-      if: matrix.branch == 'branch-3.5' || matrix.branch == 'branch-3.4' || matrix.branch == 'branch-3.3'
+      if: matrix.branch == 'branch-3.5' || matrix.branch == 'branch-3.4'
       uses: actions/setup-java@v4
       with:
         distribution: temurin
         java-version: 8
     - name: Install Java 17
-      if: matrix.branch != 'branch-3.5' && matrix.branch != 'branch-3.4' && matrix.branch != 'branch-3.3'
+      if: matrix.branch != 'branch-3.5' && matrix.branch != 'branch-3.4'
       uses: actions/setup-java@v4
       with:
         distribution: temurin

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
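The `${{ fromJSON( inputs.branch || '[...]' ) }}` expression in the patch above implements a fallback: a `workflow_dispatch` run that supplies its own JSON list overrides the matrix, while a run with no input falls back to the hard-coded default. A minimal Python sketch of the same resolution logic (the function name is illustrative, not part of the workflow):

```python
import json
from typing import Optional

# Default mirrors the patched workflow: branch-3.3 is no longer published.
DEFAULT_BRANCHES = '["master", "branch-3.5", "branch-3.4"]'

def resolve_matrix(branch_input: Optional[str]) -> list:
    """Mimic `fromJSON( inputs.branch || default )`: an empty or missing
    dispatch input falls back to the default JSON branch list."""
    return json.loads(branch_input or DEFAULT_BRANCHES)

# A manual dispatch can still publish any list, including branch-3.3;
# only the *default* set changed in this commit.
assert resolve_matrix(None) == ["master", "branch-3.5", "branch-3.4"]
assert resolve_matrix('["branch-3.3"]') == ["branch-3.3"]
```

This is why the commit edits the default in two places ("keep in sync" comments): both the input's default value and the matrix fallback must agree.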
(spark) branch master updated (5edd0ec696d6 -> 1b413a4bd61e)
This is an automated email from the ASF dual-hosted git repository.

ruifengz pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

    from 5edd0ec696d6 [SPARK-46317][INFRA][FOLLOWUP] Install `lxml` in PyPy3.8
     add 1b413a4bd61e [SPARK-46433][PS][TESTS] Reorganize `OpsOnDiffFramesGroupByTests`: Factor out `test_aggregate/apply/cum*`

No new revisions were added by this update.

Summary of changes:
 dev/sparktestsupport/modules.py                    |   6 +
 ...orrwith.py => test_parity_groupby_aggregate.py} |   8 +-
 ..._parity_cov.py => test_parity_groupby_apply.py} |   9 +-
 ...rrwith.py => test_parity_groupby_cumulative.py} |   8 +-
 .../diff_frames_ops/test_groupby_aggregate.py      | 139
 .../tests/diff_frames_ops/test_groupby_apply.py    |  82 +++
 .../diff_frames_ops/test_groupby_cumulative.py     | 182
 .../tests/test_ops_on_diff_frames_groupby.py       | 235 -
 8 files changed, 421 insertions(+), 248 deletions(-)
 copy python/pyspark/pandas/tests/connect/diff_frames_ops/{test_parity_corrwith.py => test_parity_groupby_aggregate.py} (87%)
 copy python/pyspark/pandas/tests/connect/diff_frames_ops/{test_parity_cov.py => test_parity_groupby_apply.py} (88%)
 copy python/pyspark/pandas/tests/connect/diff_frames_ops/{test_parity_corrwith.py => test_parity_groupby_cumulative.py} (87%)
 create mode 100644 python/pyspark/pandas/tests/diff_frames_ops/test_groupby_aggregate.py
 create mode 100644 python/pyspark/pandas/tests/diff_frames_ops/test_groupby_apply.py
 create mode 100644 python/pyspark/pandas/tests/diff_frames_ops/test_groupby_cumulative.py
(spark) branch master updated: [SPARK-46317][INFRA][FOLLOWUP] Install `lxml` in PyPy3.8
This is an automated email from the ASF dual-hosted git repository.

ruifengz pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 5edd0ec696d6 [SPARK-46317][INFRA][FOLLOWUP] Install `lxml` in PyPy3.8
5edd0ec696d6 is described below

commit 5edd0ec696d6eb02a4bd8bbce086f0ad2d4b38d6
Author: Dongjoon Hyun
AuthorDate: Mon Dec 18 10:02:34 2023 +0800

    [SPARK-46317][INFRA][FOLLOWUP] Install `lxml` in PyPy3.8

    ### What changes were proposed in this pull request?

    This PR aims to install `lxml` in PyPy3.8.

    ### Why are the changes needed?

    #44247 seems to break the `PyPy3.8` daily CI.
    - https://github.com/apache/spark/actions/runs/7239278196/job/19721018105

    ```
    Traceback (most recent call last):
      File "/usr/local/pypy/pypy3.8/lib/pypy3.8/runpy.py", line 197, in _run_module_as_main
        return _run_code(code, main_globals, None,
      File "/usr/local/pypy/pypy3.8/lib/pypy3.8/runpy.py", line 87, in _run_code
        exec(code, run_globals)
      File "/__w/spark/spark/python/pyspark/sql/tests/test_session.py", line 22, in <module>
        from lxml import etree
    ModuleNotFoundError: No module named 'lxml'
    ```

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Pass the CIs.

    ### Was this patch authored or co-authored using generative AI tooling?

    No.

    Closes #44391 from dongjoon-hyun/SPARK-46317.
Authored-by: Dongjoon Hyun
Signed-off-by: Ruifeng Zheng
---
 dev/infra/Dockerfile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/dev/infra/Dockerfile b/dev/infra/Dockerfile
index cade845d911b..9e81accbd3b7 100644
--- a/dev/infra/Dockerfile
+++ b/dev/infra/Dockerfile
@@ -92,7 +92,7 @@ RUN mkdir -p /usr/local/pypy/pypy3.8 && \
     ln -sf /usr/local/pypy/pypy3.8/bin/pypy /usr/local/bin/pypy3.8 && \
     ln -sf /usr/local/pypy/pypy3.8/bin/pypy /usr/local/bin/pypy3
 RUN curl -sS https://bootstrap.pypa.io/get-pip.py | pypy3
-RUN pypy3 -m pip install numpy 'six==1.16.0' 'pandas<=2.1.4' scipy coverage matplotlib
+RUN pypy3 -m pip install numpy 'six==1.16.0' 'pandas<=2.1.4' scipy coverage matplotlib lxml
 
 ARG BASIC_PIP_PKGS="numpy pyarrow>=14.0.0 six==1.16.0 pandas<=2.1.4 scipy unittest-xml-reporting plotly>=4.8 mlflow>=2.8.1 coverage matplotlib openpyxl memory-profiler>=0.61.0 scikit-learn>=1.3.2"
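The failure mode in the traceback above is a hard `from lxml import etree` at module import time: any interpreter image without the package fails before a single test runs. Installing `lxml` in the image is one fix; the other common pattern is to guard the optional dependency in the test module itself. A minimal sketch of such a guard (hypothetical test class, not the actual PySpark test code):

```python
import importlib.util
import unittest

# Probe for the optional dependency WITHOUT importing it, so the module
# still loads on images (like the PyPy3.8 one above) where it is absent.
have_lxml = importlib.util.find_spec("lxml") is not None

@unittest.skipUnless(have_lxml, "lxml is not installed")
class XmlReportTest(unittest.TestCase):
    def test_parse(self):
        # Safe: this import only runs when lxml is actually present.
        from lxml import etree
        root = etree.fromstring(b"<spark><job id='1'/></spark>")
        self.assertEqual(root.find("job").get("id"), "1")
```

With the guard, a missing `lxml` produces a skipped test instead of a `ModuleNotFoundError` collection failure.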
(spark) branch master updated: [SPARK-46434][INFRA] Remove `build_branch33.yml`
This is an automated email from the ASF dual-hosted git repository.

yao pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new dff0d10d7acd [SPARK-46434][INFRA] Remove `build_branch33.yml`
dff0d10d7acd is described below

commit dff0d10d7acd7e767c6900499d8342c3d9d2777b
Author: Dongjoon Hyun
AuthorDate: Mon Dec 18 09:51:14 2023 +0800

    [SPARK-46434][INFRA] Remove `build_branch33.yml`

    ### What changes were proposed in this pull request?

    This PR aims to remove the daily CI on `branch-3.3`.

    ### Why are the changes needed?

    `branch-3.3` reached End-Of-Life.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Manual review.

    ### Was this patch authored or co-authored using generative AI tooling?

    No.

    Closes #44389 from dongjoon-hyun/SPARK-46434.

Authored-by: Dongjoon Hyun
Signed-off-by: Kent Yao
---
 .github/workflows/build_branch33.yml | 51
 1 file changed, 51 deletions(-)

diff --git a/.github/workflows/build_branch33.yml b/.github/workflows/build_branch33.yml
deleted file mode 100644
index fc6ce7028fc9..000000000000
--- a/.github/workflows/build_branch33.yml
+++ /dev/null
@@ -1,51 +0,0 @@
-#
-# Licensed to the Apache Software Foundation (ASF) under one
-# or more contributor license agreements.  See the NOTICE file
-# distributed with this work for additional information
-# regarding copyright ownership.  The ASF licenses this file
-# to you under the Apache License, Version 2.0 (the
-# "License"); you may not use this file except in compliance
-# with the License.  You may obtain a copy of the License at
-#
-#   http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing,
-# software distributed under the License is distributed on an
-# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-# KIND, either express or implied.  See the License for the
-# specific language governing permissions and limitations
-# under the License.
-#
-
-name: "Build (branch-3.3, Scala 2.13, Hadoop 3, JDK 8)"
-
-on:
-  schedule:
-    - cron: '0 7 * * *'
-
-jobs:
-  run-build:
-    permissions:
-      packages: write
-    name: Run
-    uses: ./.github/workflows/build_and_test.yml
-    if: github.repository == 'apache/spark'
-    with:
-      java: 8
-      branch: branch-3.3
-      hadoop: hadoop3
-      envs: >-
-        {
-          "SCALA_PROFILE": "scala2.13",
-          "PYTHON_TO_TEST": "",
-          "ORACLE_DOCKER_IMAGE_NAME": "gvenzl/oracle-xe:21.3.0"
-        }
-      jobs: >-
-        {
-          "build": "true",
-          "pyspark": "true",
-          "sparkr": "true",
-          "tpcds-1g": "true",
-          "docker-integration-tests": "true",
-          "lint" : "true"
-        }
(spark) branch master updated (96bf373e9002 -> d163b319e21a)
This is an automated email from the ASF dual-hosted git repository.

kabhwan pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

    from 96bf373e9002 [SPARK-46430][CONNECT][TESTS] Add test for `ProtoUtils.abbreviate`
     add d163b319e21a [SPARK-46431][SS] Convert `IllegalStateException` to `internalError` in session iterators

No new revisions were added by this update.

Summary of changes:
 .../aggregate/MergingSessionsIterator.scala        |  5 +-
 .../aggregate/UpdatingSessionsIterator.scala       |  5 +-
 .../streaming/MergingSessionsIteratorSuite.scala   | 30 +++
 .../streaming/UpdatingSessionsIteratorSuite.scala  | 58 ++
 4 files changed, 64 insertions(+), 34 deletions(-)
(spark) branch master updated: [SPARK-46430][CONNECT][TESTS] Add test for `ProtoUtils.abbreviate`
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 96bf373e9002 [SPARK-46430][CONNECT][TESTS] Add test for `ProtoUtils.abbreviate`
96bf373e9002 is described below

commit 96bf373e90026dac8ef5020fe3032107c11df73f
Author: Ruifeng Zheng
AuthorDate: Sun Dec 17 13:27:09 2023 -0800

    [SPARK-46430][CONNECT][TESTS] Add test for `ProtoUtils.abbreviate`

    ### What changes were proposed in this pull request?

    Add a test for `ProtoUtils.abbreviate`.

    ### Why are the changes needed?

    `ProtoUtils.abbreviate` is not tested. For better test coverage, we are going to improve this functionality; before that, we should protect its behavior.

    ### Does this PR introduce _any_ user-facing change?

    No, test-only.

    ### How was this patch tested?

    Added UT.

    ### Was this patch authored or co-authored using generative AI tooling?

    No.

    Closes #44383 from zhengruifeng/proto_utils_test.

Authored-by: Ruifeng Zheng
Signed-off-by: Dongjoon Hyun
---
 .../sql/connect/messages/AbbreviateSuite.scala | 121 +
 1 file changed, 121 insertions(+)

diff --git a/connector/connect/server/src/test/scala/org/apache/spark/sql/connect/messages/AbbreviateSuite.scala b/connector/connect/server/src/test/scala/org/apache/spark/sql/connect/messages/AbbreviateSuite.scala
new file mode 100644
index 000000000000..9a712e9b7bf1
--- /dev/null
+++ b/connector/connect/server/src/test/scala/org/apache/spark/sql/connect/messages/AbbreviateSuite.scala
@@ -0,0 +1,121 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql.connect.messages
+
+import scala.jdk.CollectionConverters._
+
+import com.google.protobuf.ByteString
+
+import org.apache.spark.SparkFunSuite
+import org.apache.spark.connect.proto
+import org.apache.spark.sql.connect.common.{ProtoDataTypes, ProtoUtils}
+
+class AbbreviateSuite extends SparkFunSuite {
+
+  test("truncate string: simple SQL text") {
+    val message = proto.SQL.newBuilder().setQuery("x" * 1024).build()
+
+    Seq(1, 16, 256, 512, 1024, 2048).foreach { threshold =>
+      val truncated = ProtoUtils.abbreviate(message, threshold)
+      assert(truncated.isInstanceOf[proto.SQL])
+      val truncatedSQL = truncated.asInstanceOf[proto.SQL]
+
+      if (threshold < 1024) {
+        assert(truncatedSQL.getQuery.indexOf("[truncated") === threshold)
+      } else {
+        assert(truncatedSQL.getQuery.indexOf("[truncated") === -1)
+        assert(truncatedSQL.getQuery.length === 1024)
+      }
+    }
+  }
+
+  test("truncate string: nested message") {
+    val sql = proto.Relation
+      .newBuilder()
+      .setSql(
+        proto.SQL
+          .newBuilder()
+          .setQuery("x" * 1024)
+          .build())
+      .build()
+    val drop = proto.Relation
+      .newBuilder()
+      .setDrop(
+        proto.Drop
+          .newBuilder()
+          .setInput(sql)
+          .addAllColumnNames(Seq("a", "b").asJava)
+          .build())
+      .build()
+    val limit = proto.Relation
+      .newBuilder()
+      .setLimit(
+        proto.Limit
+          .newBuilder()
+          .setInput(drop)
+          .setLimit(100)
+          .build())
+      .build()
+
+    Seq(1, 16, 256, 512, 1024, 2048).foreach { threshold =>
+      val truncated = ProtoUtils.abbreviate(limit, threshold)
+      assert(truncated.isInstanceOf[proto.Relation])
+
+      val truncatedLimit = truncated.asInstanceOf[proto.Relation].getLimit
+      assert(truncatedLimit.getLimit === 100)
+
+      val truncatedDrop = truncatedLimit.getInput.getDrop
+      assert(truncatedDrop.getColumnNamesList.asScala.toSeq === Seq("a", "b"))
+
+      val truncatedSQL = truncatedDrop.getInput.getSql
+
+      if (threshold < 1024) {
+        assert(truncatedSQL.getQuery.indexOf("[truncated") === threshold)
+      } else {
+        assert(truncatedSQL.getQuery.indexOf("[truncated") === -1)
+        assert(truncatedSQL.getQuery.length === 1024)
+      }
+    }
+  }
+
+  test("truncate bytes: simple python udf") {
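The behavior the suite above pins down — cut an over-long string field at a threshold and append a marker that starts with `[truncated` — can be sketched in a few lines of Python. This is a hypothetical stand-in for the string case only, not the actual `ProtoUtils.abbreviate` implementation (which walks whole protobuf messages), and the exact marker text is an assumption; the tests only assert the `[truncated` prefix:

```python
def abbreviate(text: str, threshold: int) -> str:
    """Truncate `text` at `threshold` characters, appending a marker.

    Mirrors what the suite asserts: below the threshold the string is
    cut and the marker starts exactly at index `threshold`; at or above
    the threshold the string comes back unchanged.
    """
    if len(text) <= threshold:
        return text
    return text[:threshold] + f"[truncated, size={len(text)}]"

# Replays the suite's loop over thresholds against a 1024-char query.
query = "x" * 1024
for threshold in (1, 16, 256, 512, 1024, 2048):
    out = abbreviate(query, threshold)
    if threshold < 1024:
        assert out.index("[truncated") == threshold
    else:
        assert "[truncated" not in out
        assert len(out) == 1024
```

The nested-message test then just checks that this truncation is applied recursively (through `Limit` → `Drop` → `SQL`) while non-string fields like `limit = 100` and the column-name list pass through untouched.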
(spark) branch master updated (5e1b904ca54f -> 29db77653fe0)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

    from 5e1b904ca54f [SPARK-46389][CORE] Manually close the `RocksDB/LevelDB` instance when `checkVersion` throw Exception
     add 29db77653fe0 [SPARK-46419][PS][TESTS][FOLLOWUP] Reorganize DatetimeIndexTests: Factor out remaining slow tests

No new revisions were added by this update.

Summary of changes:
 dev/sparktestsupport/modules.py                        |  8 +++
 .../connect/indexes/test_parity_datetime_floor.py      | 41 ++
 .../connect/indexes/test_parity_datetime_iso.py        | 41 ++
 .../connect/indexes/test_parity_datetime_map.py        | 41 ++
 .../connect/indexes/test_parity_datetime_round.py      | 41 ++
 .../pyspark/pandas/tests/indexes/test_datetime.py      | 41 --
 .../pandas/tests/indexes/test_datetime_floor.py        | 48
 .../pandas/tests/indexes/test_datetime_iso.py          | 47
 .../pandas/tests/indexes/test_datetime_map.py          | 65 ++
 .../pandas/tests/indexes/test_datetime_round.py        | 48
 10 files changed, 380 insertions(+), 41 deletions(-)
 create mode 100644 python/pyspark/pandas/tests/connect/indexes/test_parity_datetime_floor.py
 create mode 100644 python/pyspark/pandas/tests/connect/indexes/test_parity_datetime_iso.py
 create mode 100644 python/pyspark/pandas/tests/connect/indexes/test_parity_datetime_map.py
 create mode 100644 python/pyspark/pandas/tests/connect/indexes/test_parity_datetime_round.py
 create mode 100644 python/pyspark/pandas/tests/indexes/test_datetime_floor.py
 create mode 100644 python/pyspark/pandas/tests/indexes/test_datetime_iso.py
 create mode 100644 python/pyspark/pandas/tests/indexes/test_datetime_map.py
 create mode 100644 python/pyspark/pandas/tests/indexes/test_datetime_round.py
(spark) branch master updated: [SPARK-46389][CORE] Manually close the `RocksDB/LevelDB` instance when `checkVersion` throw Exception
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 5e1b904ca54f [SPARK-46389][CORE] Manually close the `RocksDB/LevelDB` instance when `checkVersion` throw Exception
5e1b904ca54f is described below

commit 5e1b904ca54f8eddc5315933e43edc8bdd0d2982
Author: yangjie01
AuthorDate: Sun Dec 17 13:22:13 2023 -0800

    [SPARK-46389][CORE] Manually close the `RocksDB/LevelDB` instance when `checkVersion` throw Exception

    ### What changes were proposed in this pull request?

    In the process of initializing the `DB` in `RocksDBProvider/LevelDBProvider`, there is a `checkVersion` step that may throw an exception. After the exception is thrown, the upper-level caller cannot hold the already opened `RocksDB/LevelDB` instance, so it cannot perform resource cleanup, which poses a potential risk of handle leakage. So this PR manually closes the `RocksDB/LevelDB` instance when `checkVersion` throws an exception.

    ### Why are the changes needed?

    Should close the `RocksDB/LevelDB` instance when `checkVersion` throws an exception.

    ### Does this PR introduce _any_ user-facing change?

    No

    ### How was this patch tested?

    Pass GitHub Actions

    ### Was this patch authored or co-authored using generative AI tooling?

    No

    Closes #44327 from LuciferYang/SPARK-46389.
Authored-by: yangjie01
Signed-off-by: Dongjoon Hyun
---
 .../main/java/org/apache/spark/network/util/LevelDBProvider.java | 7 ++-
 .../main/java/org/apache/spark/network/util/RocksDBProvider.java | 4
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/common/network-common/src/main/java/org/apache/spark/network/util/LevelDBProvider.java b/common/network-common/src/main/java/org/apache/spark/network/util/LevelDBProvider.java
index b27e3beb77ef..aa8be0c663bc 100644
--- a/common/network-common/src/main/java/org/apache/spark/network/util/LevelDBProvider.java
+++ b/common/network-common/src/main/java/org/apache/spark/network/util/LevelDBProvider.java
@@ -80,7 +80,12 @@ public class LevelDBProvider {
         }
       }
       // if there is a version mismatch, we throw an exception, which means the service is unusable
-      checkVersion(tmpDb, version, mapper);
+      try {
+        checkVersion(tmpDb, version, mapper);
+      } catch (IOException ioe) {
+        tmpDb.close();
+        throw ioe;
+      }
     }
     return tmpDb;
   }
diff --git a/common/network-common/src/main/java/org/apache/spark/network/util/RocksDBProvider.java b/common/network-common/src/main/java/org/apache/spark/network/util/RocksDBProvider.java
index f1f702c44245..f3b7b48355a0 100644
--- a/common/network-common/src/main/java/org/apache/spark/network/util/RocksDBProvider.java
+++ b/common/network-common/src/main/java/org/apache/spark/network/util/RocksDBProvider.java
@@ -100,7 +100,11 @@ public class RocksDBProvider {
         // is unusable
         checkVersion(tmpDb, version, mapper);
       } catch (RocksDBException e) {
+        tmpDb.close();
         throw new IOException(e.getMessage(), e);
+      } catch (IOException ioe) {
+        tmpDb.close();
+        throw ioe;
       }
     }
     return tmpDb;
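The leak the patch above fixes is a general pattern: a resource is acquired, then a validation step runs, and if validation throws, the exception propagates before the caller ever receives the handle — so nobody can close it. The fix is to close inside the acquiring function on the failure path. A minimal self-contained sketch (the `RawDb`/`check_version` names are illustrative, not Spark APIs):

```python
class RawDb:
    """Stand-in for a native key-value store handle (hypothetical)."""
    open_handles = 0  # tracks un-closed handles, to make the leak visible

    def __init__(self):
        RawDb.open_handles += 1
        self.version = 1

    def close(self):
        RawDb.open_handles -= 1


def check_version(db, expected):
    """Validation step that may throw after the handle is already open."""
    if db.version != expected:
        raise IOError(f"version mismatch: {db.version} != {expected}")


def open_db(expected_version):
    db = RawDb()  # resource acquired here
    try:
        check_version(db, expected_version)  # may raise post-acquisition
    except IOError:
        # Mirrors the patch: close the handle before re-raising, because
        # the caller never sees `db` and so can never clean it up.
        db.close()
        raise
    return db
```

Without the `try/except`, a failed version check would leave `open_handles` at 1 forever; with it, the handle count returns to 0 and the exception still reaches the caller. (In Python a `with`-style context manager achieves the same; Java, as in the patch, has to do it explicitly where try-with-resources doesn't fit the ownership transfer.)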
(spark) branch master updated: [SPARK-46376][SQL][TESTS] Simplify the code to generate the Spark tarball `filename` in the `HiveExternalCatalogVersionsSuite`
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 0745bb507f36 [SPARK-46376][SQL][TESTS] Simplify the code to generate the Spark tarball `filename` in the `HiveExternalCatalogVersionsSuite`
0745bb507f36 is described below

commit 0745bb507f36b8201d49d886fc5da436274e8b85
Author: yangjie01
AuthorDate: Sun Dec 17 13:20:51 2023 -0800

    [SPARK-46376][SQL][TESTS] Simplify the code to generate the Spark tarball `filename` in the `HiveExternalCatalogVersionsSuite`

    ### What changes were proposed in this pull request?

    This PR simplifies the code used to generate the Spark tarball `filename` in `HiveExternalCatalogVersionsSuite` because the minimum tested version is Spark 3.4.

    ### Why are the changes needed?

    Simplify the code to generate the Spark tarball `filename`.

    ### Does this PR introduce _any_ user-facing change?

    No

    ### How was this patch tested?

    Pass GitHub Actions

    ### Was this patch authored or co-authored using generative AI tooling?

    No

    Closes #44307 from LuciferYang/SPARK-46376.
Authored-by: yangjie01
Signed-off-by: Dongjoon Hyun
---
 .../spark/sql/hive/HiveExternalCatalogVersionsSuite.scala | 10 ++
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala b/sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala
index 52f20595a10a..ee2e64bc1905 100644
--- a/sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala
+++ b/sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala
@@ -40,8 +40,8 @@ import org.apache.spark.sql.catalyst.catalog.CatalogTableType
 import org.apache.spark.sql.internal.StaticSQLConf.WAREHOUSE_PATH
 import org.apache.spark.sql.test.SQLTestUtils
 import org.apache.spark.tags.{ExtendedHiveTest, SlowHiveTest}
-import org.apache.spark.util.{Utils, VersionUtils}
 import org.apache.spark.util.ArrayImplicits._
+import org.apache.spark.util.Utils
 
 /**
  * Test HiveExternalCatalog backward compatibility.
@@ -95,13 +95,7 @@ class HiveExternalCatalogVersionsSuite extends SparkSubmitTestUtils {
       mirrors.distinct :+ "https://archive.apache.org/dist" :+ PROCESS_TABLES.releaseMirror
     logInfo(s"Trying to download Spark $version from $sites")
     for (site <- sites) {
-      val filename = VersionUtils.majorMinorPatchVersion(version) match {
-        case Some((major, _, _)) if major > 3 => s"spark-$version-bin-hadoop3.tgz"
-        case Some((3, minor, _)) if minor >= 3 => s"spark-$version-bin-hadoop3.tgz"
-        case Some((3, minor, _)) if minor < 3 => s"spark-$version-bin-hadoop3.2.tgz"
-        case Some((_, _, _)) => s"spark-$version-bin-hadoop2.7.tgz"
-        case None => s"spark-$version-bin-hadoop2.7.tgz"
-      }
+      val filename = s"spark-$version-bin-hadoop3.tgz"
       val url = s"$site/spark/spark-$version/$filename"
       logInfo(s"Downloading Spark $version from $url")
       try {
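The removed Scala `match` above encoded the tarball-naming history across Spark versions. A Python transcription of that old logic (function name illustrative) makes the simplification easy to see:

```python
def tarball_filename(version: str) -> str:
    """Old (pre-SPARK-46376) filename logic, transcribed from Scala:

      major > 3          -> spark-<v>-bin-hadoop3.tgz
      3.3+               -> spark-<v>-bin-hadoop3.tgz
      3.0 .. 3.2         -> spark-<v>-bin-hadoop3.2.tgz
      older / unparsable -> spark-<v>-bin-hadoop2.7.tgz
    """
    parts = version.split(".")
    try:
        major, minor = int(parts[0]), int(parts[1])
    except (IndexError, ValueError):
        return f"spark-{version}-bin-hadoop2.7.tgz"
    if major > 3 or (major == 3 and minor >= 3):
        return f"spark-{version}-bin-hadoop3.tgz"
    if major == 3:
        return f"spark-{version}-bin-hadoop3.2.tgz"
    return f"spark-{version}-bin-hadoop2.7.tgz"

# With Spark 3.4 as the minimum tested version, every input now takes
# the first branch, so the whole function collapses to one string --
# exactly the `s"spark-$version-bin-hadoop3.tgz"` line in the patch.
assert tarball_filename("3.4.2") == "spark-3.4.2-bin-hadoop3.tgz"
assert tarball_filename("3.2.1") == "spark-3.2.1-bin-hadoop3.2.tgz"
```

Dropping the `match` also removes the suite's last use of `VersionUtils`, which is why the import list shrinks in the same diff.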
Re: [PR] Add news, release note, download link for Apache Spark 3.3.4 [spark-website]
dongjoon-hyun commented on PR #496:
URL: https://github.com/apache/spark-website/pull/496#issuecomment-1859282216

    Could you update the search engine part, @gengliangwang ?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above
to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org