This is an automated email from the ASF dual-hosted git repository.
dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new d39e5f1a55d3 [SPARK-50181][INFRA][DOCS] Remove `run-tests-jenkins`-related stuff
d39e5f1a55d3 is described below
commit d39e5f1a55d32a8d2cd515cb2dea5eeb85a8be93
Author: Dongjoon Hyun <[email protected]>
AuthorDate: Wed Oct 30 20:24:41 2024 -0700
[SPARK-50181][INFRA][DOCS] Remove `run-tests-jenkins`-related stuff
### What changes were proposed in this pull request?
This PR aims to remove `run-tests-jenkins`-related stuff.
### Why are the changes needed?
The Apache Spark community has finished unifying all CIs into a single `GitHub Action`-based one.
We no longer use `Jenkins`-, `AppVeyor`-, or `Scaleway`-based CIs.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Manual review.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #48713 from dongjoon-hyun/SPARK-50181.
Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
---
.github/labeler.yml | 6 +-
dev/run-tests-jenkins | 37 -------
dev/run-tests-jenkins.py | 236 -----------------------------------------
dev/tests/pr_merge_ability.sh | 39 -------
dev/tests/pr_public_classes.sh | 73 -------------
docs/building-spark.md | 32 ------
6 files changed, 2 insertions(+), 421 deletions(-)
diff --git a/.github/labeler.yml b/.github/labeler.yml
index 5b5564418724..6617acbf9187 100644
--- a/.github/labeler.yml
+++ b/.github/labeler.yml
@@ -26,16 +26,14 @@ INFRA:
'.asf.yaml',
'.gitattributes',
'.gitignore',
- 'dev/merge_spark_pr.py',
- 'dev/run-tests-jenkins*'
+ 'dev/merge_spark_pr.py'
]
BUILD:
- changed-files:
- all-globs-to-any-file: [
'dev/**/*',
- '!dev/merge_spark_pr.py',
- '!dev/run-tests-jenkins*'
+ '!dev/merge_spark_pr.py'
]
- any-glob-to-any-file: [
'build/**/*',
diff --git a/dev/run-tests-jenkins b/dev/run-tests-jenkins
deleted file mode 100755
index c5bf160380b5..000000000000
--- a/dev/run-tests-jenkins
+++ /dev/null
@@ -1,37 +0,0 @@
-#!/usr/bin/env bash
-
-#
-# Licensed to the Apache Software Foundation (ASF) under one or more
-# contributor license agreements. See the NOTICE file distributed with
-# this work for additional information regarding copyright ownership.
-# The ASF licenses this file to You under the Apache License, Version 2.0
-# (the "License"); you may not use this file except in compliance with
-# the License. You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-#
-
-# Wrapper script that runs the Spark tests then reports QA results
-# to github via its API.
-# Environment variables are populated by the code here:
-# https://github.com/jenkinsci/ghprb-plugin/blob/master/src/main/java/org/jenkinsci/plugins/ghprb/GhprbTrigger.java#L139
-
-FWDIR="$( cd "$( dirname "$0" )/.." && pwd )"
-cd "$FWDIR"
-
-export PATH=/home/anaconda/envs/py36/bin:$PATH
-export LANG="en_US.UTF-8"
-
-PYTHON_VERSION_CHECK=$(python3 -c 'import sys; print(sys.version_info < (3, 8, 0))')
-if [[ "$PYTHON_VERSION_CHECK" == "True" ]]; then
- echo "Python versions prior to 3.8 are not supported."
- exit -1
-fi
-
-exec python3 -u ./dev/run-tests-jenkins.py "$@"
diff --git a/dev/run-tests-jenkins.py b/dev/run-tests-jenkins.py
deleted file mode 100755
index aa82b28e3821..000000000000
--- a/dev/run-tests-jenkins.py
+++ /dev/null
@@ -1,236 +0,0 @@
-#!/usr/bin/env python3
-
-#
-# Licensed to the Apache Software Foundation (ASF) under one or more
-# contributor license agreements. See the NOTICE file distributed with
-# this work for additional information regarding copyright ownership.
-# The ASF licenses this file to You under the Apache License, Version 2.0
-# (the "License"); you may not use this file except in compliance with
-# the License. You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-#
-
-import os
-import sys
-import json
-import functools
-import subprocess
-from urllib.request import urlopen
-from urllib.request import Request
-from urllib.error import HTTPError, URLError
-
-from sparktestsupport import SPARK_HOME, ERROR_CODES
-from sparktestsupport.shellutils import run_cmd
-
-
-def print_err(msg):
- """
- Given a set of arguments, will print them to the STDERR stream
- """
- print(msg, file=sys.stderr)
-
-
-def post_message_to_github(msg, ghprb_pull_id):
- print("Attempting to post to GitHub...")
-
- api_url = os.getenv("GITHUB_API_BASE", "https://api.github.com/repos/apache/spark")
- url = api_url + "/issues/" + ghprb_pull_id + "/comments"
- github_oauth_key = os.environ["GITHUB_OAUTH_KEY"]
-
- posted_message = json.dumps({"body": msg})
- request = Request(
- url,
- headers={
- "Authorization": "token %s" % github_oauth_key,
- "Content-Type": "application/json",
- },
- data=posted_message.encode("utf-8"),
- )
- try:
- response = urlopen(request)
-
- if response.getcode() == 201:
- print(" > Post successful.")
- except HTTPError as http_e:
- print_err("Failed to post message to GitHub.")
- print_err(" > http_code: %s" % http_e.code)
- print_err(" > api_response: %s" % http_e.read())
- print_err(" > data: %s" % posted_message)
- except URLError as url_e:
- print_err("Failed to post message to GitHub.")
- print_err(" > urllib_status: %s" % url_e.reason[1])
- print_err(" > data: %s" % posted_message)
-
-
-def pr_message(
- build_display_name, build_url, ghprb_pull_id, short_commit_hash, commit_url, msg, post_msg=""
-):
- # align the arguments properly for string formatting
- str_args = (
- build_display_name,
- msg,
- build_url,
- ghprb_pull_id,
- short_commit_hash,
- commit_url,
- str(" " + post_msg + ".") if post_msg else ".",
- )
- return "**[Test build %s %s](%stestReport)** for PR %s at commit [`%s`](%s)%s" % str_args
-
-
-def run_pr_checks(pr_tests, ghprb_actual_commit, sha1):
- """
- Executes a set of pull request checks to ease development and report issues with various
- components such as style, linting, dependencies, compatibilities, etc.
- @return a list of messages to post back to GitHub
- """
- # Ensure we save off the current HEAD to revert to
- current_pr_head = run_cmd(["git", "rev-parse", "HEAD"], return_output=True).strip()
- pr_results = list()
-
- for pr_test in pr_tests:
- test_name = pr_test + ".sh"
- pr_results.append(
- run_cmd(
- [
- "bash",
- os.path.join(SPARK_HOME, "dev", "tests", test_name),
- ghprb_actual_commit,
- sha1,
- ],
- return_output=True,
- ).rstrip()
- )
- # Ensure, after each test, that we're back on the current PR
- run_cmd(["git", "checkout", "-f", current_pr_head])
- return pr_results
-
-
-def run_tests(tests_timeout):
- """
- Runs the `dev/run-tests` script and responds with the correct error message
- under the various failure scenarios.
- @return a tuple containing the test result code and the result note to post to GitHub
- """
-
- test_result_code = subprocess.Popen(
- ["timeout", tests_timeout, os.path.join(SPARK_HOME, "dev", "run-tests")]
- ).wait()
-
- failure_note_by_errcode = {
- # error to denote run-tests script failures:
- 1: "executing the `dev/run-tests` script",
- ERROR_CODES["BLOCK_GENERAL"]: "some tests",
- ERROR_CODES["BLOCK_RAT"]: "RAT tests",
- ERROR_CODES["BLOCK_SCALA_STYLE"]: "Scala style tests",
- ERROR_CODES["BLOCK_JAVA_STYLE"]: "Java style tests",
- ERROR_CODES["BLOCK_PYTHON_STYLE"]: "Python style tests",
- ERROR_CODES["BLOCK_R_STYLE"]: "R style tests",
- ERROR_CODES["BLOCK_DOCUMENTATION"]: "to generate documentation",
- ERROR_CODES["BLOCK_BUILD"]: "to build",
- ERROR_CODES["BLOCK_BUILD_TESTS"]: "build dependency tests",
- ERROR_CODES["BLOCK_MIMA"]: "MiMa tests",
- ERROR_CODES["BLOCK_SPARK_UNIT_TESTS"]: "Spark unit tests",
- ERROR_CODES["BLOCK_PYSPARK_UNIT_TESTS"]: "PySpark unit tests",
- ERROR_CODES["BLOCK_PYSPARK_PIP_TESTS"]: "PySpark pip packaging tests",
- ERROR_CODES["BLOCK_SPARKR_UNIT_TESTS"]: "SparkR unit tests",
- ERROR_CODES["BLOCK_TIMEOUT"]: "from timeout after a configured wait of `%s`"
- % (tests_timeout),
- }
-
- if test_result_code == 0:
- test_result_note = " * This patch passes all tests."
- else:
- note = failure_note_by_errcode.get(
- test_result_code, "due to an unknown error code, %s" % test_result_code
- )
- test_result_note = " * This patch **fails %s**." % note
-
- return [test_result_code, test_result_note]
-
-
-def main():
- # Important Environment Variables
- # ---
- # $ghprbActualCommit
- # This is the hash of the most recent commit in the PR.
- # The merge-base of this and master is the commit from which the PR was branched.
- # $sha1
- # If the patch merges cleanly, this is a reference to the merge commit hash
- # (e.g. "origin/pr/2606/merge").
- # If the patch does not merge cleanly, it is equal to $ghprbActualCommit.
- # The merge-base of this and master in the case of a clean merge is the most recent commit
- # against master.
- ghprb_pull_id = os.environ["ghprbPullId"]
- ghprb_actual_commit = os.environ["ghprbActualCommit"]
- ghprb_pull_title = os.environ["ghprbPullTitle"].lower()
- sha1 = os.environ["sha1"]
-
- # Marks this build as a pull request build.
- os.environ["SPARK_JENKINS_PRB"] = "true"
- # Switch to a Maven-based build if the PR title contains "test-maven":
- if "test-maven" in ghprb_pull_title:
- os.environ["SPARK_JENKINS_BUILD_TOOL"] = "maven"
- if "test-hadoop3" in ghprb_pull_title:
- os.environ["SPARK_JENKINS_BUILD_PROFILE"] = "hadoop3"
- # Switch the Scala profile based on the PR title:
- if "test-scala2.13" in ghprb_pull_title:
- os.environ["SPARK_JENKINS_BUILD_SCALA_PROFILE"] = "scala2.13"
-
- build_display_name = os.environ["BUILD_DISPLAY_NAME"]
- build_url = os.environ["BUILD_URL"]
-
- project_url = os.getenv("SPARK_PROJECT_URL", "https://github.com/apache/spark")
- commit_url = project_url + "/commit/" + ghprb_actual_commit
-
- # GitHub doesn't auto-link short hashes when submitted via the API, unfortunately. :(
- short_commit_hash = ghprb_actual_commit[0:7]
-
- # format: http://linux.die.net/man/1/timeout
- # must be less than the timeout configured on Jenkins. Usually Jenkins's timeout is higher
- # then this. Please consult with the build manager or a committer when it should be increased.
- tests_timeout = "500m"
-
- # Array to capture all test names to run on the pull request. These tests are represented
- # by their file equivalents in the dev/tests/ directory.
- #
- # To write a PR test:
- # * the file must reside within the dev/tests directory
- # * be an executable bash script
- # * accept three arguments on the command line, the first being the GitHub PR long commit
- # hash, the second the GitHub SHA1 hash, and the final the current PR hash
- # * and, lastly, return string output to be included in the pr message output that will
- # be posted to GitHub
- pr_tests = ["pr_merge_ability", "pr_public_classes"]
-
- # `bind_message_base` returns a function to generate messages for GitHub posting
- github_message = functools.partial(
- pr_message, build_display_name, build_url, ghprb_pull_id, short_commit_hash, commit_url
- )
-
- # post start message
- post_message_to_github(github_message("has started"), ghprb_pull_id)
-
- pr_check_results = run_pr_checks(pr_tests, ghprb_actual_commit, sha1)
-
- test_result_code, test_result_note = run_tests(tests_timeout)
-
- # post end message
- result_message = github_message("has finished")
- result_message += "\n" + test_result_note + "\n"
- result_message += "\n".join(pr_check_results)
-
- post_message_to_github(result_message, ghprb_pull_id)
-
- sys.exit(test_result_code)
-
-
-if __name__ == "__main__":
- main()
diff --git a/dev/tests/pr_merge_ability.sh b/dev/tests/pr_merge_ability.sh
deleted file mode 100755
index a32667730f76..000000000000
--- a/dev/tests/pr_merge_ability.sh
+++ /dev/null
@@ -1,39 +0,0 @@
-#!/usr/bin/env bash
-
-#
-# Licensed to the Apache Software Foundation (ASF) under one or more
-# contributor license agreements. See the NOTICE file distributed with
-# this work for additional information regarding copyright ownership.
-# The ASF licenses this file to You under the Apache License, Version 2.0
-# (the "License"); you may not use this file except in compliance with
-# the License. You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-#
-
-#
-# This script follows the base format for testing pull requests against
-# another branch and returning results to be published. More details can be
-# found at dev/run-tests-jenkins.
-#
-# Arg1: The GitHub Pull Request Actual Commit
-# known as `ghprbActualCommit` in `run-tests-jenkins`
-# Arg2: The SHA1 hash
-# known as `sha1` in `run-tests-jenkins`
-#
-
-ghprbActualCommit="$1"
-sha1="$2"
-
-# check PR merge-ability
-if [ "${sha1}" == "${ghprbActualCommit}" ]; then
- echo " * This patch **does not merge cleanly**."
-else
- echo " * This patch merges cleanly."
-fi
diff --git a/dev/tests/pr_public_classes.sh b/dev/tests/pr_public_classes.sh
deleted file mode 100755
index ad1ad5e73659..000000000000
--- a/dev/tests/pr_public_classes.sh
+++ /dev/null
@@ -1,73 +0,0 @@
-#!/usr/bin/env bash
-
-#
-# Licensed to the Apache Software Foundation (ASF) under one or more
-# contributor license agreements. See the NOTICE file distributed with
-# this work for additional information regarding copyright ownership.
-# The ASF licenses this file to You under the Apache License, Version 2.0
-# (the "License"); you may not use this file except in compliance with
-# the License. You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-#
-
-#
-# This script follows the base format for testing pull requests against
-# another branch and returning results to be published. More details can be
-# found at dev/run-tests-jenkins.
-#
-# Arg1: The GitHub Pull Request Actual Commit
-# known as `ghprbActualCommit` in `run-tests-jenkins`
-
-ghprbActualCommit="$1"
-
-# $ghprbActualCommit is an automatic merge commit generated by GitHub; its parents are some Spark
-# master commit and the tip of the pull request branch.
-
-# By diffing$ghprbActualCommit^...$ghprbActualCommit and filtering to examine the diffs of only
-# non-test files, we can get changes introduced in the PR and not anything else added to master
-# since the PR was branched.
-
-# Handle differences between GNU and BSD sed
-if [[ $(uname) == "Darwin" ]]; then
- SED='sed -E'
-else
- SED='sed -r'
-fi
-
-source_files=$(
- git diff $ghprbActualCommit^...$ghprbActualCommit --name-only `# diff patch against master from branch point` \
- | grep -v -e "\/test" `# ignore files in test directories` \
- | grep -e "\.py$" -e "\.java$" -e "\.scala$" `# include only code files` \
- | tr "\n" " "
-)
-
-new_public_classes=$(
- git diff $ghprbActualCommit^...$ghprbActualCommit ${source_files} `# diff patch against master from branch point` \
- | grep "^\+" `# filter in only added lines` \
- | $SED -e "s/^\+//g" `# remove the leading +` \
- | grep -e "trait " -e "class " `# filter in lines with these key words` \
- | grep -e "{" -e "(" `# filter in lines with these key words, too` \
- | grep -v -e "\@\@" -e "private" `# exclude lines with these words` \
- | grep -v -e "^// " -e "^/\*" -e "^ \* " `# exclude comment lines` \
- | $SED -e "s/\{.*//g" `# remove from the { onwards` \
- | $SED -e "s/\}//g" `# just in case, remove }; they mess the JSON` \
- | $SED -e "s/\"/\\\\\"/g" `# escape double quotes; they mess the JSON` \
- | $SED -e "s/^(.*)$/\`\1\`/g" `# surround with backticks for style` \
- | $SED -e "s/^/ \* /g" `# prepend ' *' to start of line` \
- | $SED -e "s/$/\\\n/g" `# append newline to end of line` \
- | tr -d "\n" `# remove actual LF characters`
-)
-
-if [ -z "$new_public_classes" ]; then
- echo " * This patch adds no public classes."
-else
- public_classes_note=" * This patch adds the following public classes _(experimental)_:"
- echo -e "${public_classes_note}\n${new_public_classes}"
-fi
diff --git a/docs/building-spark.md b/docs/building-spark.md
index 4bd749d90e1f..547add0fc9f4 100644
--- a/docs/building-spark.md
+++ b/docs/building-spark.md
@@ -283,38 +283,6 @@ Enable the profile (e.g. 2.13):
./build/sbt -Pscala-2.13 compile
-->
-## Running Jenkins tests with GitHub Enterprise
-
-While the Spark project does not maintain its own Jenkins infrastructure, [community members like Scaleway][scaleway] do.
-
-[scaleway]: https://spark.apache.org/developer-tools.html#scaleway
-
-To run tests with Jenkins:
-
- ./dev/run-tests-jenkins
-
-If you use an individual repository or a repository on GitHub Enterprise, export the environment variables below before running the above command.
-
-### Related environment variables
-
-<table>
-<thead><tr><th>Variable Name</th><th>Default</th><th>Meaning</th></tr></thead>
-<tr>
- <td><code>SPARK_PROJECT_URL</code></td>
- <td>https://github.com/apache/spark</td>
- <td>
- The Spark project URL of GitHub Enterprise.
- </td>
-</tr>
-<tr>
- <td><code>GITHUB_API_BASE</code></td>
- <td>https://api.github.com/repos/apache/spark</td>
- <td>
- The Spark project API server URL of GitHub Enterprise.
- </td>
-</tr>
-</table>
-
# Building and testing on an IPv6-only environment
Use Apache Spark GitBox URL because GitHub doesn't support IPv6 yet.