GitHub user HyukjinKwon opened a pull request:
https://github.com/apache/spark/pull/23117
[WIP][SPARK-7721][INFRA] Run and generate test coverage report from Python
via Jenkins
## What changes were proposed in this pull request?
### Background
Currently, the test script that generates coverage information has been merged
into Spark: https://github.com/apache/spark/pull/20204
So we can generate the coverage report and site by, for example:
```
run-tests-with-coverage --python-executables=python3 --modules=pyspark-sql
```
similar to the `run-tests` script in `./python`.
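Under the hood, this relies on the standard coverage.py tooling. A rough standalone equivalent is sketched below; the test target and discovery path are illustrative assumptions, not what the script actually runs.

```shell
# Illustrative only: an approximate coverage run plus HTML report with plain
# coverage.py; the real run-tests-with-coverage script wires this into
# Spark's own test runner.
coverage run --source=pyspark -m unittest discover -s python/pyspark/sql/tests
coverage html -d python/test_coverage/htmlcov
```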
### Proposed change
The next step is to host this coverage report via `github.io` automatically
by Jenkins (see https://spark-test.github.io/pyspark-coverage-site/).
This uses my testing account for Spark, @spark-test, which was shared with
Felix and Shivaram a while ago for testing purposes, including AppVeyor.
In short, this PR targets running the coverage in
[spark-master-test-sbt-hadoop-2.7](https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-sbt-hadoop-2.7/).
That job will clone the site repository and replace its contents with the
up-to-date PySpark test coverage from the latest commit. For instance as below:
```bash
# Clone PySpark coverage site.
git clone https://github.com/spark-test/pyspark-coverage-site.git
# Copy generated coverage HTML.
cp -r .../python/test_coverage/htmlcov/* pyspark-coverage-site/
# Check out a temporary orphan branch.
git checkout --orphan latest_branch
# Add all the files.
git add -A
# Commit current test coverage results.
git commit -am "Coverage report at latest commit in Apache Spark"
# Delete the old branch.
git branch -D gh-pages
# Rename the temporary branch to gh-pages.
git branch -m gh-pages
# Finally, force update to our repository.
git push -f origin gh-pages
```
This way, a single up-to-date coverage report can be shown on the `github.io`
page. The commands above were manually tested.
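The publish flow above can also be exercised end to end in a throwaway local repository to confirm the history ends up as a single fresh commit; the repository and file below are stand-ins, not the real coverage output.

```shell
# Sketch: rehearse the orphan-branch publish flow in a scratch repo.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git config user.email "jenkins@example.com"
git config user.name "jenkins"
git checkout -q -b gh-pages
echo "old report" > index.html
git add -A
git commit -qm "previous coverage"
# Regenerate the report, then replace history with one fresh commit,
# mirroring the steps in the PR description:
echo "new report" > index.html
git checkout -q --orphan latest_branch
git add -A
git commit -qm "Coverage report at latest commit in Apache Spark"
git branch -q -D gh-pages
git branch -m gh-pages
# gh-pages now carries exactly one commit with the latest report.
git rev-list --count gh-pages
```

After running this, `gh-pages` has a single commit whose `index.html` holds the new report, which is exactly the state the force-push would publish.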
### TODO:
- [ ] Write a draft
- [ ] Set a hidden `SPARK_TEST_KEY` for @spark-test's password in Jenkins via
Jenkins's credentials feature.
This should be set at both `SparkPullRequestBuilder` (so that we, or I,
can test) and `spark-master-test-sbt-hadoop-2.7`
- [ ] Make the PR builder's tests pass
- [ ] Enable this build only at `spark-master-test-sbt-hadoop-2.7` right
before merging this in.
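Once that key exists, the force-push step could authenticate with it. One hypothetical form is below; the URL-embedded-credential style is an assumption for illustration, not necessarily how the Jenkins job will be configured.

```shell
# Hypothetical Jenkins shell step: authenticate the force-push with the
# @spark-test password held in the hidden SPARK_TEST_KEY variable.
git push -f "https://spark-test:${SPARK_TEST_KEY}@github.com/spark-test/pyspark-coverage-site.git" gh-pages
```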
## How was this patch tested?
It will be tested via Jenkins.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/HyukjinKwon/spark SPARK-7721
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/23117.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #23117
----
commit d88d5aa73db636f8c73ace9f83f339781ea50531
Author: hyukjinkwon <gurwls223@...>
Date: 2018-11-22T08:08:20Z
Run and generate test coverage report from Python via Jenkins
----