GitHub user bersprockets opened a pull request:
https://github.com/apache/spark/pull/20909
[SPARK-23776][python][test] Check for needed components/files before
running pyspark-sql tests
## What changes were proposed in this pull request?
Change pyspark-sql tests to check the following:
- Spark was built with the Hive profile
- Spark Scala tests were compiled
If either condition is not met, throw an exception with a message
explaining how to appropriately build Spark.
These checks are similar to the ones found in the pyspark-streaming tests.
These required files will be missing if you follow the sbt build
instructions. They are less likely to be missing if you follow the mvn build
instructions (mvn compiles the test Scala files, and there are documented
mvn build instructions for running the pyspark tests).
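The fail-fast behavior described above (look for required build artifacts, and raise an exception with build instructions when one is absent) could be sketched as follows. The helper name `require_artifact` and the glob patterns are illustrative assumptions, not the PR's actual code:

```python
import glob


def require_artifact(pattern, build_hint):
    """Raise an exception with a build hint if no file matches the glob.

    Hypothetical sketch of a precondition check run before the
    pyspark-sql tests; the real check in the PR may differ.
    """
    if not glob.glob(pattern):
        raise RuntimeError(
            "Missing artifact matching %r.\n"
            "Build Spark first, for example:\n  %s" % (pattern, build_hint))


# Hypothetical usage (paths are assumptions for illustration):
# require_artifact("assembly/target/scala-*/jars/*hive*.jar",
#                  "./build/sbt -Phive package")
```

A check like this turns a confusing mid-run test failure into an immediate, actionable error message, which is the point of the change.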
## How was this patch tested?
For sbt build:
- run ./build/sbt package
- run python/run-tests --modules "pyspark-sql" --python-executables python2.7
- see failure, follow sbt instructions in exception message
- run test again
- see second failure (sbt only), follow sbt instructions in exception
message
- run test again, verify success
- repeat for python3.4
For mvn build:
- run ./build/mvn -DskipTests clean package
- run python/run-tests --modules "pyspark-sql" --python-executables python2.7
- see failure, follow mvn instructions in exception message
- run test again, verify success
- repeat for python3.4
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/bersprockets/spark SPARK-23776
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/20909.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #20909
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]