Patrick Wendell created SPARK-1455:
--------------------------------------
Summary: Determine which test suites to run based on code changes
Key: SPARK-1455
URL: https://issues.apache.org/jira/browse/SPARK-1455
Project: Spark
Issue Type: Improvement
Components: Project Infra
Reporter: Patrick Wendell
Fix For: 1.1.0
Right now we run the entire set of tests for every change, which means the
tests take a long time. Our pull request builder checks out the merge branch
from git, so we could do a diff to figure out which source files were changed
and run a more isolated set of tests. We should run tests in a way that
reflects the inter-dependencies of the project. For example:
- If Spark core is modified, we should run all tests.
- If just SQL is modified, we should run only the SQL tests.
- If just Streaming is modified, we should run only the Streaming tests.
- If just PySpark is modified, we should run only the PySpark tests.
And so on. I think this would reduce the round-trip time (RTT) of the tests a
lot, and it should be pretty easy to accomplish with some scripting.
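
A rough sketch of the selection logic described above, in Python. Everything
named here is an assumption for illustration, not an existing script: the
directory prefixes (core/, sql/, streaming/, python/), the suite names, and
the helpers changed_files and suites_to_run are all hypothetical. Anything
outside a known module falls back to running all tests, which is the
conservative choice.

```python
import subprocess

# Hypothetical mapping from a top-level directory prefix to a test group.
# The prefixes and group names are assumptions made for this sketch.
MODULE_BY_PREFIX = {
    "core/": "core",
    "sql/": "sql",
    "streaming/": "streaming",
    "python/": "pyspark",
}
ALL_TESTS = "all"


def changed_files(base_ref):
    """List files that differ between base_ref and HEAD, using git diff."""
    out = subprocess.check_output(
        ["git", "diff", "--name-only", base_ref + "...HEAD"]
    )
    return [line for line in out.decode("utf-8").splitlines() if line]


def suites_to_run(paths):
    """Map changed paths to the set of test groups that should run."""
    suites = set()
    for path in paths:
        module = next(
            (m for prefix, m in MODULE_BY_PREFIX.items() if path.startswith(prefix)),
            None,
        )
        # Core changes, and files we cannot classify, trigger the full run.
        if module is None or module == "core":
            return {ALL_TESTS}
        suites.add(module)
    return suites
```

The pull request builder could then call something like
suites_to_run(changed_files("origin/master")), where the base ref is again an
assumption, and pass the resulting groups to the test runner.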
--
This message was sent by Atlassian JIRA
(v6.2#6252)