Patrick Wendell created SPARK-1455:
--------------------------------------

             Summary: Determine which test suites to run based on code changes
                 Key: SPARK-1455
                 URL: https://issues.apache.org/jira/browse/SPARK-1455
             Project: Spark
          Issue Type: Improvement
          Components: Project Infra
            Reporter: Patrick Wendell
             Fix For: 1.1.0

Right now we run the entire set of tests for every change, which means the tests take a long time. Our pull request builder checks out the merge branch from git, so we could do a diff, figure out which source files were changed, and run a more isolated set of tests. We should run tests in a way that reflects the inter-dependencies of the project. For example:

- If Spark core is modified, we should run all tests
- If just SQL is modified, we should run only the SQL tests
- If just Streaming is modified, we should run only the Streaming tests
- If just PySpark is modified, we should run only the PySpark tests

And so on. I think this would reduce the round-trip time (RTT) of the tests a lot, and it should be pretty easy to accomplish with some scripting.
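
The selection rule described above could be sketched roughly as follows. This is only an illustration, not the pull request builder's actual script: the directory prefixes (`core/`, `sql/`, `streaming/`, `python/`), the group names, and the function names are all assumptions made for the example. The sketch also assumes that any file it cannot classify should trigger the full test run, which the issue does not specify.

```python
import subprocess

# Hypothetical mapping from a top-level directory prefix to a test group.
# Both the prefixes and the group names are assumptions for this sketch.
MODULE_PREFIXES = {
    "core/": "core",
    "sql/": "sql",
    "streaming/": "streaming",
    "python/": "pyspark",
}

ALL_TESTS = {"all"}


def changed_files(base_ref):
    """Return the paths that differ between base_ref and HEAD."""
    out = subprocess.check_output(
        ["git", "diff", "--name-only", base_ref, "HEAD"],
        universal_newlines=True,
    )
    return [line for line in out.splitlines() if line]


def select_test_groups(paths):
    """Map changed paths to the set of test groups that should run."""
    groups = set()
    for path in paths:
        for prefix, group in MODULE_PREFIXES.items():
            if path.startswith(prefix):
                matched = group
                break
        else:
            # Unclassified file (assumption): be conservative, run everything.
            return set(ALL_TESTS)
        if matched == "core":
            # Everything depends on core, so a core change runs all tests.
            return set(ALL_TESTS)
        groups.add(matched)
    return groups
```

A builder script could then call something like `select_test_groups(changed_files("origin/master"))`, where `origin/master` stands in for whatever ref the merge branch is compared against, and dispatch only the returned groups.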