Patrick Wendell created SPARK-1455:
--------------------------------------

             Summary: Determine which test suites to run based on code changes
                 Key: SPARK-1455
                 URL: https://issues.apache.org/jira/browse/SPARK-1455
             Project: Spark
          Issue Type: Improvement
          Components: Project Infra
            Reporter: Patrick Wendell
             Fix For: 1.1.0


Right now we run the entire test suite for every change, so the tests take a 
long time. Our pull request builder checks out the merge branch from git, so 
we could diff it to find which source files changed and run a narrower set of 
tests. We should run tests in a way that reflects the inter-dependencies of 
the project. E.g.:

- If Spark core is modified, we should run all tests
- If just SQL is modified, we should run only the SQL tests
- If just Streaming is modified, we should run only the streaming tests
- If just PySpark is modified, we should run only the PySpark tests

And so on. I think this would cut the turnaround time of the tests a lot, and 
it should be pretty easy to accomplish with some scripting.
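
As a rough illustration of the rules listed above, here is a minimal Python 
sketch. It is not an existing Spark script: the function names, the group 
names, the "origin/master" target ref, and the assumption that each module 
lives under a top-level directory (sql/, streaming/, python/) are all 
hypothetical. Anything outside those directories, including core, falls back 
to running everything.

```python
import subprocess

# Hypothetical mapping from top-level directory to a test group name.
# Both the prefixes and the group names are assumptions for illustration.
PREFIX_TO_GROUP = {
    "sql/": "sql",
    "streaming/": "streaming",
    "python/": "pyspark",
}
ALL_GROUPS = {"core", "sql", "streaming", "pyspark"}


def changed_files(target_ref="origin/master"):
    """List files that differ between target_ref and the checked-out HEAD."""
    out = subprocess.check_output(
        ["git", "diff", "--name-only", target_ref, "HEAD"],
        universal_newlines=True,
    )
    return [line for line in out.splitlines() if line]


def groups_to_test(paths):
    """Return the set of test groups to run for the given changed paths."""
    groups = set()
    for path in paths:
        for prefix, group in PREFIX_TO_GROUP.items():
            if path.startswith(prefix):
                groups.add(group)
                break
        else:
            # Core, build files, or anything unrecognized: run all tests.
            return set(ALL_GROUPS)
    return groups
```

The pull request builder could then call groups_to_test(changed_files()) and 
launch only the selected groups. Falling back to the full run for any 
unrecognized path keeps the sketch conservative: a wrong guess costs time, 
not coverage.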



--
This message was sent by Atlassian JIRA
(v6.2#6252)
