snmvaughan commented on PR #4415: URL: https://github.com/apache/hadoop/pull/4415#issuecomment-1206331264
I've been focusing on speeding up the build process to aid development. There are several reasons why the tests can't safely run in parallel, so in the meantime I've started looking into short-term opportunities for speedup. My plan of attack is the following:

1. Build the distribution in parallel (this pull request). Some time savings.
2. Distribute the tests to run in parallel (each in its own container, but running sequentially within that container). More time savings, but flaky tests are still an issue. This requires tweaks to a few tests, which I'll be submitting in other pull requests.
3. Fix tests that break basic conventions in ways that let one test affect another. Example: tests that require write access to shared resources (such as files in the classpath).
4. Fix flaky test classes that fail to run consistently even in their own containers. More time savings, since you don't have to wait for tests to time out. This is iterative, but once it is completed you can develop knowing that a failure is actually caused by a change.
5. Based on lessons learned, consider additional improvements. Example: switching to JUnit's support for temporary directories to help with parallel execution.

In addition, I've been starting small by limiting the test runs to a focused subset. Currently I execute the tests for `hadoop-hdfs-client` and `hadoop-hdfs`, along with their dependencies, with every build.
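
To illustrate the isolation idea behind items 3 and 5, here is a minimal sketch of giving each test its own scratch directory instead of writing to a shared location. It uses only the JDK so that it stands alone. The class `IsolatedTempDirs` and its method names are hypothetical and are not part of Hadoop or of this pull request. In real tests, JUnit's own support (for example `TemporaryFolder` in JUnit 4 or `@TempDir` in JUnit 5) would likely replace this helper.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

/** Hypothetical helper (not Hadoop code): one private scratch directory per test. */
public final class IsolatedTempDirs {

    private IsolatedTempDirs() {}

    /** Creates a fresh, uniquely named directory under the JVM's default temp location. */
    public static Path createIsolatedDir(String testName) throws IOException {
        return Files.createTempDirectory(testName + "-");
    }

    /** Recursively deletes the directory, removing children before their parents. */
    public static void deleteRecursively(Path root) throws IOException {
        if (!Files.exists(root)) {
            return;
        }
        try (Stream<Path> paths = Files.walk(root)) {
            // Reverse order puts deeper paths first, so directories are empty when deleted.
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }

    public static void main(String[] args) throws IOException {
        // Two "tests" write a file with the same name without interfering.
        Path dirA = createIsolatedDir("testA");
        Path dirB = createIsolatedDir("testB");
        try {
            Files.write(dirA.resolve("data.txt"), "from A".getBytes(StandardCharsets.UTF_8));
            Files.write(dirB.resolve("data.txt"), "from B".getBytes(StandardCharsets.UTF_8));
            String a = new String(
                Files.readAllBytes(dirA.resolve("data.txt")), StandardCharsets.UTF_8);
            System.out.println(a); // prints "from A"
        } finally {
            deleteRecursively(dirA);
            deleteRecursively(dirB);
        }
    }
}
```

Because every caller gets a distinct directory, two tests running at the same time cannot overwrite each other's files, which is the property that writing into the classpath or another shared location does not provide.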
