FYI, this change is now merged [1]. Along with the many many other improvements made during the Gradle migration, many of our Jenkins test suites are now significantly faster than they used to be:
* Java Precommit: before ~1h30m [2], now 37m [3] (59% reduction) * Flink ValidatesRunner: before ~30m [4], now 7.5m [5] (75% reduction) * Nightly Snapshot: before ~1h [6], now ~20m [7] (67% reduction) The Dataflow ValidatesRunner test suite is back to where it was before [8]: just over 2 hours. [9]. This suite didn't improve significantly over Maven primarily due to Gradle's test parallelization. Tests are parallelized at the test class level whereas Maven Surefire supported parallelizing test cases within a class. [1] https://github.com/apache/beam/pull/5193 [2] https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/buildTimeTrend [3] https://builds.apache.org/job/beam_PreCommit_Java_GradleBuild/buildTimeTrend [4] https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/buildTimeTrend [5] https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Gradle/buildTimeTrend [6] https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/buildTimeTrend [7] https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle/buildTimeTrend [8] https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/buildTimeTrend [9] https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle/buildTimeTrend On Fri, May 4, 2018 at 2:34 AM Etienne Chauchot <[email protected]> wrote: > Scott, thanks for that ! > > I only quickly looked at the ValidatesRunner tests that I wrote (you > modified none) and the ones that impact my ongoing work (metrics). > I think some tests in MetricsTest still need to be ValidatesRunner tests. > See my comment in the review. > > Etienne > > Note: if you don't care about Java runner tests, you can stop reading now. > > tl;dr: I've made a pass over all @ValidatesRunner tests in pr/5218 [1] and > converted many to @NeedsRunner in order to reduce post-commit runtime. > > This is work that was long overdue and finally got my attention due to the > Gradle migration. As context, @ValidatesRunner [2] tests construct a > TestPipeline and exercise runner behavior through SDK constructs. The tests > are written runner-agnostic so that they can be run on and validate all > supported runners. > > The framework for these tests is great and writing them is super-easy. But > as a result, we have way too many of them-- over 250. These tests run > against all runners, and even when parallelized we see Dataflow post-commit > times exceeding 3-5 hours [3]. > > When reading through these tests, we found many of them don't actually > exercise runner-specific behavior, and were simply using the TestPipeline > framework to validate SDK components. This is a valid pattern, but tests > should be annotated with @NeedsRunner instead. With this annotation, the > tests will run on only a single runner, currently DirectRunner. > > So, PR/5218 looks at all existing @ValidatesRunner tests and > conservatively converts tests which don't need to validate all runners into > @NeedsRunner. I've also sharded out some very large test classes into > scenario-based sub-classes. This is because Gradle parallelizes tests at > the class-level, and we found a couple very large test classes (ParDoTest) > became stragglers for the entire execution. Hopefully Gradle will soon > implement dynamic splitting :) > > So, the action I'd like to request from others: > 1) If you are an author of @ValidatesRunner tests, feel free to look over > the PR and let me know if I missed anything. Kenn Knowles is also helping > out here. > 2) If you find yourself writing new @ValidatesRunner tests, please > consider whether your test is validating runner-provided behavior. If not, > use @NeedsRunner instead. > > > [1] https://github.com/apache/beam/pull/5218 > [2] > https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/ValidatesRunner.java > > [3] > https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle/buildTimeTrend > > >
