Re: ValidatesRunner test cleanup

Jean-Baptiste Onofré Thu, 03 May 2018 09:18:12 -0700

Thanks for the update, Scott. That's really great work.

I will ping you on Slack about some points, as I'm preparing the build for the
release (and I have some issues 😁).


Thanks again
Regards
JB

On 3 May 2018 at 17:54, Scott Wegner <[email protected]> wrote:
>Note: if you don't care about Java runner tests, you can stop reading
>now.
>
>tl;dr: I've made a pass over all @ValidatesRunner tests in PR/5218 [1] and
>converted many to @NeedsRunner in order to reduce post-commit runtime.
>
>This is work that was long overdue and finally got my attention due to the
>Gradle migration. As context, @ValidatesRunner [2] tests construct a
>TestPipeline and exercise runner behavior through SDK constructs. The tests
>are written to be runner-agnostic so that they can be run on, and validate,
>all supported runners.
>
>The framework for these tests is great and writing them is super-easy. But
>as a result, we have way too many of them: over 250. These tests run
>against all runners, and even when parallelized we see Dataflow post-commit
>times exceeding 3-5 hours [3].
>
>When reading through these tests, we found that many of them don't actually
>exercise runner-specific behavior and simply use the TestPipeline
>framework to validate SDK components. This is a valid pattern, but such
>tests should be annotated with @NeedsRunner instead. With this annotation,
>the tests will run on only a single runner, currently DirectRunner.
>
>So, PR/5218 looks at all existing @ValidatesRunner tests and conservatively
>converts tests which don't need to validate all runners into @NeedsRunner.
>I've also sharded out some very large test classes into scenario-based
>sub-classes. This is because Gradle parallelizes tests at the class level,
>and we found that a couple of very large test classes (ParDoTest) became
>stragglers for the entire execution. Hopefully Gradle will soon implement
>dynamic splitting :)
>
>So, the actions I'd like to request from others:
>1) If you are an author of @ValidatesRunner tests, feel free to look over
>the PR and let me know if I missed anything. Kenn Knowles is also helping
>out here.
>2) If you find yourself writing new @ValidatesRunner tests, please consider
>whether your test is validating runner-provided behavior. If not, use
>@NeedsRunner instead.
>
>
>[1] https://github.com/apache/beam/pull/5218
>[2] https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/ValidatesRunner.java
>[3] https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle/buildTimeTrend
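
For readers new to the two annotations discussed in the quoted message, here is a rough, self-contained sketch of how category tagging can decide which suite picks up a test. It is not Beam or JUnit code: the marker interfaces, the @Category annotation, the test names, and selectFor are all local stand-ins invented for illustration. It also assumes that ValidatesRunner is a subtype of NeedsRunner, so a single-runner suite selecting NeedsRunner would also pick up ValidatesRunner tests; check the Beam sources before relying on that.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class CategorySketch {
  // Local stand-ins for the marker interfaces (assumption: not the real Beam types).
  public interface NeedsRunner {}
  public interface ValidatesRunner extends NeedsRunner {}

  // Local stand-in for a JUnit-style category annotation.
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.METHOD)
  public @interface Category {
    Class<?> value();
  }

  // Hypothetical tests: one exercises runner behavior, one only checks an SDK component.
  public static class ExampleTests {
    @Category(ValidatesRunner.class)
    public void testGroupByKeyAcrossWindows() {}

    @Category(NeedsRunner.class)
    public void testCoderRoundTrip() {}
  }

  // Returns the sorted names of the tests whose category is `category` or a subtype of it.
  public static List<String> selectFor(Class<?> category) {
    List<String> names = new ArrayList<>();
    for (Method m : ExampleTests.class.getDeclaredMethods()) {
      Category c = m.getAnnotation(Category.class);
      if (c != null && category.isAssignableFrom(c.value())) {
        names.add(m.getName());
      }
    }
    Collections.sort(names);
    return names;
  }

  public static void main(String[] args) {
    // Suite run against every runner: only the runner-validating test is selected.
    System.out.println(selectFor(ValidatesRunner.class));
    // Suite run on a single runner: both tests are selected.
    System.out.println(selectFor(NeedsRunner.class));
  }
}
```

In this model, moving a test from the ValidatesRunner category to NeedsRunner removes it from the every-runner suite while keeping it in the single-runner suite, which is the runtime saving the quoted message describes.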
