Scott Wegner created BEAM-4100:
----------------------------------

             Summary: Dataflow ValidatesRunner PostCommits are nearing the 3 
hour max runtime
                 Key: BEAM-4100
                 URL: https://issues.apache.org/jira/browse/BEAM-4100
             Project: Beam
          Issue Type: Bug
          Components: runner-dataflow
            Reporter: Scott Wegner


The beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle test suite has been 
getting slower and slower over time. We run over 250 pipelines, and Dataflow 
has a fixed cost of about 3 minutes per pipeline just to spin up VMs. With 
Gradle we are not able to parallelize execution quite as well, which puts the 
test suite very close to the 3 hour limit.

We should take steps to decrease the overall execution time for these 
post-commits. Some ideas, roughly in order of recommended action:
 # Convert some ValidatesRunner tests to NeedsRunner. ValidatesRunner should be 
used only when the test needs to validate functionality of each runner. I 
suspect that many of these tests are either duplicates or only need to be run 
on a single runner.
 # Break up large test classes. Gradle parallelizes by test class, so the 
overall execution time is constrained by the slowest test classes. ParDoTest, 
for example, runs over 50 pipelines, so the overall execution will always take 
more than 50 * 3min = 2h30m.
 # Investigate how to achieve additional ValidatesRunner test parallelism. For 
example, we could:
 ## Build a custom JUnit runner which packs a set of pipeline graphs into a 
single job to execute together.
 ## Work with the Dataflow team to support reusing VMs between jobs, to 
decrease the overall cost.
 ## Work on the Gradle Java plugin to support parallelization at the test case 
level, similar to Maven Surefire.
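
To make the arithmetic behind the second idea concrete, here is a rough sketch 
of the runtime bounds, assuming the only cost is the ~3 minute VM spin-up per 
pipeline and that test classes run fully in parallel while pipelines within a 
class run serially. The class name SuiteTimeEstimate, the test classes other 
than ParDoTest, and all per-class pipeline counts are hypothetical and chosen 
only for illustration; they are not measurements from the real suite.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SuiteTimeEstimate {
    // From the issue: roughly 3 minutes of VM spin-up per pipeline.
    static final int STARTUP_MINUTES = 3;

    /** Time if every pipeline in the suite ran one after another. */
    static int serialMinutes(Map<String, Integer> pipelinesPerClass) {
        int total = 0;
        for (int count : pipelinesPerClass.values()) {
            total += count;
        }
        return total * STARTUP_MINUTES;
    }

    /**
     * Best case with unlimited class-level parallelism: pipelines inside one
     * class still run serially, so the largest class sets the floor.
     */
    static int lowerBoundMinutes(Map<String, Integer> pipelinesPerClass) {
        int largest = 0;
        for (int count : pipelinesPerClass.values()) {
            largest = Math.max(largest, count);
        }
        return largest * STARTUP_MINUTES;
    }

    public static void main(String[] args) {
        // Hypothetical breakdown; only "ParDoTest runs over 50 pipelines"
        // comes from the issue text.
        Map<String, Integer> suite = new LinkedHashMap<>();
        suite.put("ParDoTest", 50);
        suite.put("HypotheticalTestA", 20);
        suite.put("HypotheticalTestB", 10);

        System.out.println("serial minutes: " + serialMinutes(suite));
        System.out.println("floor minutes:  " + lowerBoundMinutes(suite));

        // Splitting the largest class in two (hypothetical names) halves
        // the floor without changing the total amount of work.
        suite.remove("ParDoTest");
        suite.put("ParDoTestPart1", 25);
        suite.put("ParDoTestPart2", 25);
        System.out.println("floor after split: " + lowerBoundMinutes(suite));
    }
}
```

Under these assumptions the total work is unchanged by splitting a class; only 
the floor set by the largest class moves, which is why breaking up ParDoTest 
would help even without any additional parallelism support.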



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
