[ 
https://issues.apache.org/jira/browse/BEAM-4100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-4100:
----------------------------------
    Fix Version/s: Not applicable
       Resolution: Fixed
           Status: Resolved  (was: Open)

> Dataflow ValidatesRunner PostCommits are nearing the 3 hour max runtime
> -----------------------------------------------------------------------
>
>                 Key: BEAM-4100
>                 URL: https://issues.apache.org/jira/browse/BEAM-4100
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-dataflow
>            Reporter: Scott Wegner
>            Priority: P3
>             Fix For: Not applicable
>
>
> The beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle test suite has been 
> getting slower and slower over time. We run over 250 pipelines, and Dataflow 
> has a fixed cost of about 3 minutes per pipeline just to spin up VMs. With 
> Gradle we are also not able to parallelize execution quite as well, which puts 
> the test suite very close to the 3 hour limit.
> We should take steps to decrease the overall execution time for these 
> post-commits. Some ideas, roughly in order of recommended action:
>  # Convert some ValidatesRunner tests to NeedsRunner. ValidatesRunner should 
> be used only when the test needs to validate functionality of each runner. I 
> suspect that many of these tests are either duplicates or only need to be run 
> on a single runner.
>  # Break up large test classes. Gradle parallelizes by test class, so the 
> overall execution is constrained by the slowest test classes. ParDoTest, for 
> example, runs over 50 pipelines, so the overall execution will always take 
> more than 50 * 3min = 2h30m.
>  # Investigate how to achieve additional ValidatesRunner test parallelism. 
> For example, we could:
>  ## Build a custom JUnit runner which packs a set of pipeline graphs into a 
> single job to execute together.
>  ## Work with Dataflow to support a way to reuse VMs between jobs to 
> decrease the overall cost.
>  ## Work on the Gradle Java plugin to support parallelization at the test 
> case level, similar to Maven Surefire.
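
The arithmetic in the description above can be sketched as a small model.
This is only a back-of-the-envelope illustration, not Beam code: the class
and method names are hypothetical, the pipeline counts (250 total, 50 in one
class) are the rounded figures quoted above, and the model assumes that only
the roughly 3 minute startup cost matters, that classes run fully in
parallel, and that pipelines within one class run serially.

```java
import java.util.Arrays;

/** Hypothetical helper: rough runtime model for a ValidatesRunner suite. */
final class SuiteRuntimeEstimate {
    // Assumed fixed startup cost per pipeline, taken from the description.
    static final int STARTUP_MINUTES = 3;

    /** Startup cost if every pipeline ran one after another. */
    static int sequentialMinutes(int[] pipelinesPerClass) {
        return Arrays.stream(pipelinesPerClass).sum() * STARTUP_MINUTES;
    }

    /** Lower bound when classes run in parallel: the largest class dominates. */
    static int lowerBoundMinutes(int[] pipelinesPerClass) {
        return Arrays.stream(pipelinesPerClass).max().orElse(0) * STARTUP_MINUTES;
    }

    /** Splits one class of the given size into near-equal smaller classes. */
    static int[] split(int pipelines, int parts) {
        int[] result = new int[parts];
        for (int i = 0; i < parts; i++) {
            // Spread the remainder over the first few parts.
            result[i] = pipelines / parts + (i < pipelines % parts ? 1 : 0);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(sequentialMinutes(new int[] {250}));  // 750
        System.out.println(lowerBoundMinutes(new int[] {50}));   // 150
        System.out.println(lowerBoundMinutes(split(50, 5)));     // 30
    }
}
```

Under these assumptions a single 50-pipeline class pins the suite at 150
minutes (2h30m) no matter how many other classes run alongside it, while
splitting it into five classes of 10 drops that bound to 30 minutes. Real
runs also pay for pipeline execution and are limited by the number of
parallel workers, so the actual numbers would be higher.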



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
