[ 
https://issues.apache.org/jira/browse/HDDS-5644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen O'Donnell resolved HDDS-5644.
-------------------------------------
    Fix Version/s: 1.2.0
       Resolution: Fixed

> Speed up decommission tests using a background Mini Cluster provider
> --------------------------------------------------------------------
>
>                 Key: HDDS-5644
>                 URL: https://issues.apache.org/jira/browse/HDDS-5644
>             Project: Apache Ozone
>          Issue Type: Improvement
>          Components: SCM
>            Reporter: Stephen O'Donnell
>            Assignee: Stephen O'Donnell
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.2.0
>
>
> The integration (ozone) test suite is the slowest part of the GitHub Actions 
> build, usually taking over 2 hours. In a random PR I checked, it took 2hr 16min.
> Often in integration tests, a large part of the test time is spent creating a 
> new mini-Ozone cluster for each test, which can take 10-20 seconds to 
> start up.
> I also timed stopping a mini-cluster and found that can take up to 10 seconds.
> Changing the tests to reuse the same cluster can be difficult and make the 
> tests less standalone and more brittle, which is not a good thing. Changing 
> the tests is also time-consuming work.
> Assuming a test runs for longer than the time taken to set up a mini-cluster 
> and stop it, it would make the tests faster if we pre-created a mini-cluster 
> in the background. Then when one test completes, the next cluster is already 
> there, saving the startup time. Obviously this costs more concurrent CPU to 
> reduce the wall clock time.
> We could also queue the shutdown of the clusters in another background thread.
> The slowest part of the integration (ozone) test suite is the decommission 
> tests, taking 843 seconds on the last run I checked.
> This PR adds a Mini-Cluster provider to the Decommission tests as an 
> experiment to see if it makes the runtime significantly faster in practice. 
> If it does, this may be something we can roll out across other integration 
> tests.
> As a baseline, I ran the decommission tests on my laptop, and it took 8min 
> 37s.
> After the changes in this PR, the test suite ran in 3min 53s.
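
For illustration only, the idea described above (pre-create the next cluster in one background thread, and queue shutdowns of used clusters in another) could be sketched roughly as follows. This is not the code from the PR: the class name BackgroundProvider and all of its methods are hypothetical, and it is written generically so that any slow-to-start resource can stand in for a mini-cluster.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical sketch: pre-creates items in the background, destroys used ones asynchronously. */
final class BackgroundProvider<T> {
  private final BlockingQueue<T> ready;                                 // items waiting for a test
  private final BlockingQueue<T> expired = new LinkedBlockingQueue<>(); // items waiting for shutdown
  private final Supplier<T> factory;                                    // e.g. builds and starts a cluster
  private final Consumer<T> destroyer;                                  // e.g. stops a cluster
  private final Thread createThread;
  private final Thread destroyThread;
  private volatile boolean closed = false;

  /** At most preCreate items sit ready; one more may be held by the creator thread. */
  BackgroundProvider(Supplier<T> factory, Consumer<T> destroyer, int preCreate) {
    this.factory = factory;
    this.destroyer = destroyer;
    this.ready = new ArrayBlockingQueue<>(preCreate);
    this.createThread = new Thread(this::createLoop, "provider-create");
    this.destroyThread = new Thread(this::destroyLoop, "provider-destroy");
    createThread.setDaemon(true);
    destroyThread.setDaemon(true);
    createThread.start();
    destroyThread.start();
  }

  /** Called in test setup: returns a pre-created item, waiting only if none is ready yet. */
  T provide() throws InterruptedException {
    return ready.take();
  }

  /** Called in test teardown: returns immediately, the item is stopped in the background. */
  void destroy(T item) {
    expired.add(item);
  }

  /** Stops both threads and destroys every item that was created but not yet destroyed. */
  void shutdown() throws InterruptedException {
    closed = true;
    createThread.interrupt();
    createThread.join();
    destroyThread.interrupt();
    destroyThread.join();
    drain(ready);
    drain(expired);
  }

  private void createLoop() {
    while (!closed) {
      T item = factory.get();
      try {
        ready.put(item); // blocks while the ready queue is full
      } catch (InterruptedException e) {
        destroyer.accept(item); // never handed out, so clean it up here
        return;
      }
    }
  }

  private void destroyLoop() {
    try {
      while (true) {
        destroyer.accept(expired.take());
      }
    } catch (InterruptedException e) {
      // shutdown() drains whatever is still queued
    }
  }

  private void drain(BlockingQueue<T> queue) {
    T item;
    while ((item = queue.poll()) != null) {
      destroyer.accept(item);
    }
  }
}
```

In a test class, a provider like this would typically be created once before all tests, with provide() called in each test's setup, destroy() in its teardown, and shutdown() after the last test. The trade-off is the one noted in the description: more concurrent CPU and memory in exchange for less wall clock time.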



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

