[
https://issues.apache.org/jira/browse/HDDS-5644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated HDDS-5644:
---------------------------------
Labels: pull-request-available (was: )
> Speed up decommission tests using a background Mini Cluster provider
> --------------------------------------------------------------------
>
> Key: HDDS-5644
> URL: https://issues.apache.org/jira/browse/HDDS-5644
> Project: Apache Ozone
> Issue Type: Improvement
> Components: SCM
> Reporter: Stephen O'Donnell
> Assignee: Stephen O'Donnell
> Priority: Major
> Labels: pull-request-available
>
> The integration (ozone) test suite is the slowest part of the GitHub Actions
> build, usually taking over 2 hours. In a random PR I checked, it took 2h 16min.
> Often in integration tests, a large part of the test time is spent creating a
> new mini-Ozone cluster for each test, which can take 10-20 seconds to start
> up.
> I also timed stopping a mini-cluster and found that it can take up to 10 seconds.
> Changing the tests to reuse the same cluster can be difficult and make the
> tests less standalone and more brittle, which is not a good thing. Changing
> the tests is also time-consuming work.
> Assuming a test runs for longer than the time taken to set up a mini-cluster
> and stop it, it would make the tests faster if we pre-created a mini-cluster
> in the background. Then when one test completes, the next cluster is already
> there, saving the startup time. Obviously this uses more concurrent CPU to
> reduce the wall clock time.
> We could also queue the shutdown of the clusters in another background thread.
> The slowest part of the Integration (Ozone) test suite is the decommission
> tests, taking 843 seconds on the last run I checked.
> This PR adds a Mini-Cluster provider to the Decommission tests as an
> experiment to see if it makes the runtime significantly faster in practice.
> If it does, this may be something we can roll out across other integration
> tests.
> As a baseline, I ran the decommission tests on my laptop, and it took 8min
> 37s.
> After the changes in this PR, the test suite ran in 3min 53s.
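
The idea described above (pre-create the next cluster in the background and
queue shutdowns on another thread) can be sketched roughly as follows. This is
only an illustration under stated assumptions, not the code from the PR: the
class name BackgroundProvider, its methods, and the generic type C (standing in
for a mini-cluster) are all hypothetical, and cluster creation is assumed not
to throw.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;
import java.util.function.Supplier;

/**
 * Hypothetical sketch: builds instances of C ahead of time on one thread
 * and tears down used instances on another. C stands in for a mini-cluster.
 */
final class BackgroundProvider<C> implements AutoCloseable {
  private final BlockingQueue<C> ready;
  private final BlockingQueue<C> expired = new LinkedBlockingQueue<>();
  private final Consumer<C> destroyer;
  private final Thread createThread;
  private final Thread destroyThread;

  BackgroundProvider(Supplier<C> factory, Consumer<C> destroyer,
      int preCreate) {
    this.ready = new ArrayBlockingQueue<>(preCreate);
    this.destroyer = destroyer;
    createThread = new Thread(() -> {
      while (!Thread.currentThread().isInterrupted()) {
        C cluster = factory.get();
        try {
          // Blocks while the queue is full, so at most preCreate + 1
          // unused instances exist at any time.
          ready.put(cluster);
        } catch (InterruptedException e) {
          // Shutting down: do not leak the instance we just built.
          destroyer.accept(cluster);
          return;
        }
      }
    }, "cluster-creator");
    destroyThread = new Thread(() -> {
      try {
        while (true) {
          destroyer.accept(expired.take());
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }, "cluster-reaper");
    createThread.setDaemon(true);
    destroyThread.setDaemon(true);
    createThread.start();
    destroyThread.start();
  }

  /** Returns a pre-built instance, waiting if none is ready yet. */
  C provide() throws InterruptedException {
    return ready.take();
  }

  /** Hands a used instance to the background thread for teardown. */
  void destroy(C cluster) {
    expired.add(cluster);
  }

  /** Stops both threads, then tears down whatever is still queued. */
  @Override
  public void close() throws InterruptedException {
    createThread.interrupt();
    destroyThread.interrupt();
    createThread.join();
    destroyThread.join();
    C cluster;
    while ((cluster = ready.poll()) != null) {
      destroyer.accept(cluster);
    }
    while ((cluster = expired.poll()) != null) {
      destroyer.accept(cluster);
    }
  }
}
```

In a test class, a provider like this would typically be created once before
all tests and closed once after them, with each test calling provide() in its
setup and destroy() in its teardown. The bounded queue is the design choice
that caps the extra CPU and memory cost: the creator thread stalls as soon as
the configured number of spare clusters is waiting.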