vishesh92 opened a new pull request, #7723: URL: https://github.com/apache/cloudstack/pull/7723
Documentation PR - https://github.com/apache/cloudstack-documentation/pull/334

### Description

This pull request (PR) implements a Distributed Resource Scheduler (DRS) for a CloudStack cluster. The primary objective of this feature is to enable automatic resource optimization and workload balancing within the cluster by live migrating VMs according to the cluster's configuration. Administrators can also execute DRS manually for a cluster, using the UI or the API.

The PR adds support for two algorithms, `condensed` and `balanced`. Algorithms are pluggable, allowing ACS administrators to have customized control over scheduling.

**Implementation**

There are three top-level components:

1. **Scheduler**
   A timer task which:
   * Generates DRS plans for clusters
   * Processes DRS plans
   * Removes old DRS plan records
2. **DRS Execution**
   We go through each VM in the cluster and use the specified algorithm to check whether DRS is required and to calculate the `cost`, `benefit` & `improvement` of migrating that VM to another host in the cluster. On the basis of `cost`, `benefit` & `improvement`, the best migration is selected for the current iteration and the VM is migrated. The maximum number of iterations (live migrations) possible on the cluster is defined by `drs.iterations`, which is expressed as a percentage (a value between 0 and 1) of the total number of workloads.
3. **Algorithm**
   Every algorithm implements two methods:
   1. `needsDrs` - checks whether DRS is required for the cluster
   2. `getMetrics` - calculates the `cost`, `benefit` & `improvement` of migrating a VM to another host

**Algorithms**

1. `Condensed` - Packs all the VMs onto the minimum number of hosts in the cluster.
2. `Balanced` - Distributes the VMs evenly across hosts in the cluster.

Algorithms use `drs.level` to decide the amount of imbalance to allow in the cluster.
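The execution loop described above can be sketched as follows. This is a minimal illustration in Python, not the code in this PR: the imbalance measure (standard deviation of per-host memory divided by the mean), the `cost` and `benefit` values, and all names such as `BalancedSketch` and `generate_plan` are assumptions made for the example. Only the overall shape (a `needsDrs` check, per-VM `getMetrics`, picking the best migration per iteration, and capping iterations at a fraction of the workloads) follows the description.

```python
from dataclasses import dataclass
from statistics import pstdev
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Metrics:
    improvement: float
    cost: float
    benefit: float


def imbalance(placement: Dict[str, str], vm_memory: Dict[str, int],
              hosts: List[str]) -> float:
    """Hypothetical imbalance measure: std-dev of host memory load / mean."""
    loads = {h: 0 for h in hosts}
    for vm, host in placement.items():
        loads[host] += vm_memory[vm]
    values = list(loads.values())
    mean = sum(values) / len(values)
    return pstdev(values) / mean if mean else 0.0


class BalancedSketch:
    """Toy stand-in for a 'balanced' algorithm (not the PR's implementation)."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold

    def needs_drs(self, placement, vm_memory, hosts) -> bool:
        return imbalance(placement, vm_memory, hosts) > self.threshold

    def get_metrics(self, vm, dest, placement, vm_memory, hosts) -> Metrics:
        before = imbalance(placement, vm_memory, hosts)
        after = imbalance({**placement, vm: dest}, vm_memory, hosts)
        improvement = before - after
        # cost and benefit are placeholders; the description does not define them.
        return Metrics(improvement=improvement, cost=0.0, benefit=improvement)


def generate_plan(algorithm, placement: Dict[str, str],
                  vm_memory: Dict[str, int], hosts: List[str],
                  iterations: float) -> List[Tuple[str, str, str]]:
    """Return a list of (vm, source_host, destination_host) migrations."""
    placement = dict(placement)  # work on a copy, leave the input untouched
    max_migrations = int(iterations * len(placement))
    plan: List[Tuple[str, str, str]] = []
    for _ in range(max_migrations):
        if not algorithm.needs_drs(placement, vm_memory, hosts):
            break
        best: Optional[Tuple[Metrics, str, str]] = None
        for vm, src in placement.items():
            for dest in hosts:
                if dest == src:
                    continue
                m = algorithm.get_metrics(vm, dest, placement, vm_memory, hosts)
                if m.improvement <= 0 or m.benefit <= m.cost:
                    continue
                if best is None or m.improvement > best[0].improvement:
                    best = (m, vm, dest)
        if best is None:
            break  # no migration improves the cluster any further
        _, vm, dest = best
        plan.append((vm, placement[vm], dest))
        placement[vm] = dest
    return plan


# Four equal VMs on one of two hosts; allow migrations for up to 50% of them.
example_hosts = ["h1", "h2"]
example_placement = {"vm1": "h1", "vm2": "h1", "vm3": "h1", "vm4": "h1"}
example_memory = {vm: 1024 for vm in example_placement}
example_plan = generate_plan(BalancedSketch(threshold=0.1), example_placement,
                             example_memory, example_hosts, iterations=0.5)
print(example_plan)
```

In this toy setup two migrations are enough to even out the two hosts, so raising `iterations` to `1.0` yields the same plan: the loop stops as soon as `needs_drs` reports that the cluster is within the allowed imbalance.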
#### APIs Added

_listClusterDrsPlan_
> `id` - ID of the DRS plan to list
> `clusterid` - ID of the cluster whose plans are to be listed

_generateClusterDrsPlan_
> `id` - ID of the cluster
> `iterations` - The maximum number of iterations in a DRS job, defined as a percentage (a value between 0 and 1) of the total number of workloads. Defaults to the value of the cluster's `drs.iterations` setting.

_executeClusterDrsPlan_
> `id` - ID of the cluster for which the DRS plan is to be executed.
> `migrateto` - Specifies the mapping between a VM and the host to migrate that VM to. Format of this parameter: `migrateto[vm-index].vm=<uuid>&migrateto[vm-index].host=<uuid>`.

#### Config Keys Added

- ClusterDrsPlanExpireInterval
  - **Key** `drs.plan.expire.interval`
  - **Scope** `Global`
  - **Default Value** `30` days
  - **Description** The interval in days after which old DRS records will be cleaned up.
- ClusterDrsEnabled
  - **Key** `drs.automatic.enable`
  - **Scope** `Cluster`
  - **Default Value** `false`
  - **Description** Enable/disable automatic DRS on a cluster.
- ClusterDrsInterval
  - **Key** `drs.automatic.interval`
  - **Scope** `Cluster`
  - **Default Value** `60` minutes
  - **Description** The interval in minutes after which a periodic background thread will schedule DRS for a cluster.
- ClusterDrsIterations
  - **Key** `drs.max.migrations`
  - **Scope** `Cluster`
  - **Default Value** `50`
  - **Description** Maximum number of live migrations in a DRS execution.
- ClusterDrsAlgorithm
  - **Key** `drs.algorithm`
  - **Scope** `Cluster`
  - **Default Value** `condensed`
  - **Description** DRS algorithm to execute on the cluster. This PR implements two algorithms - `balanced` & `condensed`.
- ClusterDrsLevel
  - **Key** `drs.imbalance`
  - **Scope** `Cluster`
  - **Default Value** `0.5`
  - **Description** Percentage (as a value between 0.0 and 1.0) of imbalance allowed in the cluster. 1.0 means no imbalance is allowed and 0.0 means imbalance is allowed.
- ClusterDrsMetric
  - **Key** `drs.imbalance.metric`
  - **Scope** `Cluster`
  - **Default Value** `memory`
  - **Description** The cluster imbalance metric to use when checking the `drs.imbalance.threshold`. Possible values are `memory` and `cpu`.

### Types of changes

- [ ] Breaking change (fix or feature that would cause existing functionality to change)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] Enhancement (improves an existing feature and functionality)
- [ ] Cleanup (Code refactoring and cleanup, that may add test cases)

### Feature/Enhancement Scale or Bug Severity

#### Feature/Enhancement Scale

- [x] Major
- [ ] Minor

#### Bug Severity

- [ ] BLOCKER
- [ ] Critical
- [ ] Major
- [ ] Minor
- [ ] Trivial

### Screenshots (if appropriate):

### How Has This Been Tested?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
