maytasm opened a new pull request #11440: URL: https://github.com/apache/druid/pull/11440
Improve the Auto Scaler's pendingTaskBased provisioning strategy to better handle the case where no worker nodes are currently running

### Description

As described in https://github.com/apache/druid/issues/10918, the PendingTaskBasedWorkerProvisioningStrategy of the auto scaler does not work well when there are 0 worker nodes running. The problems are the following:

1. When there are 0 worker nodes running, the auto scaler currently scales up to minWorkerCount first, and only in the next provisioning cycle can it determine the correct number of workers needed to run all pending tasks. This is inefficient, as we have to go through two provisioning cycles, plus the time it takes for the worker nodes from the first provisioning to be up and running, before we can scale to the correct number (basically, it takes twice as long as needed).
2. When minWorkerCount is set to 0 and there are 0 worker nodes running, the auto scaler will never attempt to add more instances. This is because the auto scaler will try to scale to minWorkerCount (which is 0). Hence, pending tasks will never be able to run.

The reason the auto scaler scales to minWorkerCount first is that, without any running worker nodes, it cannot determine the capacity per worker. (Note that even when there are running worker nodes, the auto scaler assumes all worker nodes have the same capacity and uses the capacity of the first running node.)

To fix this problem, I introduce a new config in the PendingTaskBasedWorkerProvisioningConfig, `druid.indexer.autoscale.workerCapacityFallback`. This config tells the auto scaler the worker capacity to use when determining the number of workers needed while there are currently no workers running. If unset or null, the auto scaler will scale to `minNumWorkers` in the autoScaler config instead. Note: this config is only applicable to the `pendingTaskBased` provisioning strategy. Even if this config value is not accurate (i.e. if your worker node capacity has changed over time), it is still useful for solving problem #2 above, as the auto scaler will at least provision some nodes and, in the next provisioning cycle, will be able to determine the correct number of workers needed (rather than being stuck at 0 workers forever).

This PR has:
- [x] been self-reviewed.
- [ ] using the [concurrency checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md) (Remove this item if the PR doesn't have any relation to concurrency.)
- [x] added documentation for new or modified features or behaviors.
- [x] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
- [ ] added or updated version, license, or notice information in [licenses.yaml](https://github.com/apache/druid/blob/master/dev/license.md)
- [x] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
- [x] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
- [ ] added integration tests.
- [ ] been tested in a test Druid cluster.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
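The capacity-selection behavior described above can be sketched roughly as follows. This is a hypothetical illustration, not Druid's actual implementation: the `WorkerCapacitySketch` class, the `Worker` record, and the `computeWorkerCapacity` method are made-up names used only to show the decision between the first running worker's capacity and the new fallback config.

```java
import java.util.List;

// Hypothetical sketch of the fallback logic; not the real Druid classes.
public class WorkerCapacitySketch {

    // Stand-in for Druid's worker representation, holding only a task capacity.
    public record Worker(int capacity) {}

    /**
     * Returns the per-worker capacity the provisioning strategy would use.
     * If any workers are running, the capacity of the first one is used
     * (the strategy assumes all workers have equal capacity). Otherwise the
     * configured workerCapacityFallback is returned; a null result signals
     * the caller to fall back to scaling to minNumWorkers instead.
     */
    public static Integer computeWorkerCapacity(List<Worker> runningWorkers,
                                                Integer workerCapacityFallback) {
        if (!runningWorkers.isEmpty()) {
            return runningWorkers.get(0).capacity();
        }
        return workerCapacityFallback;
    }

    public static void main(String[] args) {
        // A running worker's capacity takes precedence over the fallback.
        System.out.println(computeWorkerCapacity(List.of(new Worker(4)), 8)); // 4
        // With no running workers, the fallback capacity is used.
        System.out.println(computeWorkerCapacity(List.of(), 8)); // 8
        // No workers and no fallback: null, so the caller scales to minNumWorkers.
        System.out.println(computeWorkerCapacity(List.of(), null)); // null
    }
}
```

With such a fallback in place, an operator could set something like `druid.indexer.autoscale.workerCapacityFallback=8` (value assumed here for illustration) so that a cluster with minWorkerCount of 0 can still bootstrap workers for pending tasks.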
