jihoonson commented on a change in pull request #11444:
URL: https://github.com/apache/druid/pull/11444#discussion_r672709947
##########
File path: docs/configuration/index.md
##########
@@ -1015,7 +1015,7 @@ There are additional configs for autoscaling (if it is enabled):
|`druid.indexer.autoscale.pendingTaskTimeout`|How long a task can be in "pending" state before the Overlord tries to scale up.|PT30S|
|`druid.indexer.autoscale.workerVersion`|If set, will only create nodes of set version during autoscaling. Overrides dynamic configuration. |null|
|`druid.indexer.autoscale.workerPort`|The port that MiddleManagers will run on.|8080|
-|`druid.indexer.autoscale.workerCapacityHint`| Worker capacity for determining the number of workers needed for auto scaling when there is currently no worker running. If unset or set to value of 0 or less, auto scaler will scale to `minNumWorkers` in autoScaler config instead. This value should typically be equal to `druid.worker.capacity` when you have a homogeneous cluster and the average of `druid.worker.capacity` across the workers when you have a heterogeneous cluster. Note: this config is only applicable to `pendingTaskBased` provisioning strategy|-1|
+|`druid.indexer.autoscale.workerCapacityHint`| Worker capacity for determining the number of workers needed for auto scaling when there is currently no worker running. If unset or set to value of 0 or less, auto scaler will scale to `minNumWorkers` in autoScaler config instead. This value should typically be equal to `druid.worker.capacity` when your workers have a homogeneous capacity and the average of `druid.worker.capacity` across the workers when your workers have a heterogeneous capacity. Note: this config is only applicable to `pendingTaskBased` provisioning strategy|-1|
Review comment:
@techdocsmith hmm, maybe this config needs a more detailed description of how the auto scaler works. When there are ingestion jobs pending, the auto scaler first computes how many new nodes are required to unblock those pending tasks. Since each worker (middleManager or indexer) can run more than one task at the same time (depending on `druid.worker.capacity`), the number of new nodes to spin up is roughly `ceil(pending task count / worker capacity)`. The problem is, the auto scaler runs on the Overlord and is not aware of `druid.worker.capacity`. Also, each worker can have a different value set for `druid.worker.capacity` in a heterogeneous cluster. As a result, the auto scaler currently detects the worker capacity from the running workers. However, this cannot work when there are no workers running. This PR works around the issue by adding a new config, `workerCapacityHint`, which the auto scaler can use as a hint to compute the number of new workers to spin up even when there are no workers running. So, to answer your questions,
> "Worker capacity for determining the number of new workers". If i set it
to 5, does the autoscaler spin up 5 workers? If so it's just the number of new
workers.
It depends on how many pending tasks you have. If you have 25 pending tasks, then yes, it will spin up 5 new workers. If you have 8 pending tasks, the auto scaler will spin up 2 new workers, because it is hinted that each worker has 5 task slots.
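The estimate described above can be sketched in a few lines. This is a hypothetical helper for illustration, not Druid's actual implementation; the fallback to `minNumWorkers` when the hint is unset (or 0 or less) follows the config description in the diff:

```python
import math

def workers_to_spin_up(pending_tasks: int,
                       worker_capacity_hint: int,
                       min_num_workers: int) -> int:
    """Estimate new workers needed when no workers are currently running.

    Hypothetical sketch of the pendingTaskBased estimate:
    - if the hint is unset (<= 0), fall back to minNumWorkers
      from the autoScaler config;
    - otherwise assume each new worker provides
      `worker_capacity_hint` task slots.
    """
    if worker_capacity_hint <= 0:
        return min_num_workers
    return math.ceil(pending_tasks / worker_capacity_hint)

print(workers_to_spin_up(25, 5, 1))  # 25 pending tasks, hint 5 -> 5 workers
print(workers_to_spin_up(8, 5, 1))   # 8 pending tasks, hint 5 -> 2 workers
print(workers_to_spin_up(8, -1, 3))  # hint unset -> fall back to minNumWorkers
```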
> Also If I set it to 5, does each worker thn need to have 5 task slots too?
I wasn't sure of that relationship: " Each worker (middleManager or indexer) is
assumed to have this amount of task slots."
I think the relationship runs in the opposite direction: when each worker has 5 task slots, you should set this config to 5 so that the auto scaler can correctly estimate the number of new workers needed.
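For reference, a minimal sketch of what that might look like in the Overlord's runtime properties (the value 5 is an illustrative assumption, not a default):

```properties
# Assuming a homogeneous cluster where every worker sets druid.worker.capacity=5
druid.indexer.autoscale.workerCapacityHint=5
```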
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]