mghosh4 opened a new issue #10378:
URL: https://github.com/apache/druid/issues/10378
### Motivation
The primary motivation of this work is to provide more visibility into
worker utilization over time. Monitoring utilization can help cluster
administrators determine when to add workers to the pool or remove them from
it. With growing adoption of native ingestion, this has become even more
important.
### Proposed changes
I propose adding a new `WorkerCountStatsMonitor` to the Overlord, similar to
the existing `TaskCountStatsMonitor` class. It will expose the following metrics:
- `worker/total/count`: Total number of workers
- `worker/idle/count`: Total number of workers available to accept tasks
- `worker/used/count`: Total number of workers currently in use
- `worker/lazy/count`: Total number of workers that have been marked lazy
- `worker/blacklisted/count`: Total number of workers that have been
blacklisted
### Proposed Design
I am planning to add the following APIs to `TaskRunner`:
```
/**
 * APIs useful for emitting statistics for {@link WorkerCountStatsMonitor}
 */
long getTotalWorkerCount();
long getIdleWorkerCount();
long getUsedWorkerCount();
long getLazyWorkerCount();
long getBlacklistedWorkerCount();
```
The implementation of `WorkerCountStatsMonitor` will be similar to that of
`TaskCountStatsMonitor`. It will use a new `WorkerCountStatsProvider` interface,
which will be implemented by `TaskMaster`. `TaskMaster` will in turn use its
`taskRunner` to supply the values for the required metrics.
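To make the flow concrete, here is a minimal sketch of how the monitor could read the counts from the provider and emit them. It is an illustration under stated assumptions, not the final implementation: the method set of `WorkerCountStatsProvider` is assumed to mirror the `TaskRunner` APIs above, and `MetricSink` and `WorkerCountStatsMonitorSketch` are hypothetical stand-ins used so the example compiles without Druid on the classpath. They are not Druid's actual emitter or monitor classes.

```java
// Assumed shape of the proposed provider interface; in the proposal,
// TaskMaster would implement it by delegating to its TaskRunner.
interface WorkerCountStatsProvider {
    long getTotalWorkerCount();
    long getIdleWorkerCount();
    long getUsedWorkerCount();
    long getLazyWorkerCount();
    long getBlacklistedWorkerCount();
}

// Hypothetical stand-in for the metrics emitter (NOT Druid's real emitter API).
interface MetricSink {
    void emit(String metric, long value);
}

// Sketch of the monitor: on each monitoring pass it reads every count
// from the provider and emits it under the proposed metric name.
class WorkerCountStatsMonitorSketch {
    private final WorkerCountStatsProvider provider;

    WorkerCountStatsMonitorSketch(WorkerCountStatsProvider provider) {
        this.provider = provider;
    }

    // Emits all five worker metrics; returns true to signal that the
    // monitor should keep running on subsequent passes.
    boolean doMonitor(MetricSink sink) {
        sink.emit("worker/total/count", provider.getTotalWorkerCount());
        sink.emit("worker/idle/count", provider.getIdleWorkerCount());
        sink.emit("worker/used/count", provider.getUsedWorkerCount());
        sink.emit("worker/lazy/count", provider.getLazyWorkerCount());
        sink.emit("worker/blacklisted/count", provider.getBlacklistedWorkerCount());
        return true;
    }
}
```

Keeping the monitor dependent only on the provider interface means it can be unit tested with a stub provider and an in-memory sink, without starting an Overlord.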
### Operational impact
This change has no operational impact. It only adds new metrics for better
cluster monitoring.
### Test plan
In addition to unit tests, we plan to test this on our local Druid clusters,
using the statsd-based emitter to collect the newly emitted metrics for
visualization.