tejaswini-imply opened a new pull request, #13144:
URL: https://github.com/apache/druid/pull/13144
### Description
When a stream becomes inactive i.e. no new data arrives and all existing
data is caught up, there is no need for Supervisor to create new indexing
tasks. This feature uses `LagBasedAutoScaler` to turn Supervisor idle when
stream is inactive for configured amount of time (See conf section).
- LagBasedAutoScaler submits either `IdleNotice` or
`DynamicAllocationTasksNotice` based on activity in the stream every
`LagBasedAutoScalerConfig.scaleActionPeriodMillis`.
- When Supervisor is in idle state it just doesn't create any new indexing
tasks.
- This temporary IDLE state is cleared when supervisor is restarted e.g.
updated spec.
- While the supervisor is in the temporary idle state, the lag-based
autoscaler will continue to run, checking for any changes in the latest offsets
(or if lag is > 0). If activity in the stream is detected again, then the
autoscaler will submit another notice to move the supervisor out of its idle
state and resume creating indexing tasks.
Please note the initial delay before new data is ingested from when new data
enters into the stream is frequency of idle/dynamic task allocation notice
submission (default - 1 min) + initial warmup time to create new indexing tasks.
#### How current offsets fetching work:
- Supervisor is not idle: current offsets are fetched from running tasks.
- Supervisor is idle: current offsets are fetched from metadata storage
since no tasks are likely running.
### Configuration changes
- `druid.supervisor.enableIdleBehaviour` overlord property enables this
feature. Default value is false.
- `LagBasedAutoScalerConfig.minPauseSupervisorIfStreamIdleMillis` - Minimum
time interval to wait until stream is considered inactive/idle. Default value
is 60,000 millis.
- Adds new State - `IDLE` - to Basic states of Supervisor.
<hr>
##### Key changed/added classes in this PR
* `SeekableStreamSupervisor`
* `KafkaSupervisor`
* `KinesisSupervisor`
* `SupervisorStateManager`
* `LagBasedAutoScaler`
<hr>
This PR has:
- [ ] been self-reviewed.
- [ ] using the [concurrency
checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md)
(Remove this item if the PR doesn't have any relation to concurrency.)
- [ ] added documentation for new or modified features or behaviors.
- [ ] added Javadocs for most classes and all non-trivial methods. Linked
related entities via Javadoc links.
- [ ] added or updated version, license, or notice information in
[licenses.yaml](https://github.com/apache/druid/blob/master/dev/license.md)
- [ ] added comments explaining the "why" and the intent of the code
wherever would not be obvious for an unfamiliar reader.
- [ ] added unit tests or modified existing tests to cover new code paths,
ensuring the threshold for [code
coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md)
is met.
- [ ] added integration tests.
- [ ] been tested in a test Druid cluster.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]