[PR] Reset offsets and backfill (druid)

via GitHub Fri, 20 Mar 2026 14:47:22 -0700


aho135 opened a new pull request, #19191:
URL: https://github.com/apache/druid/pull/19191

This PR is still a WIP. I will be adding additional test coverage for this.
For the initial implementation I've only done Kafka. I plan to implement
Kinesis and Rabbit in subsequent PRs, followed by documentation for the new
feature.

This change introduces a new parameter called `backfill` to the
SupervisorResource reset endpoint. If left unset, the current reset behavior
remains unchanged. When set to true, after the reset is performed, new
ingestion tasks will be spun up to consume the skipped data. This is a useful
feature for operating Druid clusters where the most recent data is the most
important (such as alerting use cases).
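As a rough illustration of how a client might target the endpoint, here is a minimal sketch that only builds the request URL. The path `/druid/indexer/v1/supervisor/{id}/reset` is the existing reset endpoint. Passing `backfill` as a query parameter is an assumption, since the description only says it is a parameter of the endpoint, and `build_reset_url` is a hypothetical helper, not part of this PR.

```python
from urllib.parse import quote, urlencode


def build_reset_url(overlord_url, supervisor_id, backfill=None):
    """Build the URL for a POST to the supervisor reset endpoint.

    Assumption: `backfill` travels as a query parameter. The PR text does
    not state how the parameter is passed.
    """
    path = f"/druid/indexer/v1/supervisor/{quote(supervisor_id, safe='')}/reset"
    url = overlord_url.rstrip("/") + path
    if backfill is None:
        # Parameter left unset: the current reset behavior is unchanged.
        return url
    return url + "?" + urlencode({"backfill": str(backfill).lower()})
```

Calling `build_reset_url("http://localhost:8081", "my_topic", backfill=True)` produces a URL ending in `/reset?backfill=true`, which would then be sent as a POST request.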

### Description

Adds a parameter called `backfill` to the Supervisor reset endpoint to
automatically ingest skipped data in the case where the offset is reset to
latest. This requires:

* `useEarliestOffset=false`.
* `useConcurrentLocks=true`, because there can be conflicting time intervals
between the backfill task and the main supervisor tasks.
* `useTransaction=false`, in order to disable metadata updates.
* The Supervisor to be in a running state, in order to call
`updatePartitionLagFromStream()` to get the latest offsets.
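The preconditions above can be sketched as a small validation helper. This is an illustration only, not the check implemented in the PR: the function name, the flat `settings` dict, and the `"RUNNING"` state string are all assumptions, and the sketch requires every flag to be set explicitly rather than modelling Druid's defaults.

```python
def backfill_precondition_errors(settings, supervisor_state):
    """Return the reasons a backfill reset would be rejected (empty if none).

    `settings` is a hypothetical flat dict of the three flags named in the
    description. Where each flag lives in a real spec is not modelled here.
    """
    required = {
        "useEarliestOffset": False,   # the reset must go to latest
        "useConcurrentLocks": True,   # backfill and main tasks may overlap in time
        "useTransaction": False,      # backfill tasks must not update metadata
    }
    errors = []
    for name, expected in required.items():
        actual = settings.get(name)
        if actual is not expected:
            errors.append(f"{name} must be {str(expected).lower()}, got {actual!r}")
    if supervisor_state != "RUNNING":
        # A running supervisor is needed to fetch the latest offsets.
        errors.append(f"supervisor must be RUNNING, got {supervisor_state}")
    return errors
```

An empty list means all four conditions hold. Otherwise each entry names one unmet requirement.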

In addition, this change supports unsupervised SeekableStreamIndexTasks by
setting `supervised: false` in the SeekableStreamIndexTaskIOConfig. The
backfill tasks are one-off task submissions, so this new flag disables any
checkpointing (refer to the changes in SeekableStreamIndexTaskRunner).

The number of backfill tasks can be controlled through `backfillTaskCount`
in the Supervisor spec, and defaults to `taskCount / 2`.
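The defaulting rule can be written out as a short sketch. It assumes `taskCount / 2` is Java-style integer division, which the description does not state, and `resolve_backfill_task_count` is a hypothetical name used only for illustration.

```python
def resolve_backfill_task_count(task_count, backfill_task_count=None):
    """Pick the number of backfill tasks for a supervisor.

    Assumption: the default is integer division of taskCount by 2.
    """
    if backfill_task_count is not None:
        # An explicit backfillTaskCount in the spec takes precedence.
        return backfill_task_count
    return task_count // 2
```

Under this assumption a supervisor with `taskCount` of 1 would default to zero backfill tasks. The description does not say how that case is handled.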

#### Release note
Adds an optional parameter to the Supervisor reset endpoint to backfill the
skipped data when the stream is reset to latest.

<hr>

##### Key changed/added classes in this PR
* `SupervisorResource`
* `SupervisorManager`
* `SeekableStreamSupervisor`
* `KafkaSupervisor`
<hr>

This PR has:

- [X] been self-reviewed.
- [X] using the [concurrency
checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md)
(Remove this item if the PR doesn't have any relation to concurrency.)
- [ ] added documentation for new or modified features or behaviors. [I plan
to add documentation for this feature in a follow-up after Rabbit/Kinesis are
implemented]
- [X] a release note entry in the PR description.
- [X] added Javadocs for most classes and all non-trivial methods. Linked
related entities via Javadoc links.
- [X] added comments explaining the "why" and the intent of the code
wherever would not be obvious for an unfamiliar reader.
- [X] added unit tests or modified existing tests to cover new code paths,
ensuring the threshold for [code
coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md)
is met.
- [ ] added integration tests.
- [X] been tested in a test Druid cluster.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
