[
https://issues.apache.org/jira/browse/FLINK-38989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Shuai Xu updated FLINK-38989:
-----------------------------
Component/s: API / Core
Runtime / Coordination
(was: Runtime / Task)
> Support resetting backlog status
> --------------------------------
>
> Key: FLINK-38989
> URL: https://issues.apache.org/jira/browse/FLINK-38989
> Project: Flink
> Issue Type: New Feature
> Components: API / Core, Runtime / Coordination
> Reporter: Shuai Xu
> Priority: Major
>
> The backlog flag currently flips from `true` to `false` only once. In certain
> scenarios—e.g., when the source doesn’t use checkpointing—the source will
> re-read the full dataset every time a TaskManager registers with the
> JobManager. In such cases, the backlog state should be re-entrant, rather
> than a one-time, state-driven transition.
> And we have a concrete optimization for lookup join that will rely on this
> capability, and we’ll follow up with a FLIP to describe the optimization in
> detail.
> The main API changes are shown in the following code snippet.
> {code:java}
> @Public
> public interface SplitEnumerator<SplitT extends SourceSplit, CheckpointT>
> extends AutoCloseable, CheckpointListener {
> -- newly added method
> /**
> * Triggers resetting the backlog in the split enumerator when a {@code
> ResetBacklogEvent} is
> * received. This requires the source enumerator in the connector to
> re-consume the backlog
> * data, similar to the initial startup process.
> *
> * <p>The default implementation is a no-op.
> */
> default void resetBacklog() {}
> }
> {code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)