[
https://issues.apache.org/jira/browse/FLINK-38989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Shuai Xu updated FLINK-38989:
-----------------------------
Description:
The backlog flag currently flips from `true` to `false` only once. In certain
scenarios—e.g., when the source doesn’t use checkpointing—the source will
re-read the full dataset every time a TaskManager registers with the
JobManager. In such cases, the backlog state should be re-entrant, rather than
a one-time, state-driven transition.
And we have a concrete optimization for lookup join that will rely on this
capability, and we’ll follow up with a FLIP to describe the optimization in
detail.
The main API changes are shown in the following code snippet.
{code:java}
@Public
public interface SplitEnumerator<SplitT extends SourceSplit, CheckpointT>
extends AutoCloseable, CheckpointListener {
-- newly added method
/**
* Triggers resetting the backlog in the split enumerator when a {@code
ResetBacklogEvent} is
* received. This requires the source enumerator in the connector to
re-consume the backlog
* data, similar to the initial startup process.
*
* <p>The default implementation is a no-op.
*/
default void resetBacklog() {}
}
{code}
was:
The backlog flag currently flips from `true` to `false` only once. In certain
scenarios—e.g., when the source doesn’t use checkpointing—the source will
re-read the full dataset every time a TaskManager registers with the
JobManager. In such cases, the backlog state should be re-entrant, rather than
a one-time, state-driven transition.
And we have a concrete optimization for lookup joins that will rely on this
capability, and we’ll follow up with a FLIP to describe the optimization in
detail.
The main API changes are shown in the following code snippet.
{code:java}
@Public
public interface SplitEnumerator<SplitT extends SourceSplit, CheckpointT>
extends AutoCloseable, CheckpointListener {
-- newly added method
/**
* Triggers resetting the backlog in the split enumerator when a {@code
ResetBacklogEvent} is
* received. This requires the source enumerator in the connector to
re-consume the backlog
* data, similar to the initial startup process.
*
* <p>The default implementation is a no-op.
*/
default void resetBacklog() {}
}
{code}
> Support resetting backlog status
> --------------------------------
>
> Key: FLINK-38989
> URL: https://issues.apache.org/jira/browse/FLINK-38989
> Project: Flink
> Issue Type: New Feature
> Reporter: Shuai Xu
> Priority: Major
>
> The backlog flag currently flips from `true` to `false` only once. In certain
> scenarios—e.g., when the source doesn’t use checkpointing—the source will
> re-read the full dataset every time a TaskManager registers with the
> JobManager. In such cases, the backlog state should be re-entrant, rather
> than a one-time, state-driven transition.
> And we have a concrete optimization for lookup join that will rely on this
> capability, and we’ll follow up with a FLIP to describe the optimization in
> detail.
> The main API changes are shown in the following code snippet.
> {code:java}
> @Public
> public interface SplitEnumerator<SplitT extends SourceSplit, CheckpointT>
> extends AutoCloseable, CheckpointListener {
> -- newly added method
> /**
> * Triggers resetting the backlog in the split enumerator when a {@code
> ResetBacklogEvent} is
> * received. This requires the source enumerator in the connector to
> re-consume the backlog
> * data, similar to the initial startup process.
> *
> * <p>The default implementation is a no-op.
> */
> default void resetBacklog() {}
> }
> {code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)