[ 
https://issues.apache.org/jira/browse/FLINK-38989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shuai Xu updated FLINK-38989:
-----------------------------
    Description: 
The backlog flag currently flips from `true` to `false` only once. In certain 
scenarios—e.g., when the source doesn’t use checkpointing—the source will 
re-read the full dataset every time a TaskManager registers with the 
JobManager. In such cases, the backlog state should be re-entrant, rather than 
a one-time, state-driven transition.

And we have a concrete optimization for lookup joins that will rely on this 
capability, and we’ll follow up with a FLIP to describe the optimization in 
detail.

The main API changes are shown in the following code snippet.

{code:java}
@Public
public interface SplitEnumerator<SplitT extends SourceSplit, CheckpointT>
        extends AutoCloseable, CheckpointListener {

    -- newly added method
    /**
     * Triggers resetting the backlog in the split enumerator when a {@code 
ResetBacklogEvent} is
     * received. This requires the source enumerator in the connector to 
re-consume the backlog
     * data, similar to the initial startup process.
     *
     * <p>The default implementation is a no-op.
     */
    default void resetBacklog() {}

}
{code}


  was:
The backlog flag currently flips from `true` to `false` only once. In certain 
scenarios—e.g., when the source doesn’t use checkpointing—the source will 
re-read the full dataset every time a TaskManager registers with the 
JobManager. In such cases, the backlog state should be re-entrant, rather than 
a one-time, state-driven transition.

And we have a concrete optimization for lookup joins that will rely on this 
capability, and we’ll follow up with a FLIP to describe the optimization in 
detail.


> Support resetting backlog status
> --------------------------------
>
>                 Key: FLINK-38989
>                 URL: https://issues.apache.org/jira/browse/FLINK-38989
>             Project: Flink
>          Issue Type: New Feature
>            Reporter: Shuai Xu
>            Priority: Major
>
> The backlog flag currently flips from `true` to `false` only once. In certain 
> scenarios—e.g., when the source doesn’t use checkpointing—the source will 
> re-read the full dataset every time a TaskManager registers with the 
> JobManager. In such cases, the backlog state should be re-entrant, rather 
> than a one-time, state-driven transition.
> And we have a concrete optimization for lookup joins that will rely on this 
> capability, and we’ll follow up with a FLIP to describe the optimization in 
> detail.
> The main API changes are shown in the following code snippet.
> {code:java}
> @Public
> public interface SplitEnumerator<SplitT extends SourceSplit, CheckpointT>
>         extends AutoCloseable, CheckpointListener {
>     -- newly added method
>     /**
>      * Triggers resetting the backlog in the split enumerator when a {@code 
> ResetBacklogEvent} is
>      * received. This requires the source enumerator in the connector to 
> re-consume the backlog
>      * data, similar to the initial startup process.
>      *
>      * <p>The default implementation is a no-op.
>      */
>     default void resetBacklog() {}
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to