[ 
https://issues.apache.org/jira/browse/FLINK-33466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17783896#comment-17783896
 ] 

Jonas Weile edited comment on FLINK-33466 at 11/8/23 6:49 AM:
--------------------------------------------------------------

See the following issues for the same problem:

FLINK-31006

FLINK-29674 Apache Kafka Connector‘s “ setBounded” not valid - ASF Jira

It seems, that the two possible solutions are as follows:

1. Either the sources must signal the NoMoreSplitsEvent again after restoring 
from a snapshot or
2. the SourceReaderBase should snapshot the noMoreSplitsAssignment variable.

Which approach would be most viable?


was (Author: JIRAUSER302117):
See the following issues for the same problem:

[FLINK-31006] job is not finished when using pipeline mode to run bounded 
source like kafka/pulsar - ASF JIRA (apache.org)

[FLINK-29674] Apache Kafka Connector‘s “ setBounded” not valid - ASF Jira

It seems, that the two possible solutions are as follows:

1. Either the sources must signal the NoMoreSplitsEvent again after restoring 
from a snapshot or
2. the SourceReaderBase should snapshot the noMoreSplitsAssignment variable.

Which approach would be most viable?

> Bounded Kafka source never finishes after restore from savepoint
> ----------------------------------------------------------------
>
>                 Key: FLINK-33466
>                 URL: https://issues.apache.org/jira/browse/FLINK-33466
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / Common
>    Affects Versions: 1.17.1
>            Reporter: Jonas Weile
>            Priority: Major
>
> When setting up a bounded Kafka source, if the job is restored from a 
> savepoint before the source has finished, then the Kafka source will never 
> transition to a finished state.
> This seems to be because the noMoreSplitsAssignment variable in the 
> SourceReaderBase class is not snapshotted. Therefore, after restoring from a 
> checkpoint/savepoint, the noMoreSplitsAssignment variable will default to 
> false, and the first condition in the private helper method 
> finishedOrAvailableLater() in the SourceReaderBase class will always evaluate 
> to true.
> Since this originates in the base class, the problem should hold for all 
> source types, not just kafka.
>  
> Would it make sense to snapshot the noMoreSplitsAssigntments variable?
> I would love to take this on as a first task if appropriate.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to