[
https://issues.apache.org/jira/browse/FLINK-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14680500#comment-14680500
]
ASF GitHub Bot commented on FLINK-2314:
---------------------------------------
Github user StephanEwen commented on the pull request:
https://github.com/apache/flink/pull/997#issuecomment-129556271
I agree with @chenliang613; it would be great to have a brief description
of how this mechanism is implemented.
I think there is still a problem: InputSplits are assigned anew after the
dataflow graph (ExecutionGraph) is restarted. Because their assignment is lazy
(for better load balancing), they may be distributed differently on the retry
than in the original execution.
To fix this, the input split assignment would need to be part of the
checkpoints as well.
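To illustrate the idea, here is a minimal plain-Java sketch of a lazy split
assigner whose decisions can be snapshotted and replayed. It is not Flink's
API: the class names (CheckpointedSplitAssigner, AssignmentSnapshot) and the
use of strings as splits are assumptions made for the example, and it assumes
that after a restart each subtask asks for its splits again from the beginning.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical immutable snapshot of who got which split, plus what is left. */
final class AssignmentSnapshot {
    final Map<Integer, List<String>> assignedPerSubtask;
    final List<String> unassigned;

    AssignmentSnapshot(Map<Integer, List<String>> assigned, List<String> unassigned) {
        // Deep-copy so later assignments do not leak into the snapshot.
        Map<Integer, List<String>> copy = new HashMap<>();
        for (Map.Entry<Integer, List<String>> e : assigned.entrySet()) {
            copy.put(e.getKey(), new ArrayList<>(e.getValue()));
        }
        this.assignedPerSubtask = copy;
        this.unassigned = new ArrayList<>(unassigned);
    }
}

/** Hypothetical lazy assigner: splits go to whichever subtask asks first. */
final class CheckpointedSplitAssigner {
    // Splits nobody has claimed yet (shared pool, first come first served).
    private final Deque<String> pool = new ArrayDeque<>();
    // Splits that must go back to a specific subtask after a restore.
    private final Map<Integer, Deque<String>> pinned = new HashMap<>();
    // Record of every assignment made so far, per subtask, in order.
    private final Map<Integer, List<String>> history = new HashMap<>();

    CheckpointedSplitAssigner(List<String> splits) {
        pool.addAll(splits);
    }

    /** Returns the next split for the given subtask, or null if none is left. */
    String nextSplit(int subtask) {
        Deque<String> mine = pinned.get(subtask);
        // Replay pinned splits first; only then fall back to the shared pool.
        String split = (mine != null && !mine.isEmpty()) ? mine.poll() : pool.poll();
        if (split != null) {
            history.computeIfAbsent(subtask, k -> new ArrayList<>()).add(split);
        }
        return split;
    }

    /** Captures the assignment so that it can be stored with a checkpoint. */
    AssignmentSnapshot snapshot() {
        return new AssignmentSnapshot(history, new ArrayList<>(pool));
    }

    /** Rebuilds an assigner that replays the checkpointed assignment. */
    static CheckpointedSplitAssigner restore(AssignmentSnapshot snapshot) {
        CheckpointedSplitAssigner assigner =
                new CheckpointedSplitAssigner(snapshot.unassigned);
        for (Map.Entry<Integer, List<String>> e : snapshot.assignedPerSubtask.entrySet()) {
            assigner.pinned.put(e.getKey(), new ArrayDeque<>(e.getValue()));
        }
        return assigner;
    }
}
```

With this sketch, a subtask that requests a split first after the restart
still receives the splits it held at checkpoint time, rather than whatever
happens to be at the head of the pool. A real implementation would also need
to combine this with the per-split read offset.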
> Make Streaming File Sources Persistent
> --------------------------------------
>
> Key: FLINK-2314
> URL: https://issues.apache.org/jira/browse/FLINK-2314
> Project: Flink
> Issue Type: Improvement
> Components: Streaming
> Affects Versions: 0.9
> Reporter: Stephan Ewen
> Assignee: Sheetal Parade
> Labels: easyfix, starter
>
> Streaming file sources should participate in checkpointing. They should
> track the number of bytes they have read from the file and checkpoint it.
> One can look at the sequence generating source function for an example of a
> checkpointed source.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)