[ https://issues.apache.org/jira/browse/BEAM-5396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16719435#comment-16719435 ]
Thomas Weise commented on BEAM-5396:
------------------------------------
The pipeline option for restoring from a savepoint is now in place, and I was
able to verify savepoint creation and launching from a savepoint for a simple
pipeline (the synthetic source example). With a more complex graph, restoring
the (unmodified) pipeline failed:
{code:java}
Failed to rollback to checkpoint/savepoint
s3://flink/savepoints/savepoint-57d263-5bc4a520fa08. Cannot map
checkpoint/savepoint state for operator 8bda33844c6e43c3df560a8701a3fa81 to the
new program, because the operator is not available in the new program. If you
want to allow to skip this, you can set the --allowNonRestoredState option on
the CLI.{code}
That may point to either unstable graph generation or the need to consistently
assign UIDs to all operators during translation:
[https://ci.apache.org/projects/flink/flink-docs-stable/ops/state/savepoints.html]
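To illustrate the second possibility, here is a minimal sketch of deriving an operator ID from a transform's unique name rather than from its position in the generated graph. The class and method names are hypothetical and are not part of Beam or Flink, and the choice of MD5 is only an assumption made to get a 32-character hex ID of the same shape as the one in the error above. It is not how the runner currently assigns IDs.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration only; not part of Beam or Flink.
final class StableOperatorUids {

    private StableOperatorUids() {}

    /**
     * Derives a 32-character hex ID from a transform's unique name.
     * The same name always yields the same ID, independent of the order
     * in which the graph happens to be translated.
     */
    static String uidFor(String transformName) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(transformName.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                // %02x renders each byte as two lowercase hex digits.
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 must be supported by every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }
}
```

In the Flink DataStream API, an explicit ID is set by calling uid(String) on an operator. A translator could presumably pass the transform name (or a value derived from it, as above) so that savepoint state maps to the same operator on every submission.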
> Flink portable runner savepoint / upgrade support
> -------------------------------------------------
>
> Key: BEAM-5396
> URL: https://issues.apache.org/jira/browse/BEAM-5396
> Project: Beam
> Issue Type: Improvement
> Components: runner-flink
> Reporter: Thomas Weise
> Assignee: Thomas Weise
> Priority: Major
> Labels: portability, portability-flink
> Time Spent: 2h 10m
> Remaining Estimate: 0h
>
> The portable Flink runner needs to support Flink savepoints for production
> use. It should be possible to upgrade a stateful portable Beam pipeline that
> runs on Flink, which involves taking a savepoint and then starting the new
> version of the pipeline from that savepoint. The potential issues with
> pipeline evolution and migration are similar to those when using the Flink
> DataStream API (schema / name changes etc.).