[
https://issues.apache.org/jira/browse/FLINK-2713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14900380#comment-14900380
]
ASF GitHub Bot commented on FLINK-2713:
---------------------------------------
Github user senorcarbone commented on the pull request:
https://github.com/apache/flink/pull/1150#issuecomment-141907506
At first I found it a bit odd to include a serialized checkpointer on every
single state handle, but on a second look that may be the only way to
generalize operator states. The main problem is that a StreamOperatorState can
be defined dynamically at runtime, so we need to allow dynamic
checkpointers along with the operator states and include them in the state
handles.
An alternative, slightly more restrictive approach is to require the
user to pre-define all mappings from custom operator state names to
checkpointers, so that we can configure these once in the tasks themselves
(kept in the execution graph) instead of including them in each state handle.
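To make the alternative concrete, here is a minimal sketch of such a pre-defined mapping. It is not Flink code: `Checkpointer`, `TaskConfig`, `BufferCheckpointer` and `restoreAll` are hypothetical names chosen for illustration, and the sketch assumes that every custom state name is known before the job starts.

```java
import java.util.HashMap;
import java.util.Map;

class CheckpointerRegistrySketch {

    /** Hypothetical checkpointer: converts a state object to and from its snapshot form. */
    interface Checkpointer {
        Object snapshot(Object state);
        Object restore(Object snapshot);
    }

    /** Default for any state name without a registered custom checkpointer. */
    static final Checkpointer IDENTITY = new Checkpointer() {
        public Object snapshot(Object state) { return state; }
        public Object restore(Object snapshot) { return snapshot; }
    };

    /** Hypothetical custom checkpointer: stores a StringBuilder as a plain String. */
    static class BufferCheckpointer implements Checkpointer {
        public Object snapshot(Object state) { return state.toString(); }
        public Object restore(Object snapshot) { return new StringBuilder((String) snapshot); }
    }

    /** Task-level configuration, fixed once up front instead of travelling with each handle. */
    static class TaskConfig {
        private final Map<String, Checkpointer> byName = new HashMap<>();

        TaskConfig register(String stateName, Checkpointer checkpointer) {
            byName.put(stateName, checkpointer);
            return this;
        }

        Checkpointer lookup(String stateName) {
            return byName.getOrDefault(stateName, IDENTITY);
        }
    }

    /** Restores every named snapshot; the handles carry only the name and the snapshot. */
    static Map<String, Object> restoreAll(TaskConfig config, Map<String, Object> snapshots) {
        Map<String, Object> restored = new HashMap<>();
        for (Map.Entry<String, Object> e : snapshots.entrySet()) {
            restored.put(e.getKey(), config.lookup(e.getKey()).restore(e.getValue()));
        }
        return restored;
    }

    public static void main(String[] args) {
        TaskConfig config = new TaskConfig().register("buffer", new BufferCheckpointer());

        Map<String, Object> snapshots = new HashMap<>();
        snapshots.put("buffer", "abc"); // written by the custom checkpointer
        snapshots.put("count", 7L);     // no mapping registered, default applies

        Map<String, Object> restored = restoreAll(config, snapshots);
        System.out.println(restored.get("buffer").getClass().getSimpleName());
        System.out.println(restored.get("count"));
    }
}
```

The restriction shows up in `lookup`: a state name that was not registered up front silently falls back to the default, so states created dynamically at runtime could not bring their own checkpointer.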
Apart from this concern, the PR is well tested and documented!
Any other opinions?
> Custom StateCheckpointers should be included in the snapshots
> -------------------------------------------------------------
>
> Key: FLINK-2713
> URL: https://issues.apache.org/jira/browse/FLINK-2713
> Project: Flink
> Issue Type: Bug
> Components: Streaming
> Reporter: Gyula Fora
> Assignee: Gyula Fora
>
> Currently the restoreInitialState() call fails when the user uses a custom
> StateCheckpointer to create the snapshot, because the state is restored
> before the StateCheckpointer is set for the StreamOperatorState (the
> restoreInitialState() call precedes the open() call).
> To avoid this issue, the custom StateCheckpointer instance should be stored
> within the snapshot and should be set in the StreamOperatorState before
> calling restoreState(..).
> To reduce the overhead this induces, we can apply two optimizations:
> - We only include custom StateCheckpointers (the default Java serializer one
> is always available)
> - We only serialize the checkpointer once and store the byte array in the
> snapshot
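The proposal in the issue description, together with its two optimizations, can be sketched roughly as follows. This is an illustration under stated assumptions and not the code from the pull request: `Checkpointer`, `Handle`, `OperatorState`, `Counter` and `CounterCheckpointer` are hypothetical stand-ins, and a `null` byte array is used here to mean "default Java serialization, no custom checkpointer".

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class CheckpointerSnapshotSketch {

    /** Hypothetical checkpointer: maps a state object to and from a serializable snapshot. */
    interface Checkpointer extends Serializable {
        Serializable snapshot(Object state);
        Object restore(Serializable snapshot);
    }

    /** Example state that is not Serializable itself, so it needs a custom checkpointer. */
    static class Counter {
        long count;
        Counter(long count) { this.count = count; }
    }

    static class CounterCheckpointer implements Checkpointer {
        private static final long serialVersionUID = 1L;
        public Serializable snapshot(Object state) { return Long.valueOf(((Counter) state).count); }
        public Object restore(Serializable snapshot) { return new Counter((Long) snapshot); }
    }

    /** Hypothetical state handle: the snapshot plus the checkpointer bytes (null = default). */
    static class Handle implements Serializable {
        private static final long serialVersionUID = 1L;
        final Serializable snapshot;
        final byte[] checkpointerBytes;
        Handle(Serializable snapshot, byte[] checkpointerBytes) {
            this.snapshot = snapshot;
            this.checkpointerBytes = checkpointerBytes;
        }
    }

    static class OperatorState {
        Object value;
        private Checkpointer checkpointer; // null: the value is stored as-is
        private byte[] checkpointerBytes;  // serialized once, reused by every snapshot

        void setCheckpointer(Checkpointer custom) {
            checkpointer = custom;
            checkpointerBytes = (custom == null) ? null : toBytes(custom);
        }

        Handle snapshotState() {
            Serializable snap = (checkpointer == null)
                    ? (Serializable) value
                    : checkpointer.snapshot(value);
            return new Handle(snap, checkpointerBytes);
        }

        void restoreState(Handle handle) {
            // Install the checkpointer from the handle first, then restore the value with it.
            if (handle.checkpointerBytes != null) {
                checkpointer = (Checkpointer) fromBytes(handle.checkpointerBytes);
                checkpointerBytes = handle.checkpointerBytes;
            }
            value = (checkpointer == null) ? handle.snapshot : checkpointer.restore(handle.snapshot);
        }
    }

    static byte[] toBytes(Serializable obj) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    static Object fromBytes(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("could not deserialize", e);
        }
    }

    public static void main(String[] args) {
        OperatorState running = new OperatorState();
        running.setCheckpointer(new CounterCheckpointer());
        running.value = new Counter(42);

        // Ship the handle through Java serialization, as a stored snapshot would be.
        Handle shipped = (Handle) fromBytes(toBytes(running.snapshotState()));

        // Fresh state on recovery: nothing has set a checkpointer on it yet.
        OperatorState recovered = new OperatorState();
        recovered.restoreState(shipped);
        System.out.println(((Counter) recovered.value).count);
    }
}
```

Because the checkpointer travels inside the handle, the recovered state does not depend on anything that only happens later in open(). The cost is the extra byte array per handle, which is what the two optimizations are meant to keep small.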