[jira] [Commented] (FLINK-9376) Allow upgrading to incompatible state serializers (state schema evolution)

ASF GitHub Bot (JIRA) Fri, 13 Jul 2018 02:25:42 -0700


    [ 
https://issues.apache.org/jira/browse/FLINK-9376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542769#comment-16542769
 ]


ASF GitHub Bot commented on FLINK-9376:
---------------------------------------

Github user StefanRRichter commented on a diff in the pull request:

    https://github.com/apache/flink/pull/6325#discussion_r202285840
  
    --- Diff: 
flink-core/src/main/java/org/apache/flink/api/common/typeutils/TypeSerializerConfigSnapshot.java
 ---
    @@ -99,7 +99,8 @@
                // TODO this method actually should not have a default 
implementation;
                // TODO this placeholder should be removed as soon as all 
subclasses have a proper implementation in place, and
                // TODO the method is properly integrated in state backends' 
restore procedures
    -           throw new UnsupportedOperationException();
    +//         throw new UnsupportedOperationException();
    --- End diff --
    
    Remove this line.


> Allow upgrading to incompatible state serializers (state schema evolution)
> --------------------------------------------------------------------------
>
>                 Key: FLINK-9376
>                 URL: https://issues.apache.org/jira/browse/FLINK-9376
>             Project: Flink
>          Issue Type: New Feature
>          Components: State Backends, Checkpointing, Type Serialization System
>            Reporter: Tzu-Li (Gordon) Tai
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 1.6.0
>
>
> Currently, users have access to upgrade state serializers on the restore run 
> of a stateful job, as long as the upgraded new serializer remains backwards 
> compatible with all previous written data in the savepoint (i.e. it can read 
> all previous and current schema of serialized state objects).
> What is still lacking is the ability to upgrade to incompatible serializers. 
> Upon being registered an incompatible serializer for existing restored state, 
> that state needs to go through the process of -
>  1. read serialized state with the previous serializer
>  2. passing each deserialized state object through a “migration map 
> function”, and
>  3. writing back the state with the new serializer
> The availability of this process should be strictly limited to state 
> registrations that occur before the actual processing begins (e.g. in the 
> {{open}} or {{initializeState}} methods), so that we avoid performing these 
> operations during processing.
> How this procedure actually occurs, differs across different types of state 
> backends.
> For example, for state backends that eagerly deserialize / lazily serialize 
> state (e.g. {{HeapStateBackend}}), the job execution itself can be seen as a 
> "migration"; everything is deserialized to state objects on restore, and is 
> only serialized again, with the new serializer, on checkpoints.
> Therefore, for these state backends, the above process is irrelevant.
> On the other hand, for state backends that lazily deserialize / eagerly 
> serialize state (e.g. {{RocksDBStateBackend}}), the state evolution process 
> needs to happen for every state with a newly registered incompatible 
> serializer.
> Procedure 2. will allow even state type migrations, but that is out-of-scope 
> of this JIRA.
>  This ticket focuses only on procedures 1. and 3., where we try to enable 
> schema evolution without state type changes.
> This is an umbrella JIRA ticket that overlooks this feature, including a few 
> preliminary tasks that work towards enabling it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (FLINK-9376) Allow upgrading to incompatible state serializers (state schema evolution)

Reply via email to