There was a thread about coder update in the past here[1]. Also, Reuven sent out a doc[2] about pipeline drain and update which was discussed in this thread[3]. I believe there have been more references to pipeline update in other threads when people tried to change coder encodings in the past as well.
Reuven/Dan are the best contacts about this on how this works inside of Google, the limitations and other ideas that had been proposed. 1: https://lists.apache.org/thread.html/f3b2daa740075cc39dc04f08d1eaacfcc2d550cecc04e27be024cf52@%3Cdev.beam.apache.org%3E 2: https://docs.google.com/document/d/1NExwHlj-2q2WUGhSO4jTu8XGhDPmm3cllSN8IMmWci8/edit# 3: https://lists.apache.org/thread.html/37c9aa4aa5011801a7f060bf3f53e687e539fa6154cc9f1c544d4f7a@%3Cdev.beam.apache.org%3E On Wed, May 8, 2019 at 11:45 AM Maximilian Michels <[email protected]> wrote: > Hi, > > I'm looking into updating the Flink Runner to Flink version 1.8. Since > version 1.7 Flink has a new optional interface for Coder evolution*. > > When a Flink pipeline is checkpointed, CoderSnapshots are written out > alongside with the checkpointed data. When the pipeline is restored from > that checkpoint, the CoderSnapshots are restored and used to > reinstantiate the Coders. > > Furthermore, there is a compatibility and migration check between the > old and the new Coder. This allows to determine whether > > - The serializer did not change or is compatible (ok) > - The serialization format of the coder changed (ok after migration) > - The coder needs to be reconfigured and we know how to that based on > the old version (ok after reconfiguration) > - The coder is incompatible (error) > > I was wondering about the Coder evolution story in Beam. The current > state is that checkpointed Beam pipelines are only guaranteed to run > with the same Beam version and pipeline version. A newer version of > either might break the checkpoint format without any way to migrate the > state. > > Should we start thinking about supporting Coder evolution in Beam? > > Thanks, > Max > > > * Coders are called TypeSerializers in Flink land. The interface is > TypeSerializerSnapshot. >
