Hi,

I'm looking into updating the Flink Runner to Flink version 1.8. Since version 1.7 Flink has a new optional interface for Coder evolution*.

When a Flink pipeline is checkpointed, CoderSnapshots are written out alongside with the checkpointed data. When the pipeline is restored from that checkpoint, the CoderSnapshots are restored and used to reinstantiate the Coders.

Furthermore, there is a compatibility and migration check between the old and the new Coder. This allows to determine whether

 - The serializer did not change or is compatible (ok)
 - The serialization format of the coder changed (ok after migration)
 - The coder needs to be reconfigured and we know how to that based on
   the old version (ok after reconfiguration)
 - The coder is incompatible (error)

I was wondering about the Coder evolution story in Beam. The current state is that checkpointed Beam pipelines are only guaranteed to run with the same Beam version and pipeline version. A newer version of either might break the checkpoint format without any way to migrate the state.

Should we start thinking about supporting Coder evolution in Beam?

Thanks,
Max


* Coders are called TypeSerializers in Flink land. The interface is TypeSerializerSnapshot.

Reply via email to