Hi Niels,

which version of Flink are you using? Currently, Flink does not support to
upgrade the TypeSerializer itself, if I'm not mistaken. As you've
described, it will try to use the old serializer stored in the checkpoint
stream to restore state.

I've pulled Gordon into the conversation who can tell you a little bit more
about the current capability and limitations of state evolution.


On Mon, Feb 19, 2018 at 4:14 PM, Niels <nielsdenis...@gmail.com> wrote:

> Hi all,
> I'm currently trying to use Avro in order to evolve our data present in
> Flink's Managed State. I've extended the TypeSerializer class successfully
> for this purpose, but still have issues using Schema Evolution.
> *The problem:*
> When we try to read data (deserialize from savepoint) with a new serialiser
> and a new schema, Flink seems to use the old schema of the old serializer
> (written to the savepoint). This results in an old GenericRecord that
> doesn't adhere to the new Avro schema.
> *What seems to happen to me is the following* (Say we evolve from dataV1 to
> dataV2):
> - State containing dataV1 is serialized with avro schema V1 to a
> check/savepoint. Along with the data, the serializer itself is written.
> - Upon restore, the old serializer is retrieved from the data (therefore
> needs to be on the classpath). Data is restored using this old serializer.
> The new serializer provided is only used for writes.
> If this is indeed the case it explains our aforementioned problem. If you
> have any pointers as to whether this is true and what a possible solution
> would be that would be very much appreciated!
> Thanks!
> Niels
> --
> Sent from: http://apache-flink-user-mailing-list-archive.2336050.
> n4.nabble.com/

Reply via email to