Evolving serializers and impact on flink managed states

Biplob Biswas Wed, 09 Aug 2017 08:20:52 -0700

Hi, 

We have a set of XSDs that define the schema for our data, and we use the
POJOs generated from them for serialization with the Flink serialization
stack. I am concerned about any evolution of our XSD schema, since it will
change the generated POJOs, which in turn are used to create the serdes. I
am also concerned about the corresponding behaviour of the managed state
(as it would be serialized using serializers defined over the old/new
POJOs).

In that regard, I read the section "Handling serializer upgrades and
compatibility"
<https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/stream/state.html#handling-serializer-upgrades-and-compatibility>

and found that there is a plan to do exactly that through state migration:
old data would be read with the old serializer and then serialized back
with the new serializer.
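
To make sure I understand the idea, here is how I picture that migration
step. This is only a minimal sketch in plain Java, assuming hand-written
byte layouts; the class StateMigrationSketch and all of its methods are
hypothetical names of mine and have nothing to do with Flink's actual
serializer interfaces.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Illustration only: hand-rolled serializers, NOT Flink's serializer API.
class StateMigrationSketch {

    // Old format: 8 bytes holding the id.
    static byte[] writeOld(long id) {
        return ByteBuffer.allocate(Long.BYTES).putLong(id).array();
    }

    static long readOldId(byte[] data) {
        return ByteBuffer.wrap(data).getLong();
    }

    // New format: the id, then a length-prefixed UTF-8 country string.
    static byte[] writeNew(long id, String country) {
        byte[] text = country.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(Long.BYTES + Integer.BYTES + text.length)
                .putLong(id)
                .putInt(text.length)
                .put(text)
                .array();
    }

    static long readNewId(byte[] data) {
        return ByteBuffer.wrap(data).getLong();
    }

    static String readNewCountry(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        buf.getLong();                        // skip the id
        byte[] text = new byte[buf.getInt()]; // length prefix
        buf.get(text);
        return new String(text, StandardCharsets.UTF_8);
    }

    // Migration: decode with the old format, fill in a default for the
    // field that did not exist before, re-encode with the new format.
    static byte[] migrate(byte[] oldBytes, String defaultCountry) {
        long id = readOldId(oldBytes);
        return writeNew(id, defaultCountry);
    }

    public static void main(String[] args) {
        byte[] migrated = migrate(writeOld(7L), "unknown");
        System.out.println(readNewId(migrated) + " " + readNewCountry(migrated));
    }
}
```

The point of the sketch is that the old bytes are only ever read with the
old format, so the old serializer has to remain available until every
state entry has been rewritten.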

Now I have a few questions regarding the same.

1. The article in question presumably assumes Flink's own serialization.
What if I use an Avro serde for serialization and deserialization instead?
If I create a savepoint of my job, stop Flink, load the new POJO and resume
from the savepoint, would Avro's schema evolution feature handle the
transition smoothly?
For example, when a new entity is inserted, would all the old records get a
default value for the entity they have no value for, and when an entity is
deleted, would its value simply be dropped?

2. If yes, how would this play out in the Flink ecosystem? If not, would
the future Flink serializer upgrades handle such cases (forward and
backward compatibility)?

3. Is managed state also stored and reloaded when savepoints are created
and used to resume a job?

4. When can one expect to have the state migration feature in Flink? In
1.4.0?
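
To make the example in question 1 concrete, this is the behaviour I am
hoping for, written as a minimal plain-Java sketch. It does not use Avro at
all: SchemaResolutionSketch, resolve and the field names are hypothetical,
an entity is modelled as a record field, and the "new schema" is reduced to
a map from field name to default value.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration of the expected resolution rule only; this is NOT Avro's API.
class SchemaResolutionSketch {

    // "readerDefaults" stands in for the new schema: it maps each field
    // name of the new schema to that field's declared default value.
    static Map<String, Object> resolve(Map<String, Object> written,
                                       Map<String, Object> readerDefaults) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> field : readerDefaults.entrySet()) {
            String name = field.getKey();
            // Field present in the old data: keep the stored value.
            // Field added in the new schema: fall back to its default.
            result.put(name, written.containsKey(name)
                    ? written.get(name)
                    : field.getValue());
        }
        // Fields that exist only in the old data are never copied,
        // which is what "simply dropped" means here.
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> oldRecord = new LinkedHashMap<>();
        oldRecord.put("id", 42L);
        oldRecord.put("legacyCode", "A1");   // deleted in the new schema

        Map<String, Object> newSchema = new LinkedHashMap<>();
        newSchema.put("id", 0L);
        newSchema.put("country", "unknown"); // inserted, with a default

        System.out.println(resolve(oldRecord, newSchema));
        // prints {id=42, country=unknown}
    }
}
```

The sketch assumes every inserted field declares a default. As far as I
understand Avro's schema resolution rules, a field added without a default
cannot be resolved against old data, so that case would fail rather than
be filled in.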

Thanks & Regards,
Biplob

--
View this message in context:
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Evolving-serializers-and-impact-on-flink-managed-states-tp14777.html
Sent from the Apache Flink User Mailing List archive at Nabble.com.
