JingsongLi commented on issue #1438:
URL: https://github.com/apache/iceberg/issues/1438#issuecomment-693144323


   ## State compatibility
   Let's focus on @rdblue's concern first: the case where Iceberg is upgraded 
and can no longer read older state.
   Currently, as @stevenzwu said, only the `DataFile` is in Flink state, and it 
is serialized by Kryo. 
   Although Kryo does not depend on `serialVersionUID`, adding or removing 
fields will still cause compatibility problems.
   
   So I suggest that we implement a Flink serializer for `DataFile`.
   
https://ci.apache.org/projects/flink/flink-docs-master/dev/stream/state/custom_serialization.html#state-serializers-and-schema-evolution
   When restoring from savepoints, Flink allows changing the serializers used 
to read and write previously registered state, so that users are not locked in 
to any specific serialization schema.
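
   To make the idea concrete, here is a minimal sketch of a versioned binary 
encoding, which is the core of what a custom serializer would do. It is not 
Flink's `TypeSerializer` API and not Iceberg's `DataFile`: `DataFileRecord`, 
`VersionedDataFileCodec`, the chosen fields, and the version numbers are all 
hypothetical names picked for illustration. The point is only that a version 
header lets a newer reader decode state written before a field was added.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for the DataFile fields kept in state (not the Iceberg class).
final class DataFileRecord {
  final String path;
  final long recordCount;
  final long fileSizeInBytes; // assumed to be added in format version 2

  DataFileRecord(String path, long recordCount, long fileSizeInBytes) {
    this.path = path;
    this.recordCount = recordCount;
    this.fileSizeInBytes = fileSizeInBytes;
  }
}

// Hypothetical codec: every payload starts with a format version.
final class VersionedDataFileCodec {
  static final int CURRENT_VERSION = 2;
  static final long UNKNOWN_SIZE = -1L;

  // Writes the current (version 2) layout.
  static byte[] write(DataFileRecord record) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      out.writeInt(CURRENT_VERSION);
      out.writeUTF(record.path);
      out.writeLong(record.recordCount);
      out.writeLong(record.fileSizeInBytes);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray();
  }

  // Simulates state written by an older release (version 1, no file size).
  static byte[] writeV1(String path, long recordCount) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      out.writeInt(1);
      out.writeUTF(path);
      out.writeLong(recordCount);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray();
  }

  // Reads any supported version; fields missing from old payloads get a default.
  static DataFileRecord read(byte[] data) {
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
      int version = in.readInt();
      if (version < 1 || version > CURRENT_VERSION) {
        throw new IllegalStateException("Unsupported state version: " + version);
      }
      String path = in.readUTF();
      long recordCount = in.readLong();
      long size = version >= 2 ? in.readLong() : UNKNOWN_SIZE;
      return new DataFileRecord(path, recordCount, size);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

   With this layout, `VersionedDataFileCodec.read(...)` accepts both old and 
new payloads. In a real Flink serializer, the version check would presumably 
live in the serializer snapshot rather than in every record.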
   
   ## Job Topology compatibility
   This means we want to upgrade an existing Flink job to a new Iceberg 
version. As @stevenzwu said, the premise is that a Flink cluster runs 
different versions of the binaries. I don't think this is very important, but 
if we want to solve it, we can define custom serializers for all the Iceberg 
classes that are final members of `RowDataTaskWriterFactory`. It's going to be 
a little cumbersome, though. @stevenzwu 
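
   As a rough sketch of what "custom serializer for a member" could look like 
with plain Java serialization: the member is marked `transient` and written in 
a stable textual form through `writeObject`/`readObject`. `SchemaSketch` and 
`WriterFactorySketch` are hypothetical stand-ins, not Iceberg classes, and the 
JSON string is an assumed stable representation. Note the member cannot stay 
`final` with this approach, which is part of why it gets cumbersome.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class whose Java-serialized layout may differ between releases.
final class SchemaSketch {
  private final String json;

  private SchemaSketch(String json) {
    this.json = json;
  }

  static SchemaSketch fromJson(String json) {
    return new SchemaSketch(json);
  }

  String toJson() {
    return json;
  }
}

// Hypothetical factory that ships its member in a stable form instead of
// relying on the member's default Java serialization.
final class WriterFactorySketch implements Serializable {
  private static final long serialVersionUID = 1L;

  // transient: skipped by default serialization, handled manually below.
  private transient SchemaSketch schema;

  WriterFactorySketch(SchemaSketch schema) {
    this.schema = schema;
  }

  SchemaSketch schema() {
    return schema;
  }

  private void writeObject(ObjectOutputStream out) throws IOException {
    out.defaultWriteObject();
    out.writeUTF(schema.toJson()); // stable representation (assumption)
  }

  private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
    in.defaultReadObject();
    schema = SchemaSketch.fromJson(in.readUTF());
  }

  // Round-trips the factory through Java serialization, as job submission would.
  static WriterFactorySketch roundTrip(WriterFactorySketch factory) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(factory);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return (WriterFactorySketch) in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }
}
```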
   




