Till Rohrmann created FLINK-6763:
------------------------------------

             Summary: Inefficient PojoSerializerConfigSnapshot serialization format
                 Key: FLINK-6763
                 URL: https://issues.apache.org/jira/browse/FLINK-6763
             Project: Flink
          Issue Type: Improvement
          Components: State Backends, Checkpointing, Type Serialization System
    Affects Versions: 1.3.0, 1.4.0
            Reporter: Till Rohrmann


For each serializer, the {{PojoSerializerConfigSnapshot}} stores the beginning 
and ending offsets of that serializer in the serialization stream. These 
offsets are written even if the serialized serializer is supposed to be 
ignored. They are stored as a sequence of integers at the beginning of the 
serialization stream, and they exist so that broken serializers can be skipped 
when reading.

I don't think we need both offsets. Instead, I suggest writing the length of 
the serialized serializer into the serialization stream first, followed by the 
serialized serializer itself. This can be done in 
{{TypeSerializerSerializationUtil.writeSerializer}}. When reading the 
serializer via {{TypeSerializerSerializationUtil.tryReadSerializer}}, we can 
try to deserialize it. If this operation fails, we can skip the bytes of the 
serialized serializer because we know its length.
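
A minimal sketch of the length-prefixed format proposed above. It is not 
Flink's implementation: the class {{LengthPrefixedSketch}} and the methods 
{{writeWithLength}} and {{tryRead}} are hypothetical names, and plain Java 
serialization stands in for the actual serializer serialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical illustration only; these names are not part of Flink.
class LengthPrefixedSketch {

    /** Writes the byte length of the serialized object, then the bytes themselves. */
    static void writeWithLength(DataOutputStream out, Serializable obj) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(buffer)) {
            oos.writeObject(obj);
        }
        byte[] bytes = buffer.toByteArray();
        out.writeInt(bytes.length); // length prefix replaces the start/end offsets
        out.write(bytes);
    }

    /**
     * Reads one length-prefixed entry. Returns null if the entry cannot be
     * deserialized. Because the whole entry is consumed up front using the
     * length prefix, the stream is positioned at the next entry either way.
     */
    static Object tryRead(DataInputStream in) throws IOException {
        int length = in.readInt();
        byte[] bytes = new byte[length];
        in.readFully(bytes); // consume exactly this entry's bytes
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return ois.readObject();
        } catch (IOException | ClassNotFoundException e) {
            return null; // broken entry: already skipped, caller can continue
        }
    }
}
```

With this layout, a reader that hits a broken entry between two valid ones 
still recovers the entries on either side, and no separate offset table at the 
beginning of the stream is needed.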



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
