[ 
https://issues.apache.org/jira/browse/FLINK-8715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16406943#comment-16406943
 ] 

Stephan Ewen commented on FLINK-8715:
-------------------------------------

[~aljoscha] I think you are right, the problem is potentially deeper.

All parts of the code that obtain a serializer from a state descriptor need to 
obey reconfiguration based on the serializer config snapshot, or at least all 
parts that touch data serialized by earlier versions. That includes queryable 
state!

I think the best way to fix this in the long run would be to never have to 
use the old serializer again. If the serializer has changed, convert the state 
on restore; after that, the new serializer is safe to use. Everything else 
seems like a deep rabbit hole.
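
A minimal sketch of the convert-on-restore idea, to make the intent concrete. 
It does not use Flink's TypeSerializer; the Serializer interface, the OLD and 
NEW formats, and convertOnRestore are hypothetical names chosen only for 
illustration. The point is that every restored entry is read once with the old 
serializer and rewritten with the new one, so nothing after restore needs the 
old serializer.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class ConvertOnRestoreSketch {

    /** Hypothetical minimal serializer; NOT Flink's TypeSerializer. */
    interface Serializer<T> {
        byte[] serialize(T value);
        T deserialize(byte[] bytes);
    }

    /** Assumed "old" format: the integer written as decimal text. */
    static final Serializer<Integer> OLD = new Serializer<Integer>() {
        @Override
        public byte[] serialize(Integer value) {
            return Integer.toString(value).getBytes(StandardCharsets.UTF_8);
        }

        @Override
        public Integer deserialize(byte[] bytes) {
            return Integer.parseInt(new String(bytes, StandardCharsets.UTF_8));
        }
    };

    /** Assumed "new" format: four big-endian bytes. */
    static final Serializer<Integer> NEW = new Serializer<Integer>() {
        @Override
        public byte[] serialize(Integer value) {
            return ByteBuffer.allocate(4).putInt(value).array();
        }

        @Override
        public Integer deserialize(byte[] bytes) {
            return ByteBuffer.wrap(bytes).getInt();
        }
    };

    /** Rewrites each restored entry once, so only the new serializer is needed afterwards. */
    static <T> List<byte[]> convertOnRestore(
            List<byte[]> restored, Serializer<T> oldSerializer, Serializer<T> newSerializer) {
        List<byte[]> converted = new ArrayList<>(restored.size());
        for (byte[] entry : restored) {
            converted.add(newSerializer.serialize(oldSerializer.deserialize(entry)));
        }
        return converted;
    }

    public static void main(String[] args) {
        // Pretend this list is state written by an earlier job version.
        List<byte[]> snapshot = new ArrayList<>();
        snapshot.add(OLD.serialize(7));
        snapshot.add(OLD.serialize(-42));

        // After conversion, all reads go through NEW only.
        List<byte[]> state = convertOnRestore(snapshot, OLD, NEW);
        for (byte[] entry : state) {
            System.out.println(NEW.deserialize(entry));
        }
    }
}
```

The cost is a full rewrite of the state on restore, which is the trade-off 
against reconfiguring the serializer in place.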

The only worthwhile exception to this may be Kryo, when the tag-to-class 
mapping changes. In that case we really want the new serializer to reconfigure 
itself rather than convert the entire data set.

If we really want to handle that, then we would indeed have to either
  - reconfigure the state descriptors
  - or never use the state descriptors directly in internal code (for default 
values, etc.) and instead always use the serializers, suppliers, etc. obtained 
from the state descriptors.
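
A rough sketch of the second option, assuming a toy model rather than Flink's 
real classes: TagSerializer, Descriptor, and Backend are hypothetical names, 
and the tag-to-class map only loosely imitates what Kryo keeps. The backend 
reconfigures the serializer once on restore, stores it under the state name, 
and internal code asks the backend for it instead of reading it from the 
descriptor.

```java
import java.util.HashMap;
import java.util.Map;

public class ReconfiguredSerializerSketch {

    /** Hypothetical tag-based serializer, loosely modeled on a tag-to-class mapping. */
    static final class TagSerializer {
        private final Map<Integer, String> tagToClass;

        TagSerializer(Map<Integer, String> tagToClass) {
            this.tagToClass = new HashMap<>(tagToClass);
        }

        /** Returns a serializer that interprets tags the way the snapshot wrote them. */
        TagSerializer reconfigure(Map<Integer, String> snapshotTags) {
            return new TagSerializer(snapshotTags);
        }

        String classForTag(int tag) {
            return tagToClass.get(tag);
        }
    }

    /** Hypothetical descriptor: it only knows its own, un-reconfigured serializer. */
    static final class Descriptor {
        final String name;
        final TagSerializer serializer;

        Descriptor(String name, TagSerializer serializer) {
            this.name = name;
            this.serializer = serializer;
        }
    }

    /** Hypothetical backend that owns the reconfigured serializers, keyed by state name. */
    static final class Backend {
        private final Map<String, TagSerializer> registered = new HashMap<>();

        /** Reconfigures the descriptor's serializer against the snapshot and remembers the result. */
        void restore(Descriptor descriptor, Map<Integer, String> snapshotTags) {
            registered.put(descriptor.name, descriptor.serializer.reconfigure(snapshotTags));
        }

        /** Internal code calls this instead of reading descriptor.serializer directly. */
        TagSerializer serializerFor(Descriptor descriptor) {
            return registered.getOrDefault(descriptor.name, descriptor.serializer);
        }
    }

    public static void main(String[] args) {
        Map<Integer, String> snapshotTags = new HashMap<>();
        snapshotTags.put(10, "com.example.OldEvent");
        Map<Integer, String> freshTags = new HashMap<>();
        freshTags.put(10, "com.example.NewEvent");

        Descriptor descriptor = new Descriptor("my-state", new TagSerializer(freshTags));
        Backend backend = new Backend();
        backend.restore(descriptor, snapshotTags);

        // The descriptor still carries the fresh mapping; the backend returns the reconfigured one.
        System.out.println(descriptor.serializer.classForTag(10));
        System.out.println(backend.serializerFor(descriptor).classForTag(10));
    }
}
```

In this toy model the descriptor's own serializer stays untouched, which is 
why any code path that reads it directly would still see the wrong mapping.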


> RocksDB does not propagate reconfiguration of serializer to the states
> ----------------------------------------------------------------------
>
>                 Key: FLINK-8715
>                 URL: https://issues.apache.org/jira/browse/FLINK-8715
>             Project: Flink
>          Issue Type: Bug
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.3.2
>            Reporter: Arvid Heise
>            Priority: Blocker
>             Fix For: 1.5.0
>
>
> Any changes to the serializer done in #ensureCompatibility are lost during 
> state creation.
> In particular, 
> [https://github.com/apache/flink/blob/master/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBValueState.java#L68]
>  always uses a fresh copy of the StateDescriptor.
> An easy fix is to pass the reconfigured serializer as an additional parameter 
> in 
> [https://github.com/apache/flink/blob/master/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBKeyedStateBackend.java#L1681]
>  , which can be retrieved through the side-output of getColumnFamily
> {code:java}
> kvStateInformation.get(stateDesc.getName()).f1.getStateSerializer()
> {code}
> I encountered it in 1.3.2 but the code in the master seems unchanged (hence 
> the pointer into master). I encountered it in ValueState, but I suspect the 
> same issue can be observed for all kinds of RocksDB states.


