[ https://issues.apache.org/jira/browse/FLINK-38137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated FLINK-38137:
-----------------------------------
    Labels: pull-request-available  (was: )

> RocksDB State Backend Null Serialization Causes NPE and Asymmetric (De)Serialization Logic
> ------------------------------------------------------------------------------------------
>
>                 Key: FLINK-38137
>                 URL: https://issues.apache.org/jira/browse/FLINK-38137
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / State Backends
>            Reporter: Ramin Gharib
>            Priority: Major
>              Labels: pull-request-available
>
> The RocksDB state backend has a critical flaw in its handling of null values,
> which can cause NullPointerExceptions and create unnecessary serialization
> overhead.
> h3. Problem Description
> In [AbstractRocksDBState, the serializeValueNullSensitive()|https://github.com/raminqaf/flink/blob/main/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/state/rocksdb/AbstractRocksDBState.java#L179-L179]
> method unconditionally calls the serializer even when the value is null:
> {code:java}
> <T> byte[] serializeValueNullSensitive(T value, TypeSerializer<T> serializer)
>         throws IOException {
>     dataOutputView.clear();
>     dataOutputView.writeBoolean(value == null); // Write null flag
>     return serializeValueInternal(value, serializer); // Always serialize, even if null
> }
> private <T> byte[] serializeValueInternal(T value, TypeSerializer<T> serializer)
>         throws IOException {
>     serializer.serialize(value, dataOutputView); // Can throw NPE if value is null
>     return dataOutputView.getCopyOfBuffer();
> }
> {code}
> This design has two major flaws:
> # NPE Risk: The behavior becomes dependent on the TypeSerializer
> implementation. Serializers not designed to handle null inputs will throw
> NullPointerException.
> # Asymmetric Logic: There's a critical mismatch between serialization and
> deserialization:
> ** Serialization: Writes null flag + always attempts to serialize the value
> object (even if null)
> ** Deserialization: Reads null flag + immediately returns null without
> attempting deserialization if flag is true
> h3. Evidence from Deserialization Code
> In [RocksDBMapState|https://github.com/apache/flink/blob/bf1cd860617f7b51ac91516814c0e931e5bba241/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/state/rocksdb/RocksDBMapState.java#L413],
> the deserialization logic shows this asymmetry:
> {code:java}
> private static <UV> UV deserializeValueNullSensitive(
>         byte[] rawValueBytes,
>         TypeSerializer<UV> valueSerializer)
>         throws IOException {
>     dataInputView.setBuffer(rawValueBytes);
>     boolean isNull = dataInputView.readBoolean();
>     return isNull ? null : valueSerializer.deserialize(dataInputView); // Never deserializes if null
> }
> {code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
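
For illustration only, here is a minimal self-contained sketch of what symmetric null handling could look like: the write path emits the null flag and calls the serializer only for non-null values, mirroring the read path quoted in the issue. It is not the change proposed for FLINK-38137. The names SimpleSerializer, NullSensitiveSketch, and STRING_SERIALIZER are hypothetical, and plain java.io streams stand in for Flink's TypeSerializer and data views.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical stand-in for Flink's TypeSerializer; not the real interface.
interface SimpleSerializer<T> {
    void serialize(T value, DataOutputStream out) throws IOException;

    T deserialize(DataInputStream in) throws IOException;
}

final class NullSensitiveSketch {

    // Example serializer that, like many real ones, does not tolerate null input.
    static final SimpleSerializer<String> STRING_SERIALIZER = new SimpleSerializer<String>() {
        @Override
        public void serialize(String value, DataOutputStream out) throws IOException {
            out.writeUTF(value); // throws NullPointerException when value is null
        }

        @Override
        public String deserialize(DataInputStream in) throws IOException {
            return in.readUTF();
        }
    };

    // Write path: null flag first, then the payload only when a value exists.
    static <T> byte[] serializeNullSensitive(T value, SimpleSerializer<T> serializer)
            throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);
        out.writeBoolean(value == null);
        if (value != null) {
            serializer.serialize(value, out); // never reached for null values
        }
        out.flush();
        return buffer.toByteArray();
    }

    // Read path: same shape as the deserialization logic quoted in the issue.
    static <T> T deserializeNullSensitive(byte[] bytes, SimpleSerializer<T> serializer)
            throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
        boolean isNull = in.readBoolean();
        return isNull ? null : serializer.deserialize(in);
    }

    public static void main(String[] args) throws IOException {
        byte[] nullBytes = serializeNullSensitive(null, STRING_SERIALIZER);
        System.out.println(nullBytes.length); // prints 1 (only the flag byte)
        System.out.println(deserializeNullSensitive(nullBytes, STRING_SERIALIZER)); // prints null

        byte[] helloBytes = serializeNullSensitive("hello", STRING_SERIALIZER);
        System.out.println(deserializeNullSensitive(helloBytes, STRING_SERIALIZER)); // prints hello
    }
}
```

In this sketch a null value is stored as the single flag byte, and a serializer that rejects null (here, one built on writeUTF) is never handed one, so the write and read paths consume and produce the same bytes.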