Ramin Gharib created FLINK-38137:
------------------------------------
Summary: RocksDB State Backend Null Serialization Causes NPE and
Asymmetric (De)Serialization Logic
Key: FLINK-38137
URL: https://issues.apache.org/jira/browse/FLINK-38137
Project: Flink
Issue Type: Bug
Components: Runtime / State Backends
Reporter: Ramin Gharib
The RocksDB state backend has a critical flaw in its handling of null values,
which can cause NullPointerExceptions and create unnecessary serialization
overhead.
h3. Problem Description
In AbstractRocksDBState, the
[serializeValueNullSensitive()|https://github.com/raminqaf/flink/blob/main/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/state/rocksdb/AbstractRocksDBState.java#L179-L179]
method unconditionally calls the serializer even when the value is null:
{code:java}
<T> byte[] serializeValueNullSensitive(T value, TypeSerializer<T> serializer)
        throws IOException {
    dataOutputView.clear();
    dataOutputView.writeBoolean(value == null); // Write null flag
    // Always serialize, even if null
    return serializeValueInternal(value, serializer);
}

private <T> byte[] serializeValueInternal(T value, TypeSerializer<T> serializer)
        throws IOException {
    // Can throw NPE if value is null
    serializer.serialize(value, dataOutputView);
    return dataOutputView.getCopyOfBuffer();
}
{code}
This design has two major flaws:
# NPE Risk: The behavior depends on the TypeSerializer implementation.
Serializers not designed to handle null inputs will throw
NullPointerException.
# Asymmetric Logic: There's a critical mismatch between serialization and
deserialization:
#* Serialization: Writes the null flag, then always attempts to serialize the
value object (even if it is null)
#* Deserialization: Reads the null flag and, if it is true, immediately returns
null without attempting deserialization
h3. Evidence from Deserialization Code
In
[RocksDBMapState|https://github.com/apache/flink/blob/bf1cd860617f7b51ac91516814c0e931e5bba241/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/state/rocksdb/RocksDBMapState.java#L413],
the deserialization logic shows this asymmetry:
{code:java}
private static <UV> UV deserializeValueNullSensitive(
        byte[] rawValueBytes,
        TypeSerializer<UV> valueSerializer)
        throws IOException {
    dataInputView.setBuffer(rawValueBytes);
    boolean isNull = dataInputView.readBoolean();
    // Never deserializes if null
    return isNull ? null : valueSerializer.deserialize(dataInputView);
}
{code}
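One possible way to make the write path mirror the read path is to skip the serializer call whenever the null flag is set. The following is only a minimal, self-contained sketch of that idea, not the actual Flink change: {{SimpleSerializer}}, {{STRING_SERIALIZER}}, and the class name {{NullSensitiveSketch}} are hypothetical stand-ins, and plain {{java.io}} streams are used in place of Flink's {{TypeSerializer}} and data views.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class NullSensitiveSketch {

    /** Hypothetical stand-in for Flink's TypeSerializer; not the real interface. */
    public interface SimpleSerializer<T> {
        void serialize(T value, DataOutputStream out) throws IOException;

        T deserialize(DataInputStream in) throws IOException;
    }

    /** A serializer that does not tolerate null input, like many real serializers. */
    public static final SimpleSerializer<String> STRING_SERIALIZER =
            new SimpleSerializer<String>() {
                @Override
                public void serialize(String value, DataOutputStream out) throws IOException {
                    out.writeUTF(value); // throws NullPointerException if value is null
                }

                @Override
                public String deserialize(DataInputStream in) throws IOException {
                    return in.readUTF();
                }
            };

    /** Writes the null flag, then the payload only when the value is non-null. */
    public static <T> byte[] serializeValueNullSensitive(T value, SimpleSerializer<T> serializer)
            throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);
        boolean isNull = value == null;
        out.writeBoolean(isNull);
        if (!isNull) {
            serializer.serialize(value, out); // never invoked with null
        }
        out.flush();
        return buffer.toByteArray();
    }

    /** Reads the null flag, then the payload only when the flag is false. */
    public static <T> T deserializeValueNullSensitive(byte[] raw, SimpleSerializer<T> serializer)
            throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(raw));
        boolean isNull = in.readBoolean();
        return isNull ? null : serializer.deserialize(in);
    }

    public static void main(String[] args) throws IOException {
        byte[] nullBytes = serializeValueNullSensitive(null, STRING_SERIALIZER);
        byte[] helloBytes = serializeValueNullSensitive("hello", STRING_SERIALIZER);
        System.out.println(nullBytes.length);
        System.out.println(deserializeValueNullSensitive(nullBytes, STRING_SERIALIZER));
        System.out.println(deserializeValueNullSensitive(helloBytes, STRING_SERIALIZER));
    }
}
```

In this sketch a null value is stored as the single flag byte, so the serializer is never handed a null and both directions branch on the same flag. Whether the real fix should also stay compatible with bytes already written by null-tolerant serializers is a separate question that the sketch does not address.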