[ 
https://issues.apache.org/jira/browse/FLINK-38137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-38137:
-----------------------------------
    Labels: pull-request-available  (was: )

> RocksDB State Backend Null Serialization Causes NPE and Asymmetric 
> (De)Serialization Logic
> ------------------------------------------------------------------------------------------
>
>                 Key: FLINK-38137
>                 URL: https://issues.apache.org/jira/browse/FLINK-38137
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / State Backends
>            Reporter: Ramin Gharib
>            Priority: Major
>              Labels: pull-request-available
>
> The RocksDB state backend has a critical flaw in its handling of null values, 
> which can cause NullPointerExceptions and create unnecessary serialization 
> overhead.
> h3. Problem Description
> In AbstractRocksDBState, the 
> [serializeValueNullSensitive()|https://github.com/raminqaf/flink/blob/main/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/state/rocksdb/AbstractRocksDBState.java#L179-L179]
>  method writes a null flag and then calls the serializer unconditionally, 
> even when the value is null:
> {code:java}
> <T> byte[] serializeValueNullSensitive(T value, TypeSerializer<T> serializer)
>         throws IOException {
>     dataOutputView.clear();
>     // Write the null flag.
>     dataOutputView.writeBoolean(value == null);
>     // Always serialize, even if the value is null.
>     return serializeValueInternal(value, serializer);
> }
>
> private <T> byte[] serializeValueInternal(T value, TypeSerializer<T> serializer)
>         throws IOException {
>     // Can throw NPE if the value is null.
>     serializer.serialize(value, dataOutputView);
>     return dataOutputView.getCopyOfBuffer();
> }
> {code}
> This design has two major flaws:
>  # NPE Risk: The behavior depends on the TypeSerializer implementation. 
> Serializers not designed to handle null inputs will throw 
> NullPointerException.
>  # Asymmetric Logic: Serialization and deserialization do not mirror each 
> other:
>  #* Serialization: Writes the null flag, then always attempts to serialize 
> the value object (even if it is null).
>  #* Deserialization: Reads the null flag and, if it is set, returns null 
> immediately without attempting deserialization.
> h3. Evidence from Deserialization Code
> In 
> [RocksDBMapState|https://github.com/apache/flink/blob/bf1cd860617f7b51ac91516814c0e931e5bba241/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/state/rocksdb/RocksDBMapState.java#L413],
>  the deserialization logic shows this asymmetry:
> {code:java}
> private static <UV> UV deserializeValueNullSensitive(
>         byte[] rawValueBytes,
>         TypeSerializer<UV> valueSerializer)
>         throws IOException {
>     dataInputView.setBuffer(rawValueBytes);
>     boolean isNull = dataInputView.readBoolean();
>     // The value is never deserialized when the null flag is set.
>     return isNull ? null : valueSerializer.deserialize(dataInputView);
> }
> {code}
>  
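
The asymmetry described above can be reproduced outside Flink. The following is a minimal sketch, not the change proposed in the pull request: SimpleSerializer, NullSensitiveSketch, and the three method names are hypothetical stand-ins for TypeSerializer and the state backend methods, and plain java.io streams are assumed in place of Flink's own buffer views. It contrasts an unconditional write path with one that mirrors the read path by writing the payload only for non-null values.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical stand-in for Flink's TypeSerializer<T>; not the real interface.
interface SimpleSerializer<T> {
    void serialize(T value, DataOutputStream out) throws IOException;

    T deserialize(DataInputStream in) throws IOException;
}

final class NullSensitiveSketch {

    // A serializer that is not written to accept null input.
    static final SimpleSerializer<String> STRING_SERIALIZER =
            new SimpleSerializer<String>() {
                @Override
                public void serialize(String value, DataOutputStream out) throws IOException {
                    out.writeUTF(value); // throws NullPointerException for a null value
                }

                @Override
                public String deserialize(DataInputStream in) throws IOException {
                    return in.readUTF();
                }
            };

    // Mirrors the reported behavior: write the flag, then always call the serializer.
    static <T> byte[] serializeAlways(T value, SimpleSerializer<T> serializer)
            throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);
        out.writeBoolean(value == null);
        serializer.serialize(value, out);
        out.flush();
        return buffer.toByteArray();
    }

    // Symmetric variant: the payload is written only when the value is non-null.
    static <T> byte[] serializeNullSensitive(T value, SimpleSerializer<T> serializer)
            throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);
        out.writeBoolean(value == null);
        if (value != null) {
            serializer.serialize(value, out);
        }
        out.flush();
        return buffer.toByteArray();
    }

    // Read path as quoted in the report: a set flag short-circuits to null.
    static <T> T deserializeNullSensitive(byte[] bytes, SimpleSerializer<T> serializer)
            throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
        boolean isNull = in.readBoolean();
        return isNull ? null : serializer.deserialize(in);
    }

    public static void main(String[] args) throws IOException {
        byte[] nullBytes = serializeNullSensitive(null, STRING_SERIALIZER);
        System.out.println("bytes written for null: " + nullBytes.length);
        System.out.println(
                "round trip: "
                        + deserializeNullSensitive(
                                serializeNullSensitive("hello", STRING_SERIALIZER),
                                STRING_SERIALIZER));
        try {
            serializeAlways(null, STRING_SERIALIZER);
        } catch (NullPointerException e) {
            System.out.println("unconditional variant threw NullPointerException");
        }
    }
}
```

In this sketch a null value occupies exactly one byte (the flag) on the symmetric path, which is all the read path ever consumes for a null entry.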



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
