Eric Marnadi created SPARK-54743:
------------------------------------

             Summary: Fix NPE in getJoinedRows when skipNullsForStreamStreamJoins is enabled
                 Key: SPARK-54743
                 URL: https://issues.apache.org/jira/browse/SPARK-54743
             Project: Spark
          Issue Type: Improvement
          Components: Structured Streaming
    Affects Versions: 4.2.0
            Reporter: Eric Marnadi


When skipNullsForStreamStreamJoins is enabled, null values can exist in the 
state store. The getJoinedRows() method does not guard against them and fails
with a NullPointerException (NPE) when:
 # It encounters a null value while iterating
 # It tries to generate a joined row from the null value
 # It attempts to update the matched flag by calling put() with the null value
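
The three failure points above can be sketched with a toy model. This is not Spark code: ToyStore, joinedRows, and the use of plain strings for stored rows are hypothetical stand-ins for the state store and getJoinedRows(). The sketch only illustrates one plausible guard, assuming the intended behavior is to skip a null entry before a joined row is built or the matched flag is written back.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only; names are hypothetical and do not mirror Spark's internals.
class NullSafeJoinSketch {

    // Stand-in for the values stored under one join key; entries may be null.
    static final class ToyStore {
        final List<String> values;
        final boolean[] matched;

        ToyStore(String... vals) {
            this.values = new ArrayList<>(Arrays.asList(vals));
            this.matched = new boolean[vals.length];
        }

        // Stand-in for put(): rejects null values, like the failing call above.
        void put(int index, String value, boolean isMatched) {
            if (value == null) {
                throw new NullPointerException("cannot put a null value");
            }
            values.set(index, value);
            matched[index] = isMatched;
        }
    }

    // Joins every non-null stored value with the probe row and marks it matched.
    static List<String> joinedRows(ToyStore store, String probe) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < store.values.size(); i++) {
            String value = store.values.get(i);
            if (value == null) {
                continue; // skip: no joined row, no matched-flag update
            }
            out.add(probe + "|" + value);
            store.put(i, value, true);
        }
        return out;
    }
}
```

With stored values "a", null, "b" and probe "p", the loop produces joined rows only for "a" and "b" and never calls put() with the null entry.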

This bug affects both state format versions 2 and 3, as they share the same 
KeyWithIndexToValueRowConverterFormatV2 implementation. The only
difference between v2 and v3 is that v3 uses virtual column families.

The issue was not caught by existing unit tests because they exercise only the 
get() method, not getJoinedRows(). The get() method simply returns values 
without attempting to update matched flags or put values back into the store.
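
Why a get()-only test misses the bug can be shown with a second toy sketch. Again, this is an illustration under stated assumptions, not Spark code: readOnly and writeBackThrows are hypothetical names, and appending "+matched" to a string stands in for rewriting a stored value with its matched flag.

```java
import java.util.List;

// Toy model only; names are hypothetical and do not mirror Spark's internals.
class ReadVsWriteBackSketch {

    // Read-only path: hands back whatever is stored, nulls included.
    static List<String> readOnly(List<String> stored) {
        return stored;
    }

    // Write-back path without a null check: dereferences each value to
    // rebuild it with a matched marker, so a null entry throws.
    static boolean writeBackThrows(List<String> stored) {
        try {
            for (int i = 0; i < stored.size(); i++) {
                stored.set(i, stored.get(i).concat("+matched"));
            }
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }
}
```

For a list holding "a" and null, readOnly returns both entries without error, while writeBackThrows reports an NPE. A test that covers only the read-only path therefore passes even though the write-back path is broken.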



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
