Eric Marnadi created SPARK-54743:
------------------------------------
Summary: Fix NPE in getJoinedRows when skipNullsForStreamStreamJoins is enabled
Key: SPARK-54743
URL: https://issues.apache.org/jira/browse/SPARK-54743
Project: Spark
Issue Type: Improvement
Components: Structured Streaming
Affects Versions: 4.2.0
Reporter: Eric Marnadi
When skipNullsForStreamStreamJoins is enabled, null values can exist in the
state store. The getJoinedRows() method fails with a NullPointerException
(NPE) when:
# It encounters a null value while iterating over the stored values
# It tries to generate a joined row from the null value
# It attempts to update the matched flag by calling put() with the null value
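
The failure points above can be illustrated with a toy model. This is a minimal sketch in Python rather than Spark's Scala, and it is not the actual patch: ToyStore, ValueAndMatched, and get_joined_rows are hypothetical names, and None stands in for a JVM null (without the guard, the failure would surface here as an AttributeError instead of an NPE).

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List, Optional, Tuple


@dataclass
class ValueAndMatched:
    """Hypothetical stand-in for a stored value plus its matched flag."""
    value: str
    matched: bool = False


class ToyStore:
    """Hypothetical in-memory store; a slot may hold None (a null entry)."""

    def __init__(self, slots: List[Optional[ValueAndMatched]]) -> None:
        self.slots = slots

    def put(self, index: int, entry: ValueAndMatched) -> None:
        # The write path dereferences the entry, so it assumes a non-null value.
        self.slots[index] = ValueAndMatched(entry.value, entry.matched)


def get_joined_rows(
    store: ToyStore,
    probe: str,
    predicate: Callable[[str, str], bool],
) -> Iterator[Tuple[str, str]]:
    """Yield (probe, stored value) pairs, skipping null slots."""
    for index, entry in enumerate(store.slots):
        if entry is None:
            # Guard: skip the null before building a row or calling put().
            continue
        if predicate(probe, entry.value):
            if not entry.matched:
                # Write the entry back with its matched flag set.
                store.put(index, ValueAndMatched(entry.value, True))
            yield (probe, entry.value)
```

In this sketch the single check at the top of the loop covers all three cases, because a null slot never reaches the row-building or put() steps.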
This bug affects both state format versions 2 and 3, as they share the same
KeyWithIndexToValueRowConverterFormatV2 implementation. The only
difference between v2 and v3 is that v3 uses virtual column families.
Existing unit tests did not catch the issue because they exercise only the
get() method, not getJoinedRows(). The get() method simply returns values
without attempting to update matched flags or put values back to the store.
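
The coverage gap can be sketched the same way. The following is a hypothetical Python model, not Spark's test suite: the function names and the list-based state are assumptions made for illustration. It shows a read-only check passing on state that contains a null, while the join path fails unless it guards against the null.

```python
from typing import List, Optional

State = List[Optional[str]]


def get(state: State) -> State:
    """Read path: returns values untouched, so a None is never dereferenced."""
    return list(state)


def get_joined_rows_unguarded(state: State, probe: str) -> List[str]:
    """Join path with no null guard: builds a joined row from every slot."""
    return [probe + "|" + value for value in state]  # raises TypeError on None


def get_joined_rows_guarded(state: State, probe: str) -> List[str]:
    """Join path that skips null slots before building a row."""
    return [probe + "|" + value for value in state if value is not None]


state_with_null: State = ["x", None, "y"]

# A test that only reads the values passes even with a null in the state.
assert get(state_with_null) == ["x", None, "y"]

# Exercising the join path is what exposes the problem.
try:
    get_joined_rows_unguarded(state_with_null, "p")
    unguarded_failed = False
except TypeError:
    unguarded_failed = True
```

A regression test along these lines would need to call the join path on state that actually contains a null, since the read path alone cannot fail.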