[ https://issues.apache.org/jira/browse/SPARK-38787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jungtaek Lim resolved SPARK-38787.
----------------------------------
Fix Version/s: 3.3.0, 3.2.2
Resolution: Fixed
Issue resolved by pull request 36073
[https://github.com/apache/spark/pull/36073]
> Possible correctness issue on stream-stream join when handling edge case
> ------------------------------------------------------------------------
>
> Key: SPARK-38787
> URL: https://issues.apache.org/jira/browse/SPARK-38787
> Project: Spark
> Issue Type: Bug
> Components: Structured Streaming
> Affects Versions: 3.2.1
> Reporter: Anish Shrigondekar
> Priority: Major
> Fix For: 3.3.0, 3.2.2
>
>
> An NPE was previously reported in stream-stream join. SPARK-35659 fixed the
> issue “partially”: part of that fix ignores a null value read from the last
> index when swapping elements in the list, so the null value in the last index
> is effectively dropped. If the null is caused by numValues being out of sync
> with the actual number of elements, this effectively works as a correction.
> Unfortunately, this opens the possibility of another “correctness” issue. The
> reason we swap in the value from the last index is to remove the value at the
> current index. Doing nothing in that case means “we don’t remove the value at
> the current index”, whereas the caller expects the value to have been
> dropped. For outer joins, such a value may even be emitted as left/right
> null-join output and then be re-evaluated and emitted again.
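
The hazard described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyValueList`, `slots`, `num_values`, and `remove_at_ignoring_null` are hypothetical names invented for this illustration. It is not Spark's state store code and not the change made in pull request 36073. It models one key's values as an indexed list plus a separately tracked count, and shows what happens when removal by swap-with-last silently skips the swap because the last slot reads as null.

```python
class ToyValueList:
    """Hypothetical stand-in for the values stored under one join key."""

    def __init__(self, values):
        # index -> value; a missing index models a null read from the store
        self.slots = dict(enumerate(values))
        # tracked separately, so it can drift out of sync with `slots`
        self.num_values = len(values)

    def remove_at_ignoring_null(self, index):
        """Swap-with-last removal that does nothing if the last slot is null."""
        last = self.num_values - 1
        if index != last:
            last_value = self.slots.get(last)
            if last_value is not None:
                # normal path: overwrite the removed slot with the last value
                self.slots[index] = last_value
            # else: nothing is written, so the value at `index` survives
        self.slots.pop(last, None)
        self.num_values -= 1

    def values(self):
        return [self.slots.get(i) for i in range(self.num_values)]


# In-sync case: "a" is replaced by the last value and the list shrinks.
healthy = ToyValueList(["a", "b", "c"])
healthy.remove_at_ignoring_null(0)
print(healthy.values())  # ['c', 'b']

# Out-of-sync case: the count claims 4 values, but index 3 reads as null.
stale = ToyValueList(["a", "b", "c"])
stale.num_values = 4
stale.remove_at_ignoring_null(0)
print(stale.values())  # ['a', 'b', 'c'] -- "a" was supposed to be removed
```

In the out-of-sync case the count is corrected from 4 to 3, which matches the "correction" effect mentioned above, but the value the caller asked to remove is still present and could be evaluated again later.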