[
https://issues.apache.org/jira/browse/SPARK-38787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Josh Rosen updated SPARK-38787:
-------------------------------
Labels: correctness (was: )
> Possible correctness issue on stream-stream join when handling edge case
> ------------------------------------------------------------------------
>
> Key: SPARK-38787
> URL: https://issues.apache.org/jira/browse/SPARK-38787
> Project: Spark
> Issue Type: Sub-task
> Components: Structured Streaming
> Affects Versions: 3.2.1
> Reporter: Anish Shrigondekar
> Assignee: Anish Shrigondekar
> Priority: Major
> Labels: correctness
> Fix For: 3.3.0, 3.2.2
>
>
> An NPE could occur in stream-stream join. SPARK-35659 fixed the issue
> “partially”: part of the fix ignores a null value at the last index when
> swapping elements in the list, so the null value at the last index is
> effectively dropped. If the null is caused by numValues being out of sync
> with the actual number of elements, this effectively works as a
> correction.
> Unfortunately, this opens the possibility of another “correctness” issue. The
> reason we swap in the value from the last index is to remove the value at
> the current index. Doing nothing here means “we don’t remove the value at
> the current index”, whereas the caller expects the value to have been
> dropped. For outer joins, such a value may already have been emitted as
> left/right null join output, and it can then be re-evaluated and emitted
> again.
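
The removal-by-swap behaviour described above can be sketched with a toy model. This is not Spark's state store code: the class `ToyKeyState` and its methods `remove_at` and `live_values` are hypothetical names made up for illustration, and the handling of the null slot is one reading of the description, assuming the null last slot is dropped and the counter decremented while the current index is left untouched.

```python
from typing import Dict, List, Optional


class ToyKeyState:
    """Toy model of one join key's values: index -> value, plus a counter."""

    def __init__(self, values: List[str], num_values: Optional[int] = None):
        self.store: Dict[int, str] = dict(enumerate(values))
        # num_values can be passed explicitly to simulate a counter that has
        # drifted out of sync with the values actually stored.
        self.num_values = len(values) if num_values is None else num_values

    def remove_at(self, index: int) -> None:
        """Remove by swapping in the last value; skip the swap if it is null."""
        last = self.num_values - 1
        last_value = self.store.pop(last, None)  # None models the null slot
        self.num_values -= 1
        if index != last and last_value is not None:
            self.store[index] = last_value  # overwriting removes the old value
        # If last_value is None and index != last, nothing is written, so the
        # value at `index` silently survives.

    def live_values(self) -> List[Optional[str]]:
        return [self.store.get(i) for i in range(self.num_values)]


healthy = ToyKeyState(["a", "b", "c"])
healthy.remove_at(0)  # "c" replaces "a"

drifted = ToyKeyState(["a", "b"], num_values=3)  # slot 2 is missing (null)
drifted.remove_at(0)  # caller expects "a" to be gone afterwards
```

In this toy run, `healthy` ends up holding "c" and "b", while `drifted` still holds "a" even though its removal was requested. That surviving value is what could be matched and emitted a second time.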
--
This message was sent by Atlassian Jira
(v8.20.10#820010)