[
https://issues.apache.org/jira/browse/SPARK-16633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15385457#comment-15385457
]
Yin Huai commented on SPARK-16633:
----------------------------------
It seems that OffsetWindowFunctionFrame cannot distinguish between the case
where the offset row does not exist and the case where the offset row exists
but its value is null.
For example,
{code}
SELECT
row_number() OVER (ORDER BY id) as row_number,
lead(id, 1, 321) OVER (ORDER BY id) as lag
FROM (SELECT cast(null as int) as id UNION ALL select cast(null as int) as id)
tmp
{code}
The current master returns
{code}
+----------+---+
|row_number|lag|
+----------+---+
|         1|321|
|         2|321|
+----------+---+
{code}
However, the correct result is
{code}
 row_number | lag
------------+-----
          1 | null
          2 | 321
{code}
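To make the expected semantics concrete, here is a minimal sketch in plain Python (not Spark code). The function names `lead` and `lead_buggy` are made up for illustration, and `lead_buggy` is only a guess at how a null value could be conflated with a missing row; it is not taken from the OffsetWindowFunctionFrame source.

```python
from typing import List, Optional


def lead(values: List[Optional[int]], offset: int,
         default: Optional[int]) -> List[Optional[int]]:
    """Model of lead() over one partition that is already ordered."""
    result = []
    for i in range(len(values)):
        j = i + offset
        if 0 <= j < len(values):
            # The offset row exists: return its value, even when that
            # value is None (SQL NULL).
            result.append(values[j])
        else:
            # The offset row does not exist: only here does the default apply.
            result.append(default)
    return result


def lead_buggy(values: List[Optional[int]], offset: int,
               default: Optional[int]) -> List[Optional[int]]:
    """Hypothetical faulty variant: a NULL value is treated like a missing row."""
    result = []
    for i in range(len(values)):
        j = i + offset
        value = values[j] if 0 <= j < len(values) else None
        # The information "row exists" is lost here, so a NULL value
        # also falls back to the default.
        result.append(default if value is None else value)
    return result


rows = [None, None]  # the two NULL ids from the query above
print(lead(rows, 1, 321))        # [None, 321]
print(lead_buggy(rows, 1, 321))  # [321, 321]
```

The first call matches the result described as correct above, and the second matches what current master returns. Under this reading, the fix would need to decide "use the default" from the row's position in the partition rather than from the value it holds.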
> lag/lead does not return the default value when the offset row does not exist
> -----------------------------------------------------------------------------
>
> Key: SPARK-16633
> URL: https://issues.apache.org/jira/browse/SPARK-16633
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.0.0
> Reporter: Yin Huai
> Priority: Critical
> Attachments: window_function_bug.html
>
>
> Please see the attached notebook. It seems lag/lead somehow fail to recognize
> that an offset row does not exist and generate wrong results.