[ 
https://issues.apache.org/jira/browse/SPARK-16633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15385457#comment-15385457
 ] 

Yin Huai commented on SPARK-16633:
----------------------------------

It seems OffsetWindowFunctionFrame cannot distinguish between the case where the 
offset row does not exist and the case where the offset row's value is null.

For example, 
{code}
SELECT
  row_number() OVER (ORDER BY id) as row_number,
  lead(id, 1, 321) OVER (ORDER BY id) as lag
FROM (SELECT cast(null as int) as id UNION ALL select cast(null as int) as id) tmp
{code}
The current master returns
{code}
+----------+---+
|row_number|lag|
+----------+---+
|         1|321|
|         2|321|
+----------+---+
{code}

However, the correct result is
{code}
 row_number | lag
------------+------
          1 | null
          2 |  321
{code}
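
To make the expected semantics concrete, here is a minimal sketch in plain Python (not Spark code). The function names lead and lead_conflating_null are hypothetical and only for illustration. The second function shows one way the wrong result could arise; it is an assumption about the symptom, not a description of what OffsetWindowFunctionFrame actually does internally.

```python
from typing import List, Optional


def lead(values: List[Optional[int]], offset: int,
         default: Optional[int]) -> List[Optional[int]]:
    """Expected behaviour of lead(value, offset, default) over one ordered partition."""
    result = []
    for i in range(len(values)):
        j = i + offset
        if 0 <= j < len(values):
            # The offset row exists: return its value, even if that value is null (None).
            result.append(values[j])
        else:
            # The offset row is outside the partition: only now apply the default.
            result.append(default)
    return result


def lead_conflating_null(values: List[Optional[int]], offset: int,
                         default: Optional[int]) -> List[Optional[int]]:
    """Hypothetical variant that treats a null value like a missing row."""
    expected = lead(values, offset, None)
    # Replacing every None with the default loses the distinction between
    # "row does not exist" and "row exists but its value is null".
    return [default if v is None else v for v in expected]


ids = [None, None]  # the two null ids from the query above
print(lead(ids, 1, 321))                  # [None, 321]
print(lead_conflating_null(ids, 1, 321))  # [321, 321]
```

The first call matches the correct result above, and the second reproduces the output seen on current master. A negative offset gives the lag behaviour under the same rule.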

> lag/lead does not return the default value when the offset row does not exist
> -----------------------------------------------------------------------------
>
>                 Key: SPARK-16633
>                 URL: https://issues.apache.org/jira/browse/SPARK-16633
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.0.0
>            Reporter: Yin Huai
>            Priority: Critical
>         Attachments: window_function_bug.html
>
>
> Please see the attached notebook. It seems lag/lead somehow fail to recognize 
> that an offset row does not exist and generate wrong results.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
