Yin Huai created SPARK-16721:
--------------------------------

             Summary: Lead/lag needs to respect nulls 
                 Key: SPARK-16721
                 URL: https://issues.apache.org/jira/browse/SPARK-16721
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.0.0
            Reporter: Yin Huai


It seems that 2.0.0 changed the behavior of lead and lag to ignore nulls. This PR
changes the behavior back to that of 1.6, which respects nulls.

For example:
{code}
SELECT
  b,
  lag(a, 1, 321) OVER (ORDER BY b) AS lag,
  lead(a, 1, 321) OVER (ORDER BY b) AS lead
FROM (SELECT CAST(null AS int) AS a, 1 AS b
      UNION ALL
      SELECT CAST(null AS int) AS a, 2 AS b) tmp
{code}
This query should return:
{code}
+---+----+----+
|  b| lag|lead|
+---+----+----+
|  1| 321|null|
|  2|null| 321|
+---+----+----+
{code}
instead of:
{code}
+---+---+----+
|  b|lag|lead|
+---+---+----+
|  1|321| 321|
|  2|321| 321|
+---+---+----+
{code}
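
The difference between the two result tables can be sketched with a small pure-Python model. This is only an illustration, not Spark's implementation: the helper name `shift` and its `ignore_nulls` flag are hypothetical, and treating the 2.0.0 behavior as "a null input is replaced by the default" is an assumption that merely matches the output shown above.

```python
from typing import List, Optional


def shift(values: List[Optional[int]], offset: int, default: int,
          ignore_nulls: bool = False) -> List[Optional[int]]:
    """Hypothetical model of lag (offset < 0) and lead (offset > 0).

    `values` is column `a`, already sorted by the ORDER BY key `b`.
    """
    out: List[Optional[int]] = []
    n = len(values)
    for i in range(n):
        j = i + offset
        if 0 <= j < n:
            v = values[j]
            # Respecting nulls: a null stored in an existing row is returned as is.
            # Assumed 2.0.0 behavior: the null is replaced by the default value.
            if v is None and ignore_nulls:
                v = default
            out.append(v)
        else:
            # No row at that offset: the default applies in both behaviors.
            out.append(default)
    return out


a = [None, None]  # column a for b = 1 and b = 2

# Behavior requested in this issue (1.6 semantics).
lag_respect = shift(a, -1, 321)
lead_respect = shift(a, 1, 321)

# Behavior observed in 2.0.0, under the assumption stated above.
lag_ignore = shift(a, -1, 321, ignore_nulls=True)
lead_ignore = shift(a, 1, 321, ignore_nulls=True)
```

In this model the default is used only when the offset points outside the partition, so a null that actually exists in the neighboring row stays visible in the result.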



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
