Yin Huai created SPARK-16721:
--------------------------------
Summary: Lead/lag needs to respect nulls
Key: SPARK-16721
URL: https://issues.apache.org/jira/browse/SPARK-16721
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 2.0.0
Reporter: Yin Huai
It seems that 2.0.0 changed the behavior of lead and lag to ignore nulls. This PR
restores the 1.6 behavior, which respects nulls.
For example:
{code}
SELECT
  b,
  lag(a, 1, 321) OVER (ORDER BY b) AS lag,
  lead(a, 1, 321) OVER (ORDER BY b) AS lead
FROM (SELECT CAST(NULL AS INT) AS a, 1 AS b
      UNION ALL
      SELECT CAST(NULL AS INT) AS id, 2 AS b) tmp
{code}
This query should return
{code}
+---+----+----+
|  b| lag|lead|
+---+----+----+
|  1| 321|null|
|  2|null| 321|
+---+----+----+
{code}
instead of
{code}
+---+---+----+
|  b|lag|lead|
+---+---+----+
|  1|321| 321|
|  2|321| 321|
+---+---+----+
{code}
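To make the difference between the two behaviors concrete, here is a minimal pure-Python sketch of the semantics. It is only a model, not Spark's implementation: the function names and the `ignore_nulls` flag are hypothetical, rows are assumed to be already sorted by b, `None` stands in for SQL null, and `offset` is assumed to be at least 1.

```python
def lag(values, offset=1, default=None, ignore_nulls=False):
    """For each row, return the value `offset` rows earlier, or `default`.

    Hypothetical model: `values` is the column `a` already ordered by `b`.
    """
    out = []
    for i in range(len(values)):
        earlier = values[:i]
        if ignore_nulls:
            # Ignoring nulls: only non-null earlier rows count toward the offset.
            earlier = [v for v in earlier if v is not None]
        # Respecting nulls: a null row is a real row, so its null is returned
        # as-is; the default applies only when no such row exists.
        out.append(earlier[-offset] if len(earlier) >= offset else default)
    return out


def lead(values, offset=1, default=None, ignore_nulls=False):
    """lead is lag applied to the reversed ordering."""
    return lag(values[::-1], offset, default, ignore_nulls)[::-1]


a = [None, None]  # column `a` for b = 1, 2

# Respecting nulls (the 1.6 behavior this issue asks for)
print(lag(a, 1, 321), lead(a, 1, 321))    # [321, None] [None, 321]

# Ignoring nulls (the behavior reported for 2.0.0)
print(lag(a, 1, 321, ignore_nulls=True),
      lead(a, 1, 321, ignore_nulls=True))  # [321, 321] [321, 321]
```

In this model the default value 321 is used only when the offset falls outside the partition; a null inside the partition is returned unchanged when nulls are respected.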