[jira] [Commented] (SPARK-17423) Support IGNORE NULLS option in Window functions
[ https://issues.apache.org/jira/browse/SPARK-17423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16133095#comment-16133095 ] Gaurav Shah commented on SPARK-17423: - is there a way to do second non-null value ? changing the window to {code:sql} ROWS BETWEEN UNBOUNDED PRECEDING AND 2 PRECEDING {code} would not help > Support IGNORE NULLS option in Window functions > --- > > Key: SPARK-17423 > URL: https://issues.apache.org/jira/browse/SPARK-17423 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 2.0.0 >Reporter: Tim Chan >Priority: Minor > > http://stackoverflow.com/questions/24338119/is-it-possible-to-ignore-null-values-when-using-lag-and-lead-functions-in-sq -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-17423) Support IGNORE NULLS option in Window functions
[ https://issues.apache.org/jira/browse/SPARK-17423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15472394#comment-15472394 ] Tim Chan commented on SPARK-17423: -- [~hvanhovell] I was able to rewrite this Redshift fragment: {code:sql} DATEDIFF(day, LAG(CASE WHEN SUM(activities.activity_one, activities.activity_two) > 0 THEN activities.date END) IGNORE NULLS OVER (PARTITION BY activities.user_id ORDER BY activities.date), activities.date ) AS days_since_last_activity {code} as this Spark SQL fragment: {code:sql} DATEDIFF(activities.date, LAST(CASE WHEN SUM(activities.activity_one, activities.activity_two) > 0 THEN activities.date END, true) OVER (PARTITION BY activities.user_id ORDER BY activities.date ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)) AS days_since_last_activity {code} Thanks for pointing me in the right direction. > Support IGNORE NULLS option in Window functions > --- > > Key: SPARK-17423 > URL: https://issues.apache.org/jira/browse/SPARK-17423 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 2.0.0 >Reporter: Tim Chan >Priority: Minor > > http://stackoverflow.com/questions/24338119/is-it-possible-to-ignore-null-values-when-using-lag-and-lead-functions-in-sq -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-17423) Support IGNORE NULLS option in Window functions
[ https://issues.apache.org/jira/browse/SPARK-17423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468973#comment-15468973 ] Herman van Hovell commented on SPARK-17423: --- IMO LEAD and LAG with ignore nulls seem a bit dodgy to me. We support `FIRST` and `LAST` with ignore nulls. You could use this in combination with a window frame to create the same functionality. For instance, the following query would find the first non null value within the leading 5 rows: {noformat} SELECT *, FIRST(value, true) OVER (PARTITION BY grp ORDER BY date ROWS BETWEEN 5 PRECEDING AND CURRENT ROW) FROM tbl {noformat} > Support IGNORE NULLS option in Window functions > --- > > Key: SPARK-17423 > URL: https://issues.apache.org/jira/browse/SPARK-17423 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 2.0.0 >Reporter: Tim Chan >Priority: Minor > > http://stackoverflow.com/questions/24338119/is-it-possible-to-ignore-null-values-when-using-lag-and-lead-functions-in-sq -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org