https://issues.apache.org/jira/browse/SPARK-21037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16045479#comment-16045479

Stanislav Chernichkin commented on SPARK-21037:
-----------------------------------------------

To be more precise, the problem is not related to the ignoreNulls option 
itself. It arises when orderBy is used without explicit window boundaries. In 
that case Spark sets the frame to UNBOUNDED PRECEDING - CURRENT ROW, and all 
aggregate functions behave accordingly. The problem does not occur when 
orderBy is omitted. This behavior is undocumented and unintuitive: popular 
databases do not require explicit window boundaries to apply an aggregate 
function to the whole group (it is applied to the whole group by default), and 
they do not change the default frame depending on the presence of ordering.
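As a sketch of the workaround implied above: widening the frame explicitly makes {{first(..., ignoreNulls = true)}} scan the whole partition. This assumes Spark 2.1+, where {{Window.unboundedPreceding}} and {{Window.unboundedFollowing}} are available; the order column is renamed to {{ord}} here to sidestep any reserved-keyword ambiguity.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.first

object FirstValueWorkaround {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("SPARK-21037").getOrCreate()
    import spark.implicits._

    val df = Seq((0, null: String, 0), (0, "value", 1)).toDF("key", "value", "ord")

    // With orderBy but no explicit frame, the default is
    // UNBOUNDED PRECEDING .. CURRENT ROW, so the first row sees only itself
    // and first(ignoreNulls = true) returns null for it.
    val runningFrame = Window.partitionBy($"key").orderBy($"ord")

    // Explicit whole-partition frame: first(ignoreNulls = true) now scans
    // every row of the partition, regardless of ordering.
    val fullFrame = runningFrame.rowsBetween(Window.unboundedPreceding, Window.unboundedFollowing)

    df.select($"*", first($"value", ignoreNulls = true).over(fullFrame).as("first_value"))
      .show()

    spark.stop()
  }
}
```

With the explicit frame, both rows should report first_value = value, matching the expected table in the issue below.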

> ignoreNulls does not work properly with window functions
> -----------------------------------------------------------
>
>                 Key: SPARK-21037
>                 URL: https://issues.apache.org/jira/browse/SPARK-21037
>             Project: Spark
>          Issue Type: Bug
>          Components: Optimizer
>    Affects Versions: 2.1.0, 2.1.1
>            Reporter: Stanislav Chernichkin
>
> The following code reproduces the issue:
> {code}
> import org.apache.spark.sql.expressions.Window.partitionBy
> import org.apache.spark.sql.functions.first
>
> spark
>   .sql("select 0 as key, null as value, 0 as order union select 0 as key, 'value' as value, 1 as order")
>   .select($"*", first($"value", ignoreNulls = true).over(partitionBy($"key").orderBy("order")).as("first_value"))
>   .show()
> {code}
> Since the documentation claims that the {{first}} function returns the first 
> non-null value, I expect to get: 
> |key|value|order|first_value|
> |  0| null|    0|      value|
> |  0|value|    1|      value|
> But the actual result is: 
> |key|value|order|first_value|
> |  0| null|    0|       null|
> |  0|value|    1|      value|



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
