[ https://issues.apache.org/jira/browse/SPARK-9079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Reynold Xin resolved SPARK-9079.
--------------------------------
       Resolution: Fixed
         Assignee: Michael Armbrust
    Fix Version/s: 1.5.0

> Design NaN semantics
> --------------------
>
>                 Key: SPARK-9079
>                 URL: https://issues.apache.org/jira/browse/SPARK-9079
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>            Reporter: Reynold Xin
>            Assignee: Michael Armbrust
>             Fix For: 1.5.0
>
>
> 1. What should NaN = NaN return?
> NaN = NaN should return true.
> 2. If NaN appears in a GROUP BY key column, should all NaN values go into 
> one group, or into different groups?
> All NaN values should be grouped together.
> 3. What about NaN in join keys?
> NaN should be treated as a normal value in join keys.
> 4. When aggregating over columns containing NaN, should the result be NaN, or 
> should the result exclude NaN values (treating them like nulls)?
> This is TO BE DECIDED. By default, the behavior is to return NaN.
> 5. Where should NaN go in sorting?
> NaN should go last when in ascending order, larger than any other numeric 
> value.
> Note that item 5 is much more important than the other four, since the 
> sorter currently throws exceptions on NaN values. See SPARK-8797.
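
The equality, grouping, and ordering rules in items 1, 2, and 5 can be
illustrated with a minimal sketch in plain Python. This is not Spark code
and says nothing about how Spark implements these rules; the names
nan_key, nan_eq, group_by, and sort_ascending are hypothetical helpers
introduced only for this illustration. The idea is to map every NaN to a
single canonical key that compares equal to itself and sorts after every
other numeric value, including positive infinity.

```python
import math
from collections import defaultdict


def nan_key(x):
    """Hypothetical helper: canonical comparison key for a float.

    Every NaN maps to the same key, (True, 0.0), and every other value
    maps to (False, x). Tuples compare element by element and
    False < True, so the NaN key sorts after every non-NaN key.
    """
    return (True, 0.0) if math.isnan(x) else (False, x)


def nan_eq(a, b):
    """Item 1: NaN = NaN is true under the proposed semantics."""
    return nan_key(a) == nan_key(b)


def group_by(values):
    """Item 2: all NaN values land in a single group."""
    groups = defaultdict(list)
    for v in values:
        groups[nan_key(v)].append(v)
    return dict(groups)


def sort_ascending(values):
    """Item 5: NaN sorts last in ascending order."""
    return sorted(values, key=nan_key)


if __name__ == "__main__":
    nan = float("nan")
    print(nan == nan)           # plain IEEE 754 comparison
    print(nan_eq(nan, nan))     # comparison under the proposed rule
    print(sort_ascending([nan, 3.0, float("inf"), -1.0]))
```

For item 4, Python's built-in sum already propagates NaN, which matches
the default described above: summing a list that contains NaN yields NaN.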



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
