Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-155630541
@JihongMA Could you merge the current master? There are some merge
conflicts.
For `NaN` vs. `null`, we had some discussion in
https://issues.apache.org/jira/browse/SPARK-9079. The design is to return `NaN`
if there exist `NaN` values in the aggregation. I think we should return `NaN`
here, which is consistent with R and Python:
~~~R
> mean(c())
[1] NA
> var(c(1))
[1] NA
~~~
~~~python
> np.mean([])
Out[1] = nan
> np.var([1], ddof=1)
Out[2] = nan
~~~
@marmbrus I think we can move the implementation from imperative to
declarative in 1.7. This PR is to re-use the `CentralMomentAgg` for `stddev`.
It removes 70 lines of code, which is a good sign:)
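
To make the idea concrete, here is a minimal Python sketch of deriving `stddev` from a shared central-moment accumulator while keeping the `NaN` behavior described above. It is not the Spark `CentralMomentAgg` code; the class name `CentralMoments`, its methods, and the `ddof` parameter (borrowed from NumPy's convention) are all assumptions made for illustration.

```python
import math


class CentralMoments:
    """Hypothetical accumulator: count, mean, and second central moment.

    Uses Welford's online update; not taken from the Spark code base.
    """

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        # A NaN input makes delta NaN, which then propagates to mean and m2.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        return self

    def merge(self, other):
        # Combine two partial aggregates (e.g. from two partitions).
        if other.n == 0:
            return self
        if self.n == 0:
            self.n, self.mean, self.m2 = other.n, other.mean, other.m2
            return self
        n = self.n + other.n
        delta = other.mean - self.mean
        self.m2 += other.m2 + delta * delta * self.n * other.n / n
        self.mean += delta * other.n / n
        self.n = n
        return self

    def average(self):
        # Empty input yields NaN rather than null/None.
        return self.mean if self.n > 0 else math.nan

    def variance(self, ddof=1):
        # Too few values for the requested degrees of freedom yields NaN.
        denom = self.n - ddof
        return self.m2 / denom if denom > 0 else math.nan

    def stddev(self, ddof=1):
        # stddev needs no separate aggregation logic; sqrt(NaN) stays NaN.
        return math.sqrt(self.variance(ddof))


agg = CentralMoments()
for value in [1.0, 2.0, 3.0, 4.0]:
    agg.update(value)
print(agg.variance(), agg.stddev())
print(CentralMoments().update(1.0).variance())  # single value, ddof=1
```

Because `stddev` is only a square root over the shared state, a dedicated aggregate for it adds nothing, which is roughly why reusing one accumulator can shrink the code.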