[GitHub] spark pull request: SPARK-11420 Updating Stddev support via Impera...

mengxr Tue, 10 Nov 2015 17:57:35 -0800

Github user mengxr commented on the pull request:

    https://github.com/apache/spark/pull/9380#issuecomment-155630541
  
    @JihongMA Could you merge the current master? There are some merge 
conflicts.
    
    For `NaN` vs. `null`, we had some discussion in 
https://issues.apache.org/jira/browse/SPARK-9079. The design is to return `NaN` 
if there exist `NaN` values in the aggregation. I think we should return `NaN` 
here, which is consistent with R and Python:
    
    ~~~R
    > mean(c())
    [1] NA
    > var(c(1))
    [1] NA
    ~~~
    
    ~~~python
    > np.mean([])
    Out[1] = nan
    > np.var([1], ddof=1)
    Out[2] = nan
    ~~~
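
To make the proposed convention concrete, here is a minimal pure-Python sketch. It is only an illustration of the `NaN` semantics discussed above, not Spark's implementation; the helper names `mean_or_nan` and `sample_stddev_or_nan` are hypothetical.

```python
import math


def mean_or_nan(values):
    """Mean of the values; NaN when there is nothing to average."""
    values = list(values)
    if not values:
        return math.nan
    return sum(values) / len(values)


def sample_stddev_or_nan(values):
    """Sample standard deviation (divisor n - 1); NaN when n < 2.

    A NaN anywhere in the input propagates through the arithmetic,
    so the result is NaN in that case as well.
    """
    values = list(values)
    n = len(values)
    if n < 2:
        return math.nan
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values)  # sum of squared deviations
    return math.sqrt(m2 / (n - 1))
```

With this sketch, an empty input, a single-element input, and an input containing `NaN` all yield `NaN` rather than `null`.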
    
    @marmbrus I think we can move the implementation from imperative to 
declarative in 1.7. This PR is to re-use the `CentralMomentAgg` for `stddev`. 
It removes 70 lines of code, which is a good sign:)
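
As a rough illustration of why a central-moment aggregate can back `stddev`, the sketch below keeps a count, a running mean, and the second central moment, merges partial states, and derives the standard deviation from them. This is an assumption-laden sketch in Python, not the actual `CentralMomentAgg` code; `MomentState`, `variance_samp`, and `stddev_samp` are hypothetical names used only here.

```python
import math
from dataclasses import dataclass


@dataclass
class MomentState:
    """Partial aggregation state: count, mean, and second central moment."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0

    def update(self, x):
        # Welford-style single-value update.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def merge(self, other):
        # Combine two partial states (pairwise update formula).
        if other.n == 0:
            return
        n = self.n + other.n
        delta = other.mean - self.mean
        self.m2 += other.m2 + delta * delta * self.n * other.n / n
        self.mean += delta * other.n / n
        self.n = n


def variance_samp(state):
    """Sample variance from the state; NaN when fewer than two values."""
    return state.m2 / (state.n - 1) if state.n > 1 else math.nan


def stddev_samp(state):
    """Sample standard deviation is the square root of the sample variance."""
    return math.sqrt(variance_samp(state))
```

Because the standard deviation is just a function of the second moment, only the final evaluation step differs between the variance and `stddev` aggregates in a design like this one.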



