[
https://issues.apache.org/jira/browse/SPARK-11850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15013285#comment-15013285
]
Herman van Hovell commented on SPARK-11850:
-------------------------------------------
[~rxin]/[~yhuai] any thoughts?
> Spark StdDev/Variance defaults are incompatible with Hive
> ---------------------------------------------------------
>
> Key: SPARK-11850
> URL: https://issues.apache.org/jira/browse/SPARK-11850
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 1.6.0
> Reporter: Herman van Hovell
>
> The {{stddev}} and {{variance}} functions currently default to the 'sample'
> version, whereas Hive uses the 'population' version. See:
> * https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-Built-inAggregateFunctions(UDAF)
> * https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala#L192-L196
> Is this on purpose? Or by accident?
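
For reference, the two conventions differ only in the divisor: the 'population' version divides the sum of squared deviations by n, while the 'sample' version divides by n - 1. The following is a minimal sketch in plain Python using the standard library's statistics module. It is not Spark or Hive code, and the input values are made up purely for illustration.

```python
import statistics

# Hypothetical sample data, chosen only to make the arithmetic easy to follow.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# 'Population' versions: divide the sum of squared deviations by n.
pop_var = statistics.pvariance(values)
pop_std = statistics.pstdev(values)

# 'Sample' versions: divide by n - 1 (Bessel's correction).
samp_var = statistics.variance(values)
samp_std = statistics.stdev(values)

print("population variance:", pop_var, "stddev:", pop_std)
print("sample variance:", samp_var, "stddev:", samp_std)
```

On the same input the sample results are always somewhat larger than the population results, so a query that relies on the default {{stddev}} or {{variance}} can return different numbers depending on which convention the engine picks.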
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)