[ https://issues.apache.org/jira/browse/SPARK-20346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970612#comment-15970612 ]
Xiao Li commented on SPARK-20346:
---------------------------------
{quote}
The result of the COUNT and COUNT_BIG functions cannot be the null value. As
specified in the description of AVG, MAX, MIN, STDDEV, SUM, and VARIANCE, the
result is the null value when the function is applied to an empty set. However,
the result is also the null value when the function is specified in an outer
select list, the argument is given by an arithmetic expression, and any
evaluation of the expression causes an arithmetic exception (such as division
by zero).
{quote}
The quote above is taken from the Db2 for z/OS SQL reference:
https://www.ibm.com/support/knowledgecenter/en/SSEPEK_11.0.0/sqlref/src/tpc/db2z_aggregatefunctionsintro.html
Returning null for SUM over an empty set is common behavior across SQL systems.
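
For callers who would rather get 0 than null, the snippet below is a minimal sketch of the semantics described in the quote, using the same empty Dataset as the report. The local SparkSession setup, the app name, and the column aliases (cnt, total, total_or_zero) are illustrative assumptions and not part of the issue; in spark-shell the existing `spark` value can be used instead.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{coalesce, count, lit, sum}

// Assumption: a throwaway local session; spark-shell already provides `spark`.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("empty-agg-sketch")
  .getOrCreate()

// Aggregate over an empty Dataset, as in the report.
val row = spark.range(0)
  .agg(
    count("id").as("cnt"),                            // COUNT is never null
    sum("id").as("total"),                            // SUM over an empty set is null
    coalesce(sum("id"), lit(0L)).as("total_or_zero")) // substitute 0 for the null
  .head()

val cnt = row.getLong(0)
val totalIsNull = row.isNullAt(1)
val totalOrZero = row.getLong(2)

println(s"cnt=$cnt totalIsNull=$totalIsNull totalOrZero=$totalOrZero")
```

Wrapping the aggregate in coalesce keeps the null-on-empty semantics of sum itself untouched and makes the substitution explicit at the call site.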
Thanks!
> sum aggregate over empty Dataset gives null
> -------------------------------------------
>
> Key: SPARK-20346
> URL: https://issues.apache.org/jira/browse/SPARK-20346
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.2.0
> Reporter: Jacek Laskowski
> Priority: Minor
>
> {code}
> scala> spark.range(0).agg(sum("id")).show
> +-------+
> |sum(id)|
> +-------+
> |   null|
> +-------+
> scala> spark.range(0).agg(sum("id")).printSchema
> root
> |-- sum(id): long (nullable = true)
> {code}