[ https://issues.apache.org/jira/browse/FLINK-2210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14586272#comment-14586272 ]
ASF GitHub Bot commented on FLINK-2210:
---------------------------------------
GitHub user aljoscha commented on the pull request:
https://github.com/apache/flink/pull/834#issuecomment-112127936
OK, I had an offline discussion with @mxm and @fhueske. We think that
aggregations should handle null values the way SQL does, as described here:
https://en.wikipedia.org/wiki/Null_(SQL)#Aggregate_functions
COUNT(a) should be correct with this PR since it is implemented as
SUM(1), so it also counts those elements where a is null. The only problem is
AVG(), which would currently produce incorrect results because AVG(a) is
generated as SUM(a)/SUM(1). The SUM(1) here needs to be changed to something
that only counts elements where a is not null.
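To make the AVG() problem concrete, here is a minimal standalone sketch in plain Java. It is not Flink Table API code: the class and method names are hypothetical, and it assumes that SUM(a) already skips null values. It compares a denominator that counts every element, as SUM(1) does, with one that counts only non-null elements.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; this is not Flink code.
class NullAwareAvg {

    // SUM(a), assuming null values are skipped.
    static long sum(List<Integer> values) {
        long total = 0;
        for (Integer v : values) {
            if (v != null) {
                total += v;
            }
        }
        return total;
    }

    // SUM(1): counts every element, including nulls.
    static long countAll(List<Integer> values) {
        return values.size();
    }

    // Counts only the elements that are not null.
    static long countNonNull(List<Integer> values) {
        long n = 0;
        for (Integer v : values) {
            if (v != null) {
                n++;
            }
        }
        return n;
    }

    // AVG(a) as SUM(a)/SUM(1): nulls inflate the denominator.
    static double avgCountingNulls(List<Integer> values) {
        return (double) sum(values) / countAll(values);
    }

    // AVG(a) as SUM(a)/(non-null count): nulls are ignored entirely.
    static double avgIgnoringNulls(List<Integer> values) {
        return (double) sum(values) / countNonNull(values);
    }

    public static void main(String[] args) {
        List<Integer> a = Arrays.asList(2, null, 4);
        System.out.println(avgCountingNulls(a)); // 2.0
        System.out.println(avgIgnoringNulls(a)); // 3.0
    }
}
```

In this sketch an input with no non-null values yields NaN. How the real aggregation should treat that case is not settled in this comment.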
Then of course we would need tests for the correct behaviour of these
aggregations in the presence of null values.
What do you think?
> Table API aggregate by ignoring null values
> -------------------------------------------
>
> Key: FLINK-2210
> URL: https://issues.apache.org/jira/browse/FLINK-2210
> Project: Flink
> Issue Type: Bug
> Reporter: Shiti Saxena
> Assignee: Shiti Saxena
> Priority: Minor
>
> Attempting to aggregate on columns which may have null values results in
> a NullPointerException.