[ 
https://issues.apache.org/jira/browse/BEAM-12647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17385180#comment-17385180
 ] 

Kyle Weaver commented on BEAM-12647:
------------------------------------

It seems no results are being produced because aggregate functions are 
implemented using KV pairs. All values are assigned a key, then a GroupByKey 
(GBK) is followed by 
[Combine.GroupedValues|https://beam.apache.org/releases/javadoc/2.31.0/org/apache/beam/sdk/transforms/Combine.GroupedValues.html].

Beam always treats key/values as a pair. There can be values without keys, but 
there's no concept of a key with no values. So Combine.GroupedValues on an 
empty PCollection correctly returns nothing.
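
The behavior described above can be modeled without Beam at all. The following is a minimal plain-Java sketch, not Beam code: the class name and `countPerKey` are hypothetical, and a `Map` stands in for the grouped output of a GBK. Because output is produced once per key, an empty input has no keys and therefore yields no output.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of "assign a key, group by key, combine each group".
// Illustration only; this does not use the Beam SDK.
class KeyedCombineSketch {

  // Hypothetical helper: every row gets the same constant key (here ""),
  // mimicking the empty-Row key used when GROUP BY is omitted.
  static Map<String, Long> countPerKey(List<String> rows) {
    // Key every row, then group the values by key.
    Map<String, List<String>> grouped = new HashMap<>();
    for (String row : rows) {
      grouped.computeIfAbsent("", k -> new ArrayList<>()).add(row);
    }
    // Combine each group: one output per key that actually exists.
    Map<String, Long> counts = new HashMap<>();
    for (Map.Entry<String, List<String>> e : grouped.entrySet()) {
      counts.put(e.getKey(), (long) e.getValue().size());
    }
    return counts;
  }

  public static void main(String[] args) {
    System.out.println(countPerKey(List.of("a", "b", "c"))); // one entry, count 3
    System.out.println(countPerKey(List.of()));              // no entries at all
  }
}
```

With no rows there is never a key to group under, so the per-key combine step has nothing to run on.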

The question is, why do we always use Combine.GroupedValues even when we aren't 
using GROUP BY? Currently, if GROUP BY is omitted, we assign the same key (K = 
the empty Row) to all elements, which, if I understand correctly, would also be 
bad for performance. So I think the fix here is to use Combine.Globally when 
there's no GROUP BY.
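
As a rough sketch of why a global combine would behave differently: it starts from the combiner's empty accumulator and does not depend on any key existing, so an empty input can still yield one result. The plain-Java model below is an illustration under that assumption, not Beam code; the class name and `countGlobally` are hypothetical. In Beam itself, Combine.Globally is documented to emit a default value for empty input in the global window unless withoutDefaults() is used.

```java
import java.util.List;

// Plain-Java model of a global (un-keyed) combine.
// Illustration only; this does not use the Beam SDK.
class GlobalCombineSketch {

  // Hypothetical helper: fold all rows into one accumulator that starts at
  // the identity for COUNT (0). No key is involved, so an empty input still
  // produces exactly one result.
  static long countGlobally(List<String> rows) {
    long accumulator = 0L; // the combiner's "empty accumulator"
    for (String ignored : rows) {
      accumulator += 1;
    }
    return accumulator;
  }

  public static void main(String[] args) {
    System.out.println(countGlobally(List.of("a", "b", "c"))); // prints 3
    System.out.println(countGlobally(List.of()));              // prints 0
  }
}
```

This matches the expected result in the issue, where COUNT(*) over an empty table should be 0 rather than absent.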

> Aggregations on empty pcoll don't return a value.
> -------------------------------------------------
>
>                 Key: BEAM-12647
>                 URL: https://issues.apache.org/jira/browse/BEAM-12647
>             Project: Beam
>          Issue Type: Bug
>          Components: dsl-sql, dsl-sql-zetasql
>            Reporter: Kyle Weaver
>            Priority: P3
>
> {{For example, "SELECT COUNT(\*) FROM table_empty" should return 0, but 
> instead it returns no value.}}
> cc [~benglez] [~apilloud]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
