mustafasrepo commented on code in PR #4924:
URL: https://github.com/apache/arrow-datafusion/pull/4924#discussion_r1073201772
##########
datafusion/physical-expr/src/aggregate/count.rs:
##########
@@ -202,7 +202,9 @@ impl RowAccumulator for CountRowAccumulator {
}
fn evaluate(&self, accessor: &RowAccessor) -> Result<ScalarValue> {
- Ok(accessor.get_as_scalar(&DataType::Int64, self.state_index))
+ Ok(ScalarValue::Int64(Some(
Review Comment:
If the group, where `CountRowAccumulator` is working is empty. Without this
change it produces `NULL`, However, it should produce `0`. I cannot produce an
example for this case. Since row_accumulator only works when query contains
`GROUP BY`, and `GROUP BY` cannot produce empty group. However, as an example
``` sql
SELECT count(*) as cnt
FROM aggregate_test_100
WHERE 1 = 0;
```
will produce (where `CountAccumulator` is used)
```sql
"+-----+",
"| cnt |",
"+-----+",
"| 0 |",
"+-----+",
```
. However, if it were to work with `CountRowAccumulator`. It would produce
```sql
"+-----+",
"| cnt |",
"+-----+",
"| |",
"+-----+",
```
which is wrong. This change fixes this bug.
In summary, there is no way to reproduce this error (As far as I know) in
current implementation, since `row_accumulator` is used only when query
contains `GROUP BY` clause. However, if we were to use `row_accumulator` in
non-group cases we can encounter this issue (I recognized this behavior when
experimenting `RowAccumulator` support for non-group by aggregation). By the
way, below query would
``` sql
SELECT count(*)
FROM aggregate_test_100
WHERE 1 = 0
GROUP BY ();
```
reproduce the issue. However, it is not supported in datafusion currently.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]