alamb commented on code in PR #19625:
URL: https://github.com/apache/datafusion/pull/19625#discussion_r2662746848
##########
datafusion/functions-aggregate-common/src/aggregate/groups_accumulator/accumulate.rs:
##########
@@ -59,6 +59,10 @@ pub struct NullState {
/// If `seen_values[i]` is false, have not seen any values that
/// pass the filter yet for group `i`
seen_values: BooleanBufferBuilder,
+ /// If true, all groups seen so far have seen at least one non-null value
Review Comment:
Maybe we could encode this as an state enum so it is clearer how things are
related to `BooleamBufferBuilder`
Something like
```rust
enum SeenValues {
/// All groups seen so far have seen at least one non-null value
All {
num_values: usize,
}
// some groups have not yet seen a non-null value
Some {
values: BooleanBufferBuilder,
}
}
```
##########
datafusion/functions-aggregate/src/count.rs:
##########
@@ -598,7 +598,9 @@ impl GroupsAccumulator for CountGroupsAccumulator {
values.logical_nulls().as_ref(),
opt_filter,
|group_index| {
- self.counts[group_index] += 1;
+ // SAFETY: group_index is guaranteed to be in bounds
+ let count = unsafe {
self.counts.get_unchecked_mut(group_index) };
Review Comment:
I tried this in one of my PRs too trying to get the inner loop faster, but
couldn't seem to get a measurable difference
Maybe we need to add a micro benchmark or something
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]