mustafasrepo commented on code in PR #6904:
URL: https://github.com/apache/arrow-datafusion/pull/6904#discussion_r1259891537


##########
datafusion/physical-expr/src/aggregate/groups_accumulator/accumulate.rs:
##########
@@ -0,0 +1,1042 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+//! [`GroupsAccumulator`] helpers: [`NullState`] and [`accumulate_indices`]
+//!
+//! [`GroupsAccumulator`]: crate::GroupsAccumulator
+
+use arrow::datatypes::ArrowPrimitiveType;
+use arrow_array::{Array, BooleanArray, PrimitiveArray};
+use arrow_buffer::{BooleanBufferBuilder, NullBuffer};
+
+/// Track the accumulator null state per row: if any values for that
+/// group were null and if any values have been seen at all for that group.
+///
+/// This is part of the inner loop for many [`GroupsAccumulator`]s,
+/// and thus the performance is critical and so there are multiple
+/// specialized implementations, invoked depending on the specific
+/// combinations of the input.
+///
+/// Typically there are 4 potential combinations of inputs that must be
+/// special cased for performance:
+///
+/// * With / Without filter
+/// * With / Without nulls in the input
+///
+/// If the input has nulls, then the accumulator must potentially
+/// handle each input null value specially (e.g. for `SUM` to mark the
+/// corresponding sum as null)
+///
+/// If there are filters present, `NullState` tracks if it has seen
+/// *any* value for that group (as some values may be filtered
+/// out). Without a filter, the accumulator is only passed groups that
+/// had at least one value to accumulate so they do not need to track
+/// if they have seen values for a particular group.
+///
+/// [`GroupsAccumulator`]: crate::GroupsAccumulator
+#[derive(Debug)]
+pub struct NullState {
+    /// Tracks if a null input value has been seen for `group_index`,
+    /// if there were any nulls in the input.
+    ///
+    /// If `null_inputs[i]` is true, have not seen any null values for
+    /// group `i`, or have not seen any vaues

Review Comment:
   ```suggestion
       /// group `i`, or have not seen any values
   ```
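
For context on what the doc comment above describes, here is a minimal sketch of per-group "seen values" tracking with optional nulls and an optional filter. It is not the code in this PR: `SeenValues`, `accumulate`, and `sum_by_group` are hypothetical names, and it uses plain `Vec<bool>` and `Option<i64>` instead of the arrow buffers and the four specialized loops the real `NullState` is described as having.

```rust
/// Hypothetical, simplified stand-in for the idea behind `NullState`:
/// one flag per group recording whether any non-null, unfiltered value
/// has been seen for that group.
#[derive(Debug, Default)]
struct SeenValues {
    seen: Vec<bool>,
}

impl SeenValues {
    /// Feed one batch of rows. `values[i]` is `None` for a null input;
    /// `filter[i]` (when a filter is present) is `false` for a row that
    /// must be skipped. `value_fn` is called for every row that counts.
    fn accumulate(
        &mut self,
        group_indices: &[usize],
        values: &[Option<i64>],
        filter: Option<&[bool]>,
        total_num_groups: usize,
        mut value_fn: impl FnMut(usize, i64),
    ) {
        assert_eq!(group_indices.len(), values.len());
        // Grow the per-group flags so every group index is addressable.
        if self.seen.len() < total_num_groups {
            self.seen.resize(total_num_groups, false);
        }
        for (row, (&group, value)) in group_indices.iter().zip(values).enumerate() {
            // Without a filter every row passes.
            let passes = filter.map_or(true, |f| f[row]);
            if !passes {
                continue;
            }
            // Null inputs neither update the aggregate nor mark the group.
            if let Some(v) = value {
                self.seen[group] = true;
                value_fn(group, *v);
            }
        }
    }
}

/// Example: a SUM over three groups with one null and one filtered row.
fn sum_by_group() -> (Vec<i64>, Vec<bool>) {
    let mut state = SeenValues::default();
    let mut sums = vec![0i64; 3];
    let filter = [true, true, true, false];
    state.accumulate(
        &[0, 1, 1, 2],
        &[Some(1), None, Some(5), Some(7)],
        Some(&filter[..]),
        3,
        |group, v| sums[group] += v,
    );
    (sums, state.seen)
}
```

In this sketch group 2 ends with its flag unset because its only row is filtered out, so a `SUM` built on it could emit null for that group rather than `0`.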


