westonpace commented on code in PR #34912:
URL: https://github.com/apache/arrow/pull/34912#discussion_r1177143619


##########
cpp/src/arrow/acero/hash_aggregate_test.cc:
##########
@@ -4206,6 +4206,235 @@ TEST_P(GroupBy, MinMaxWithNewGroupsInChunkedArray) {
                     /*verbose=*/true);
 }
 
+TEST_P(GroupBy, FirstLastBasicTypes) {
+  std::vector<std::shared_ptr<DataType>> types;
+  types.insert(types.end(), boolean());
+  types.insert(types.end(), NumericTypes().begin(), NumericTypes().end());
+  types.insert(types.end(), TemporalTypes().begin(), TemporalTypes().end());
+
+  const std::vector<std::string> default_table = {R"([
+    [1,    1],
+    [null, 1]
+])",
+                                                  R"([
+    [0,    2],
+    [null, 3],
+    [3,    4],
+    [5,    4],
+    [4,    null],
+    [3,    1],
+    [0,    2]
+])",
+                                                  R"([
+    [0,    2],
+    [1,    null],
+    [null, 3]
+])"};
+
+  const std::string default_expected =
+      R"([
+    [1,    1,    3,    null,   null],

Review Comment:
   Ah, perhaps it is motivated by this comment:
   
   ```
   /// \brief Control general scalar aggregate kernel behavior
   ///
   /// By default, null values are ignored (skip_nulls = true).
   class ARROW_EXPORT ScalarAggregateOptions : public FunctionOptions {
    public:
     explicit ScalarAggregateOptions(bool skip_nulls = true, uint32_t min_count 
= 1);
     static constexpr char const kTypeName[] = "ScalarAggregateOptions";
     static ScalarAggregateOptions Defaults() { return 
ScalarAggregateOptions{}; }
   
     /// If true (the default), null values are ignored. Otherwise, if any 
value is null,
     /// emit null.
     bool skip_nulls;
     /// If less than this many non-null values are observed, emit null.
     uint32_t min_count;
   };
   ```
   
   @lidavidm I think you were the author of that comment.  How would you 
imagine `skip_nulls` be interpreted here?  This is a kernel that grabs the 
first/last value of the array.  My interpretation of `skip_nulls=false` was 
"grab the first/last item, whether it is null or not" which is different then 
the one you've outlined here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to