icexelloss commented on code in PR #34912:
URL: https://github.com/apache/arrow/pull/34912#discussion_r1177155228


##########
cpp/src/arrow/acero/hash_aggregate_test.cc:
##########
@@ -4206,6 +4206,235 @@ TEST_P(GroupBy, MinMaxWithNewGroupsInChunkedArray) {
                     /*verbose=*/true);
 }
 
+TEST_P(GroupBy, FirstLastBasicTypes) {
+  std::vector<std::shared_ptr<DataType>> types;
+  types.insert(types.end(), boolean());
+  types.insert(types.end(), NumericTypes().begin(), NumericTypes().end());
+  types.insert(types.end(), TemporalTypes().begin(), TemporalTypes().end());
+
+  const std::vector<std::string> default_table = {R"([
+    [1,    1],
+    [null, 1]
+])",
+                                                  R"([
+    [0,    2],
+    [null, 3],
+    [3,    4],
+    [5,    4],
+    [4,    null],
+    [3,    1],
+    [0,    2]
+])",
+                                                  R"([
+    [0,    2],
+    [1,    null],
+    [null, 3]
+])"};
+
+  const std::string default_expected =
+      R"([
+    [1,    1,    3,    null,   null],

Review Comment:
   Good point. Now that I think about it, the behavior of `skip_nulls=false` that you 
described makes more sense. I will update this.
   
   Edit: I have thought about this more. First/Last is the first aggregator that 
doesn't need all values to produce a meaningful result, so the original 
comment doesn't really apply, IMO. In practice, a user is unlikely to want 
`first([1, null, 2])` to return null; even with `skip_nulls=false`, I think that 
would be unexpected/surprising behavior. So @westonpace, I agree with you here.
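
   To make the options concrete, here is a minimal standalone sketch of one reading of the semantics being weighed. It is not the PR's kernel code and uses no Arrow APIs. The names `Value`, `FirstSkipNulls`, `FirstKeepNulls`, and `FirstNullIfAnyNull` are hypothetical, and `std::nullopt` stands in for a null slot.

```cpp
#include <optional>
#include <vector>

// Hypothetical illustration only; none of these names exist in Arrow.
using Value = std::optional<int>;  // std::nullopt models a null slot

// skip_nulls=true: return the first non-null value, or null if there is none.
Value FirstSkipNulls(const std::vector<Value>& group) {
  for (const Value& v : group) {
    if (v.has_value()) return v;
  }
  return std::nullopt;
}

// skip_nulls=false (assumed reading of the behavior favored above):
// return whatever sits in the first slot, null or not.
Value FirstKeepNulls(const std::vector<Value>& group) {
  if (group.empty()) return std::nullopt;
  return group.front();
}

// The alternative argued against above: any null in the group
// makes the whole result null, as it would for an aggregator like sum.
Value FirstNullIfAnyNull(const std::vector<Value>& group) {
  for (const Value& v : group) {
    if (!v.has_value()) return std::nullopt;
  }
  if (group.empty()) return std::nullopt;
  return group.front();
}
```

   For the group `[1, null, 2]`, `FirstSkipNulls` and `FirstKeepNulls` both yield 1, while `FirstNullIfAnyNull` yields null. The first two differ only when the leading slot is null: for `[null, 1]`, `FirstSkipNulls` yields 1 and `FirstKeepNulls` yields null.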



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

