icexelloss commented on code in PR #34912:
URL: https://github.com/apache/arrow/pull/34912#discussion_r1177155228


##########
cpp/src/arrow/acero/hash_aggregate_test.cc:
##########
@@ -4206,6 +4206,235 @@ TEST_P(GroupBy, MinMaxWithNewGroupsInChunkedArray) {
                     /*verbose=*/true);
 }
 
+TEST_P(GroupBy, FirstLastBasicTypes) {
+  std::vector<std::shared_ptr<DataType>> types;
+  types.insert(types.end(), boolean());
+  types.insert(types.end(), NumericTypes().begin(), NumericTypes().end());
+  types.insert(types.end(), TemporalTypes().begin(), TemporalTypes().end());
+
+  const std::vector<std::string> default_table = {R"([
+    [1,    1],
+    [null, 1]
+])",
+                                                  R"([
+    [0,    2],
+    [null, 3],
+    [3,    4],
+    [5,    4],
+    [4,    null],
+    [3,    1],
+    [0,    2]
+])",
+                                                  R"([
+    [0,    2],
+    [1,    null],
+    [null, 3]
+])"};
+
+  const std::string default_expected =
+      R"([
+    [1,    1,    3,    null,   null],

Review Comment:
   Good point. Now that I think about it, the behavior of `skip_nulls=false` you 
described makes more sense. I will update this.
   
   Edit: On further thought, First/Last is the first aggregator that doesn't need 
all values to produce a meaningful result, so the original comment doesn't 
really apply, IMO. In practice, a user is unlikely to want `first([1, null, 
2])` to return null; even with `skip_nulls=false`, I think that would be 
unexpected/surprising behavior. So @westonpace, I agree with you here.
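
   To make the two candidate behaviors concrete, here is a minimal standalone sketch. It is not the kernel code in this PR: `Value`, `FirstPositional`, and `FirstNullPoisons` are hypothetical names, and `std::optional<int>` stands in for a nullable column value. It only illustrates how `first([1, null, 2])` differs under the two readings of `skip_nulls=false`.

```cpp
#include <optional>
#include <vector>

// Hypothetical stand-in for a nullable value; std::nullopt plays the role of null.
using Value = std::optional<int>;

// Reading A: with skip_nulls=false, return whatever is in the first slot,
// null or not. With skip_nulls=true, return the first non-null value.
Value FirstPositional(const std::vector<Value>& values, bool skip_nulls) {
  for (const Value& v : values) {
    if (v.has_value() || !skip_nulls) return v;
  }
  return std::nullopt;  // empty input, or all nulls with skip_nulls=true
}

// Reading B: with skip_nulls=false, any null anywhere in the input makes the
// result null (the way an aggregator that needs every value would behave).
Value FirstNullPoisons(const std::vector<Value>& values, bool skip_nulls) {
  Value first;
  for (const Value& v : values) {
    if (!v.has_value()) {
      if (!skip_nulls) return std::nullopt;
      continue;
    }
    if (!first.has_value()) first = v;
  }
  return first;
}
```

   Under reading A, `first([1, null, 2])` with `skip_nulls=false` is 1 and only `first([null, 1])` is null. Under reading B, the same `[1, null, 2]` input yields null, which is the result I'd consider surprising.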



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.