icexelloss commented on code in PR #34912:
URL: https://github.com/apache/arrow/pull/34912#discussion_r1177155228
##########
cpp/src/arrow/acero/hash_aggregate_test.cc:
##########
@@ -4206,6 +4206,235 @@ TEST_P(GroupBy, MinMaxWithNewGroupsInChunkedArray) {
/*verbose=*/true);
}
+TEST_P(GroupBy, FirstLastBasicTypes) {
+ std::vector<std::shared_ptr<DataType>> types;
+ types.insert(types.end(), boolean());
+ types.insert(types.end(), NumericTypes().begin(), NumericTypes().end());
+ types.insert(types.end(), TemporalTypes().begin(), TemporalTypes().end());
+
+ const std::vector<std::string> default_table = {R"([
+ [1, 1],
+ [null, 1]
+])",
+ R"([
+ [0, 2],
+ [null, 3],
+ [3, 4],
+ [5, 4],
+ [4, null],
+ [3, 1],
+ [0, 2]
+])",
+ R"([
+ [0, 2],
+ [1, null],
+ [null, 3]
+])"};
+
+ const std::string default_expected =
+ R"([
+ [1, 1, 3, null, null],
Review Comment:
Good point. Now that I think about it, the behavior of `skip_nulls=false` you
described makes more sense. I will update this.
Edit: Having thought about this more, First/Last is the first aggregator that
doesn't need all values to produce a meaningful result, so the original
comment doesn't really apply IMO. In practice, a user is unlikely to want
first([1, null, 2]) to return null, even with `skip_nulls=false`; I think that
would be unexpected/surprising behavior. So @westonpace I agree with you here.
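To make the semantics concrete, here is a minimal standalone sketch of how I read the agreed behavior. It does not use the Arrow API: `Values`, `First`, `Last` and `CheckExamples` are hypothetical names, and nullable values are modeled with `std::optional`. The assumption is that `skip_nulls=false` means "return whatever sits in the first/last position, null or not", while `skip_nulls=true` means "return the first/last non-null value".

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical stand-in for a nullable column; not an Arrow type.
using Values = std::vector<std::optional<int>>;

// First value in the group. With skip_nulls=true, nulls are passed over;
// with skip_nulls=false, the value in the first position is returned as-is
// (assumed semantics, so a leading null yields null).
std::optional<int> First(const Values& values, bool skip_nulls) {
  for (const auto& v : values) {
    if (!skip_nulls || v.has_value()) return v;
  }
  return std::nullopt;  // empty group, or all nulls with skip_nulls=true
}

// Last value in the group, mirroring First but scanning from the end.
std::optional<int> Last(const Values& values, bool skip_nulls) {
  for (auto it = values.rbegin(); it != values.rend(); ++it) {
    if (!skip_nulls || it->has_value()) return *it;
  }
  return std::nullopt;
}

// Small usage example covering the case discussed in the comment.
void CheckExamples() {
  const Values group = {1, std::nullopt, 2};
  // A null in the middle does not turn the result into null.
  assert(First(group, /*skip_nulls=*/false) == 1);
  assert(Last(group, /*skip_nulls=*/false) == 2);
}
```

Under this reading, a null only shows up in the result with `skip_nulls=false` when it actually occupies the first (or last) position of the group.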