[GitHub] [arrow] rok commented on a change in pull request #9758: ARROW-9054: [C++] Add ScalarAggregateOptions

GitBox Fri, 14 May 2021 12:35:43 -0700


rok commented on a change in pull request #9758:
URL: https://github.com/apache/arrow/pull/9758#discussion_r632754148




##########
File path: cpp/src/arrow/compute/kernels/aggregate_basic.cc
##########
@@ -75,48 +75,55 @@ struct CountImpl : public ScalarAggregator {
 
   Status Finalize(KernelContext* ctx, Datum* out) override {
     const auto& state = checked_cast<const CountImpl&>(*ctx->state());
-    switch (state.options.count_mode) {
-      case CountOptions::COUNT_NON_NULL:
-        *out = Datum(state.non_nulls);
-        break;
-      case CountOptions::COUNT_NULL:
-        *out = Datum(state.nulls);
-        break;
-      default:
-        return Status::Invalid("Unknown CountOptions encountered");
+    if (state.options.skip_nulls) {
+      *out = Datum(state.non_nulls);
+    } else {
+      *out = Datum(state.nulls);

Review comment:
       Actually, if we change to `count(list, skip_nulls=False) == 
length_of_list`, we break the [NaiveGroupBy 
comparison](https://github.com/apache/arrow/blob/ce2861713472818eea264957de4cc83d5a2c567c/cpp/src/arrow/compute/kernels/hash_aggregate_test.cc#L113).
 So I'd prefer `count(list, skip_nulls=False) == null_count`.
   I suppose it's less intuitive, but it's more consistent with that comparison.
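
   To make the two options concrete, here is a minimal standalone sketch. It does not use Arrow: `CountState`, `Consume`, `FinalizeNullCount`, and `FinalizeLength` are hypothetical names made up for illustration, and a `std::vector<std::optional<int>>` stands in for a nullable array. `FinalizeNullCount` mirrors the branch in the diff above; `FinalizeLength` is the alternative discussed here.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for the counters CountImpl keeps; not Arrow's real type.
struct CountState {
  int64_t non_nulls = 0;
  int64_t nulls = 0;
};

// Tally valid and null slots of a toy nullable column.
CountState Consume(const std::vector<std::optional<int>>& values) {
  CountState state;
  for (const auto& v : values) {
    if (v.has_value()) {
      ++state.non_nulls;
    } else {
      ++state.nulls;
    }
  }
  // Every slot is either valid or null.
  assert(state.non_nulls + state.nulls == static_cast<int64_t>(values.size()));
  return state;
}

// Option preferred in this comment (same branch as the diff):
// skip_nulls=false yields the number of nulls.
int64_t FinalizeNullCount(const CountState& state, bool skip_nulls) {
  return skip_nulls ? state.non_nulls : state.nulls;
}

// Alternative under discussion: skip_nulls=false yields the total length.
int64_t FinalizeLength(const CountState& state, bool skip_nulls) {
  return skip_nulls ? state.non_nulls : state.non_nulls + state.nulls;
}
```

   The two functions agree when `skip_nulls` is true and differ only when it is false, which is the case this comment is about.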




