rtpsw commented on code in PR #34311:
URL: https://github.com/apache/arrow/pull/34311#discussion_r1117553145
##########
cpp/src/arrow/compute/exec/aggregate_node.cc:
##########
@@ -326,46 +446,86 @@ class ScalarAggregateNode : public ExecNode, public TracedNode {
}
private:
- Status Finish() {
- auto scope = TraceFinish();
+ Status ReconstructAggregates() {
+ const auto& input_schema = *inputs()[0]->output_schema();
+ auto exec_ctx = plan()->query_context()->exec_context();
+ for (size_t i = 0; i < kernels_.size(); ++i) {
+ std::vector<TypeHolder> in_types;
+ for (const auto& target : target_fieldsets_[i]) {
+ in_types.emplace_back(input_schema.field(target)->type().get());
+ }
+ states_[i].resize(plan()->query_context()->max_concurrency());
Review Comment:
This pattern predates this PR and supports multi-threading (which, per our
decision, is currently not supported in segmented aggregation but is supported
in regular aggregation). Each worker thread is associated with a unique index
from 0 to `max_concurrency - 1`, and the code here creates a dedicated state
instance per thread.
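
To illustrate the per-thread state pattern described above, here is a minimal standalone sketch. It is not the Arrow implementation: `SumState` and `ParallelSum` are hypothetical names, and the strided partitioning and final merge loop are assumptions made only to keep the example self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical stand-in for a kernel's aggregation state (not an Arrow type).
struct SumState {
  int64_t sum = 0;
};

// Sums `values` using `max_concurrency` workers. Each worker writes only to
// the state slot at its own index, so no locking is needed while consuming.
int64_t ParallelSum(const std::vector<int64_t>& values, size_t max_concurrency) {
  assert(max_concurrency > 0);
  // One state per worker, analogous to states_[i].resize(max_concurrency).
  std::vector<SumState> states(max_concurrency);

  std::vector<std::thread> workers;
  for (size_t thread_index = 0; thread_index < max_concurrency; ++thread_index) {
    workers.emplace_back([&values, &states, thread_index, max_concurrency] {
      // Assumed partitioning: worker k handles elements k, k + n, k + 2n, ...
      for (size_t j = thread_index; j < values.size(); j += max_concurrency) {
        states[thread_index].sum += values[j];
      }
    });
  }
  for (auto& worker : workers) worker.join();

  // Merge step: fold every per-thread state into a single result.
  int64_t total = 0;
  for (const auto& state : states) total += state.sum;
  return total;
}
```

Because every index in `0 .. max_concurrency - 1` maps to exactly one slot, the only synchronization point in this sketch is the join before the merge.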
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.