zanmato1984 commented on code in PR #45562:
URL: https://github.com/apache/arrow/pull/45562#discussion_r1971249009


##########
cpp/src/arrow/acero/scalar_aggregate_node.cc:
##########
@@ -294,6 +294,14 @@ Status ScalarAggregateNode::OutputResult(bool is_last) {
   // First, insert segment keys
   PlaceFields(batch, /*base=*/0, segmenter_values_);
 
+  // Move away the states and recreate them eagerly, to make sure that any error
+  // below does not leave us with empty states.
+  auto states = std::move(states_);
+  states_.resize(kernels_.size());
+  if (!is_last) {
+    RETURN_NOT_OK(ResetKernelStates());
+  }

Review Comment:
   I tried, but no luck, sorry. Also, judging from the code, the 
`PivotDuplicateValues` case uses group-by, so it shouldn't be executing this 
code path. I can't speak to the validity of this change without truly 
understanding what the problem is.
   
   I do have a suspicion, though: this might be race-related. If that is the 
case (and it's a big if), I'm afraid your change doesn't fix the race. However, 
many things are still unclear, so please don't take this seriously until I have 
new findings.
   
   Do you still hit the problem if you revert this part of the change? Or shall 
we try reverting it (maybe temporarily) to see whether CI passes?
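
To make the pattern in the hunk above concrete, here is a minimal standalone sketch of "move the states away and recreate them eagerly". It is not Arrow code: `Aggregator`, `State`, `Result`, the `fail` flag, and `AllStatesLive` are all hypothetical stand-ins, and the only thing it is meant to show is that an early error return after the move still leaves the member states populated.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; none of these are Arrow types.
struct State {
  int sum = 0;
};

struct Result {
  bool ok;
  std::string message;
};

class Aggregator {
 public:
  explicit Aggregator(std::size_t num_kernels) : num_kernels_(num_kernels) {
    states_.resize(num_kernels_);
    Reset();
  }

  void Consume(int value) {
    for (auto& s : states_) s->sum += value;
  }

  // Emits the accumulated sums into `out`. `fail` simulates an error that
  // happens after the states have been taken but before output is produced.
  Result Output(bool is_last, bool fail, std::vector<int>* out) {
    // Take ownership of the current states first...
    auto states = std::move(states_);
    // ...then restore the member right away, before anything can fail.
    states_.clear();  // a moved-from vector is valid but unspecified
    states_.resize(num_kernels_);
    if (!is_last) Reset();
    assert(states_.size() == num_kernels_);

    if (fail) return {false, "simulated finalize error"};
    for (const auto& s : states) out->push_back(s->sum);
    return {true, ""};
  }

  // True when every slot holds a live (non-null) state.
  bool AllStatesLive() const {
    if (states_.size() != num_kernels_) return false;
    for (const auto& s : states_) {
      if (!s) return false;
    }
    return true;
  }

 private:
  // Gives every slot a fresh, zeroed state.
  void Reset() {
    for (auto& s : states_) s = std::make_unique<State>();
  }

  std::size_t num_kernels_;
  std::vector<std::unique_ptr<State>> states_;
};
```

Calling `Output(/*is_last=*/false, /*fail=*/true, &out)` on this sketch returns an error, yet `AllStatesLive()` remains true afterwards. Nothing here is synchronized, so the sketch only covers the error path and would do nothing for a race between threads.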



