Re: [PR] chore: fix native shuffle for batches with no columns and 0 row count [datafusion-comet]

via GitHub Thu, 02 Apr 2026 11:22:27 -0700


mbutrovich commented on code in PR #3858:
URL: https://github.com/apache/datafusion-comet/pull/3858#discussion_r3029613996



##########
native/shuffle/src/partitioners/multi_partition.rs:
##########
@@ -203,6 +203,36 @@ impl MultiPartitionShuffleRepartitioner {
             return Ok(());
         }
 
+        // For zero-column schemas (e.g. COUNT queries), assign all rows to partition 0.
+        // No hashing or expression evaluation needed — just route through normal buffering.
+        if input.num_columns() == 0 {
+            let num_rows = input.num_rows();
+            self.metrics.baseline.record_output(num_rows);
+            // All rows go to partition 0: partition_starts = [0, num_rows, num_rows, ...]
+            // partition_row_indices = [0, 1, 2, ..., num_rows-1]
+            let mut scratch = std::mem::take(&mut self.scratch);

Review Comment:
   This still looks far more complicated than I would expect. Why do we need 
scratch space, and why write `num_rows` entries into `partition_row_indices`? 
Why are we "partitioning" rows that don't exist?
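
To make the question concrete, here is a minimal sketch contrasting the two approaches. Everything in it is hypothetical: `PartitionBuffer`, `pending_rows`, `route_with_indices`, and `route_count_only` are illustrative names rather than Comet's actual types or functions, and the `num_partitions + 1` length of the starts array is an assumption inferred from the comment in the diff.

```rust
/// Hypothetical stand-in for per-partition buffering state (not Comet's type).
#[derive(Debug, Default)]
pub struct PartitionBuffer {
    /// Rows buffered for this partition that carry no column data.
    pub pending_rows: usize,
}

/// Roughly what the diff describes: materialize one index per row plus a
/// starts array, even though every row lands in partition 0.
/// Assumption: the starts array has `num_partitions + 1` entries.
pub fn route_with_indices(num_rows: usize, num_partitions: usize) -> (Vec<u32>, Vec<u32>) {
    // partition_row_indices = [0, 1, ..., num_rows - 1]
    let row_indices: Vec<u32> = (0..num_rows as u32).collect();
    // partition_starts = [0, num_rows, num_rows, ...]
    let mut starts = vec![num_rows as u32; num_partitions + 1];
    starts[0] = 0;
    (row_indices, starts)
}

/// Possible simpler alternative: a zero-column batch holds nothing but a
/// row count, so only that count is added to partition 0.
pub fn route_count_only(buffers: &mut [PartitionBuffer], num_rows: usize) {
    if let Some(first) = buffers.first_mut() {
        first.pending_rows += num_rows;
    }
}
```

In this sketch, `route_with_indices(4, 3)` allocates two vectors whose contents are fully determined by `num_rows`, while `route_count_only(&mut buffers, 4)` only increments a counter on the first buffer. Whether the real buffering path can accept a bare row count is not something this thread establishes.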



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

