samuelcolvin commented on code in PR #10468:
URL: https://github.com/apache/datafusion/pull/10468#discussion_r1597704992


##########
datafusion/physical-plan/src/common.rs:
##########
@@ -153,16 +153,23 @@ pub fn compute_record_batch_statistics(
         })
         .sum();
 
-    let mut column_statistics = vec![ColumnStatistics::new_unknown(); projection.len()];
+    let mut null_counts = vec![0; projection.len()];
 
     for partition in batches.iter() {
         for batch in partition {
             for (stat_index, col_index) in projection.iter().enumerate() {
-                column_statistics[stat_index].null_count =
-                    Precision::Exact(batch.column(*col_index).null_count());
+                null_counts[stat_index] += batch.column(*col_index).null_count();

Review Comment:
   I think there would be fewer bounds checks if we used `zip` here, rather than `.enumerate()` and looking up each column by index.
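
   A minimal sketch of what that could look like, not the actual patch. `Batch` and the free `null_count` function are hypothetical stand-ins for the real record batch and array types, so the snippet compiles on its own. The first function mirrors the loop in the diff; the second zips a mutable iterator over `null_counts` with `projection`, so the accumulator is no longer indexed. The `batch.column(*col_index)` lookup remains in both, and whether the compiler already elides the check in the `enumerate` version has not been measured here.

```rust
/// Hypothetical stand-in for a record batch: each inner Vec is one column,
/// with `None` marking a null value.
struct Batch {
    columns: Vec<Vec<Option<i64>>>,
}

impl Batch {
    /// Indexed column access (bounds-checked), analogous to `batch.column(i)`.
    fn column(&self, i: usize) -> &[Option<i64>] {
        &self.columns[i]
    }
}

/// Hypothetical helper: number of nulls in one column.
fn null_count(col: &[Option<i64>]) -> usize {
    col.iter().filter(|v| v.is_none()).count()
}

/// Shape of the loop in the diff: `enumerate` plus an indexed write
/// into `null_counts`.
fn null_counts_enumerate(batches: &[Batch], projection: &[usize]) -> Vec<usize> {
    let mut null_counts = vec![0; projection.len()];
    for batch in batches {
        for (stat_index, col_index) in projection.iter().enumerate() {
            null_counts[stat_index] += null_count(batch.column(*col_index));
        }
    }
    null_counts
}

/// Suggested shape: zip the accumulator with the projection, so the
/// accumulator slot arrives as a `&mut usize` instead of via indexing.
fn null_counts_zip(batches: &[Batch], projection: &[usize]) -> Vec<usize> {
    let mut null_counts = vec![0; projection.len()];
    for batch in batches {
        for (count, col_index) in null_counts.iter_mut().zip(projection.iter()) {
            *count += null_count(batch.column(*col_index));
        }
    }
    null_counts
}
```

   Both functions should return the same counts for the same input; only the access pattern on `null_counts` differs.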



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

