alamb commented on code in PR #10468: URL: https://github.com/apache/datafusion/pull/10468#discussion_r1598897904
##########
datafusion/physical-plan/src/common.rs:
##########
@@ -687,4 +694,42 @@ mod tests {
         assert_eq!(actual, expected);
         Ok(())
     }
+
+    #[test]
+    fn test_compute_record_batch_statistics_null() -> Result<()> {
+        let schema = Arc::new(Schema::new(vec![
+            Field::new("u64", DataType::UInt64, true),
+        ]));
+        let batch1 = RecordBatch::try_new(
+            Arc::clone(&schema),
+            vec![
+                Arc::new(UInt64Array::from(vec![Some(1), None, None])),
+            ],
+        )?;
+        let batch2 = RecordBatch::try_new(
+            Arc::clone(&schema),
+            vec![
+                Arc::new(UInt64Array::from(vec![Some(1), Some(2), None])),
+            ],
+        )?;
+        let byte_size = batch1.get_array_memory_size() + batch2.get_array_memory_size();
+        let actual =
+            compute_record_batch_statistics(&[vec![batch1], vec![batch2]], &schema, None);
+
+        let expected = Statistics {
+            num_rows: Precision::Exact(6),
+            total_byte_size: Precision::Exact(byte_size),
+            column_statistics: vec![
+                ColumnStatistics {
+                    distinct_count: Precision::Absent,
+                    max_value: Precision::Absent,
+                    min_value: Precision::Absent,
+                    null_count: Precision::Exact(3),
+                },
+            ],
+        };
+
+        assert_eq!(actual, expected);

Review Comment:
I verified test coverage by running this test without the code in this PR, and the test fails like this:

```
assertion `left == right` failed
  left: Statistics { num_rows: Exact(6), total_byte_size: Exact(368), column_statistics: [ColumnStatistics { null_count: Exact(1), max_value: Absent, min_value: Absent, distinct_count: Absent }] }
 right: Statistics { num_rows: Exact(6), total_byte_size: Exact(368), column_statistics: [ColumnStatistics { null_count: Exact(3), max_value: Absent, min_value: Absent, distinct_count: Absent }] }
```
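For readers following along, here is a minimal std-only sketch of what the test expects. It is not DataFusion's implementation: plain `Vec<Option<u64>>` values stand in for the Arrow arrays, and the function names `null_count`, `total_null_count`, and `last_batch_null_count` are hypothetical. The pre-fix value of `Exact(1)` matches the null count of `batch2` alone, which would be consistent with the count being overwritten per batch rather than accumulated, but that is an assumption drawn from the output above, not something the comment states.

```rust
/// Count the nulls (`None` entries) in one column of one batch.
/// A slice of `Option<u64>` stands in for an Arrow `UInt64Array` here.
fn null_count(column: &[Option<u64>]) -> usize {
    column.iter().filter(|value| value.is_none()).count()
}

/// Expected behaviour: sum the null counts over every batch of every partition.
fn total_null_count(partitions: &[Vec<Vec<Option<u64>>>]) -> usize {
    partitions
        .iter()
        .flatten()
        .map(|column| null_count(column))
        .sum()
}

/// Hypothetical buggy variant (an assumption, see the note above):
/// each batch overwrites the running value instead of adding to it.
fn last_batch_null_count(partitions: &[Vec<Vec<Option<u64>>>]) -> usize {
    let mut count = 0;
    for column in partitions.iter().flatten() {
        count = null_count(column); // overwrites rather than accumulates
    }
    count
}

fn main() {
    // Same values as batch1 and batch2 in the test, one batch per partition.
    let batch1 = vec![Some(1), None, None];
    let batch2 = vec![Some(1), Some(2), None];
    let partitions = vec![vec![batch1], vec![batch2]];

    println!("accumulated: {}", total_null_count(&partitions));
    println!("overwritten: {}", last_batch_null_count(&partitions));
}
```

With the test's inputs, the accumulating version counts two nulls from `batch1` plus one from `batch2`, while the overwriting version keeps only the count from the final batch.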