alamb commented on code in PR #2802:
URL: https://github.com/apache/arrow-datafusion/pull/2802#discussion_r907870656
##########
datafusion/core/src/physical_optimizer/aggregate_statistics.rs:
##########
@@ -276,8 +276,8 @@ mod tests {
/// Mock data using a MemoryExec which has an exact count statistic
fn mock_data() -> Result<Arc<MemoryExec>> {
let schema = Arc::new(Schema::new(vec![
- Field::new("a", DataType::Int32, false),
- Field::new("b", DataType::Int32, false),
+ Field::new("a", DataType::Int32, true),
+ Field::new("b", DataType::Int32, true),
Review Comment:
This is a pretty easy-to-understand example of the issue: prior to this
PR, the fields `"a"` and `"b"` are declared as `nullable=false`, yet 5
lines lower `NULL` data is inserted 🤦
```rust
let batch = RecordBatch::try_new(
Arc::clone(&schema),
vec![
Arc::new(Int32Array::from(vec![Some(1), Some(2), None])),
Arc::new(Int32Array::from(vec![Some(4), None, Some(6)])),
],
)?;
```
Now that `RecordBatch::try_new` validates nullability, the schema must
match the data; otherwise an error results.