tustvold commented on code in PR #3953:
URL: https://github.com/apache/arrow-rs/pull/3953#discussion_r1153733135
##########
arrow-array/src/record_batch.rs:
##########
@@ -447,23 +447,28 @@ impl Default for RecordBatchOptions {
Self::new()
}
}
+impl From<StructArray> for RecordBatch {
+ fn from(value: StructArray) -> Self {
+ assert_eq!(
+ value.null_count(),
+ 0,
+ "Cannot convert nullable StructArray to RecordBatch"
Review Comment:
> to not simply be the schema of the Struct array itself
Not all child types are nullable, as a result of their encoding. This would
not just be a schema-only operation. Taking a step back though, silently
changing the schema based on the presence of nulls seems like a very unexpected
behaviour? DataFusion would get very confused if one of its RecordBatch
suddenly had a different schema?
TBC this conversion from RecordBatch to StructArray should only be being
attempted at a top-level where you can statically guarantee there won't be any
nulls present - e.g. the top-level of a parquet schema, panicking if you
violate that invariant seems perfectly reasonable to me...
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]