kylebarron commented on code in PR #8790:
URL: https://github.com/apache/arrow-rs/pull/8790#discussion_r2582768750
##########
arrow-array/src/ffi_stream.rs:
##########
@@ -364,7 +364,13 @@ impl Iterator for ArrowArrayStreamReader {
let result = unsafe {
from_ffi_and_data_type(array,
DataType::Struct(self.schema().fields().clone()))
};
- Some(result.map(|data| RecordBatch::from(StructArray::from(data))))
+ Some(result.map(|data| {
+ let struct_array = StructArray::from(data);
Review Comment:
> Basically what I explained here ([#8790 (comment)](https://github.com/apache/arrow-rs/pull/8790#issuecomment-3536535648))
> => `StructArray` alone by definition is metadata-less, in turn leading to the
> problem that the resulting `RecordBatch` won't have any metadata attached if
> you just return it as-is.
>
> I'm not sure whether there is another more elegant way to construct a
> `RecordBatch` with corresponding metadata from `ArrayData`. Right now I'm going
> through `StructArray` because the previous interface did that too. If there is
> another more elegant way, please let me know.
I've complained about this before (though I can't find the issue), and it
is one of [the reasons I
document](https://docs.rs/pyo3-arrow/0.15.0/pyo3_arrow/#why-not-use-arrow-rss-python-integration)
for creating pyo3-arrow. As far as I know, it's currently impossible in
arrow-rs to persist extension metadata through the FFI interface.
I think we need a broader PR to handle this, though; it shouldn't be
shoehorned into this PR, which is focused on PyArrow Table handling.