kylebarron opened a new issue, #6586: URL: https://github.com/apache/arrow-rs/issues/6586
**Is your feature request related to a problem or challenge? Please describe what you are trying to do.**

It is not currently possible to use arrow-rs's FFI to exchange something like an `ArrayStream` or `ChunkedArray` when those arrays do not represent RecordBatches. [`ffi_stream::ArrowArrayStreamReader`](https://docs.rs/arrow/latest/arrow/ffi_stream/struct.ArrowArrayStreamReader.html) will error if the data type of the stream is not `Struct`. This makes it impossible in the general case to interop with a `pyarrow.ChunkedArray` or `polars.Series` (via Python).

The Arrow C Stream Interface _does_ support non-struct array types. `get_next()` of `ArrowArrayStream` returns an `ArrowArray`, and an `ArrowArray` can be any generic Arrow array. That Arrow array is _often_ a StructArray, with the understanding that the StructArray represents a RecordBatch, but it doesn't have to be.

Here:

https://github.com/apache/arrow-rs/blob/5508978a3c5c4eb65ef6410e097887a8adaba38a/arrow-array/src/ffi_stream.rs#L364-L367

you _assume_ that the data type of the stream is struct (and also assume that you can interpret the C Schema as a `Schema`), but that isn't required by the spec. To be more generic, you can [use the data type of the C Schema directly](https://github.com/kylebarron/arro3/blob/0829e34fe250314c2e068ff86e3c5e7ad003d607/pyo3-arrow/src/ffi/from_python/ffi_stream.rs#L89-L91).

**Describe the solution you'd like**

Some way to transfer a stream of `Array` via FFI.

**Describe alternatives you've considered**

There's currently no way to exchange a stream of generic arrays with arrow-rs, as far as I can tell.

**Additional context**

For full disclosure, I've already implemented this in my own library, pyo3-arrow.
I have an [`ArrayReader`](https://docs.rs/pyo3-arrow/latest/pyo3_arrow/ffi/trait.ArrayReader.html) trait to parallel `arrow::RecordBatchReader`, and [vendored a derived copy of `ffi_stream.rs`](https://github.com/kylebarron/arro3/blob/0829e34fe250314c2e068ff86e3c5e7ad003d607/pyo3-arrow/src/ffi/from_python/ffi_stream.rs) to make it possible to handle this interop (while not necessarily materializing the entire stream as a `ChunkedArray`). I'm currently fine with my vendored copy of FFI, but others may have the same issue.

Previous discussion in https://github.com/apache/arrow-rs/issues/5295#issuecomment-2402556354
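
To make the request concrete, here is a rough, std-only sketch of the shape such an interface might take. Everything in it is a simplified stand-in invented for illustration: `DataType`, `Field`, `Array`, `StreamError`, `ImportedStream`, and `as_record_batch_fields` are not the arrow-rs or pyo3-arrow types, and the `ArrayReader` trait here only mirrors the idea of exposing a single field where `RecordBatchReader` exposes a schema. The struct-only check is kept as an optional helper on top of the generic reader, so a non-struct stream can still be read.

```rust
use std::fmt;

/// Simplified stand-in for an Arrow data type (NOT `arrow::datatypes::DataType`).
#[derive(Debug, Clone, PartialEq)]
pub enum DataType {
    Int32,
    Utf8,
    Struct(Vec<Field>),
}

/// Simplified stand-in for an Arrow field: a name plus a data type.
#[derive(Debug, Clone, PartialEq)]
pub struct Field {
    pub name: String,
    pub data_type: DataType,
}

/// Simplified stand-in for one array (chunk) yielded by a stream.
#[derive(Debug, Clone, PartialEq)]
pub struct Array {
    pub data_type: DataType,
    pub len: usize,
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub struct StreamError(pub String);

impl fmt::Display for StreamError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "stream error: {}", self.0)
    }
}

impl std::error::Error for StreamError {}

/// Hypothetical trait: a stream of arrays of any type, described by one field
/// rather than by a full schema.
pub trait ArrayReader: Iterator<Item = Result<Array, StreamError>> {
    fn field(&self) -> &Field;
}

/// Stand-in for an imported C stream: `field` plays the role of the stream's
/// schema and `chunks` the role of successive `get_next()` results.
pub struct ImportedStream {
    field: Field,
    chunks: std::vec::IntoIter<Array>,
}

impl ImportedStream {
    pub fn new(field: Field, chunks: Vec<Array>) -> Self {
        Self { field, chunks: chunks.into_iter() }
    }
}

impl Iterator for ImportedStream {
    type Item = Result<Array, StreamError>;

    fn next(&mut self) -> Option<Self::Item> {
        let chunk = self.chunks.next()?;
        // Each chunk must match the stream's declared type, whatever that type is.
        if chunk.data_type != self.field.data_type {
            return Some(Err(StreamError(format!(
                "chunk type {:?} does not match stream type {:?}",
                chunk.data_type, self.field.data_type
            ))));
        }
        Some(Ok(chunk))
    }
}

impl ArrayReader for ImportedStream {
    fn field(&self) -> &Field {
        &self.field
    }
}

/// Optional struct-only view: succeeds only when the stream type is `Struct`,
/// in which case the child fields can be treated as record batch columns.
pub fn as_record_batch_fields(reader: &impl ArrayReader) -> Result<&[Field], StreamError> {
    match &reader.field().data_type {
        DataType::Struct(fields) => Ok(fields.as_slice()),
        other => Err(StreamError(format!("expected Struct, got {:?}", other))),
    }
}
```

With this layering, a stream declared as `Int32` can be iterated chunk by chunk even though `as_record_batch_fields` rejects it, while a stream declared as `Struct` works with both.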
