kylebarron commented on issue #5295:
URL: https://github.com/apache/arrow-rs/issues/5295#issuecomment-2402547679

   I was trying to express that there are some benefits to making a sequence of 
`ArrayRef` a first-class citizen. One of them is FFI support.
   
   Aside from FFI, I think a concept like an 
[`ArrayIterator`](https://docs.rs/pyo3-arrow/latest/pyo3_arrow/ffi/struct.ArrayIterator.html)
 (a direct analogue of `RecordBatchIterator`) is more useful than `impl 
Iterator<Item=ArrayRef>` because it gives type-level validation that all the 
arrays in the iterator have the same `DataType`, while still avoiding the 
`ChunkedArray` pitfall of requiring all arrays to be in memory.
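
   To make the idea concrete, here is a minimal std-only sketch. The `DataType`, `Array`, `TypeMismatch`, and `ArrayIterator` definitions below are toy stand-ins written for illustration; they are not the arrow-rs or pyo3-arrow types, and the per-chunk check is just one assumed way such a wrapper could enforce a shared data type while staying lazy.

```rust
// Toy stand-ins for arrow's `DataType` and `ArrayRef` (assumptions, not the real types).
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int32,
    Utf8,
}

#[derive(Debug, Clone, PartialEq)]
enum Array {
    Int32(Vec<i32>),
    Utf8(Vec<String>),
}

impl Array {
    fn data_type(&self) -> DataType {
        match self {
            Array::Int32(_) => DataType::Int32,
            Array::Utf8(_) => DataType::Utf8,
        }
    }
}

// Error yielded when a chunk does not match the declared data type.
#[derive(Debug, PartialEq)]
struct TypeMismatch {
    expected: DataType,
    found: DataType,
}

// Hypothetical `ArrayIterator`: a lazy source of arrays plus the single
// data type every yielded array must have.
struct ArrayIterator<I> {
    data_type: DataType,
    inner: I,
}

impl<I: Iterator<Item = Array>> ArrayIterator<I> {
    fn new(data_type: DataType, inner: I) -> Self {
        ArrayIterator { data_type, inner }
    }

    // The data type is known up front, before any chunk is pulled.
    fn data_type(&self) -> &DataType {
        &self.data_type
    }
}

impl<I: Iterator<Item = Array>> Iterator for ArrayIterator<I> {
    type Item = Result<Array, TypeMismatch>;

    fn next(&mut self) -> Option<Self::Item> {
        // Chunks are pulled one at a time, so they never all need to be in memory.
        let array = self.inner.next()?;
        let found = array.data_type();
        if found == self.data_type {
            Some(Ok(array))
        } else {
            Some(Err(TypeMismatch {
                expected: self.data_type.clone(),
                found,
            }))
        }
    }
}
```

   A consumer can read `data_type()` once and then treat every `Ok` chunk as having that type, which a plain `impl Iterator<Item=ArrayRef>` cannot promise.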
   
   > if arrow FFI has a mechanism to transport bare arrays we should support 
that. Although reading 
[arrow.apache.org/docs/format/CStreamInterface.html](https://arrow.apache.org/docs/format/CStreamInterface.html)
 I'm not sure how this would work
   
   We can discuss this on a separate issue if you'd prefer. Arrow FFI _only_ 
transports bare arrays. `get_next()` of `ArrowArrayStream` returns an 
`ArrowArray`, and an `ArrowArray` can be any generic Arrow array. That Arrow 
array is often a StructArray, with the understanding that the StructArray 
represents a RecordBatch, but it doesn't have to be.
   
   Here:
   
https://github.com/apache/arrow-rs/blob/5508978a3c5c4eb65ef6410e097887a8adaba38a/arrow-array/src/ffi_stream.rs#L364-L367
   you _assume_ that the data type of the stream is struct (and also assume 
that you can interpret the C Schema as a `Schema`), but that isn't required by 
the spec. To be more generic, you can [use the data type of the C Schema 
directly](https://github.com/kylebarron/arro3/blob/0829e34fe250314c2e068ff86e3c5e7ad003d607/pyo3-arrow/src/ffi/from_python/ffi_stream.rs#L89-L91).
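
   As a rough illustration of the difference, here is a std-only sketch. `CSchema`, `StreamType`, `stream_type`, and `as_record_batch_fields` are hypothetical names made up for this example and are not part of arrow-rs. The only thing taken from the C Data Interface is the format strings `"i"` (int32), `"u"` (utf8), and `"+s"` (struct); everything else is a simplified model.

```rust
// Toy model of an `ArrowSchema`: a format string, a name, and child schemas.
#[derive(Debug, Clone, PartialEq)]
struct CSchema {
    format: String,
    name: String,
    children: Vec<CSchema>,
}

// Simplified data type covering only the three formats handled below.
#[derive(Debug, Clone, PartialEq)]
enum StreamType {
    Int32,
    Utf8,
    Struct(Vec<(String, StreamType)>),
}

// Generic path: derive the stream's data type from the C schema, whatever it is.
fn stream_type(schema: &CSchema) -> Result<StreamType, String> {
    match schema.format.as_str() {
        "i" => Ok(StreamType::Int32),
        "u" => Ok(StreamType::Utf8),
        "+s" => {
            let fields = schema
                .children
                .iter()
                .map(|child| stream_type(child).map(|t| (child.name.clone(), t)))
                .collect::<Result<Vec<_>, String>>()?;
            Ok(StreamType::Struct(fields))
        }
        other => Err(format!("unsupported format string: {}", other)),
    }
}

// Record-batch path: only valid when the stream's type happens to be a struct.
fn as_record_batch_fields(schema: &CSchema) -> Result<Vec<(String, StreamType)>, String> {
    match stream_type(schema)? {
        StreamType::Struct(fields) => Ok(fields),
        other => Err(format!("not a struct stream: {:?}", other)),
    }
}
```

   In this model a stream of plain int32 arrays is accepted by `stream_type` but rejected by `as_record_batch_fields`, which mirrors the case where a reader that assumes a struct schema cannot consume a stream the spec allows.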

