jorisvandenbossche opened a new issue, #38010:
URL: https://github.com/apache/arrow/issues/38010

   https://github.com/apache/arrow/pull/37797 is adding official dunder methods 
to expose the Arrow C Data/Stream Interface in Python using PyCapsules 
(https://github.com/apache/arrow/issues/34031 / 
https://github.com/apache/arrow/issues/35531).
   
   In addition to official dunders to expose this to other libraries, we also 
need public APIs in pyarrow to import / consume such PyCapsules (or rather the 
objects implementing the dunders to give you the PyCapsule). 
   https://github.com/apache/arrow/pull/37797 already added this to the 
`pa.array(..)`, `pa.record_batch(..)` and `pa.schema(..)` constructors, such 
that you can for example create a pyarrow array with `pa.array(obj)` given any 
object `obj` that supports the interface by defining `__arrow_c_array__`. 
   
   But that's not fully complete: we certainly need a way to construct a 
`RecordBatchReader` as well, where we don't have such a factory function 
available. For this, we could add a `from_` function (similar to the existing 
`from_batches`) like `RecordBatchReader.from_stream`?
   
   (in addition there are also the Table, Field and DataType constructors, but 
those all have factory functions that could support this, similar to 
`pa.array(..)` et al)
   
   ---
   
   Secondly, I am also wondering if we want to provide APIs that accept 
PyCapsules directly, instead of an object that implements the dunders. For 
example, if you are a library that has data in Arrow compatible memory, and you 
want to convert this to pyarrow through the C Data Interface, you might want to 
use a PyCapsule directly if your library doesn't expose a Python class that 
represents that data (to avoid having to create a small wrapper class 
with just the dunder to pass to the pyarrow constructor, although this is of 
course not difficult).
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

