paleolimbot commented on issue #377:
URL: 
https://github.com/apache/arrow-nanoarrow/issues/377#issuecomment-1919776518

   To the immediate question: I believe the code you have above will result in 
`nanoarrow::UniqueArray array` and `nanoarrow::UniqueSchema schema` as "unique" 
owners: as long as those C++ objects are not deleted, you can safely access the 
fields of the pointed-to `ArrowArray` and `ArrowSchema`. This is true even if 
the Python objects are deleted (the Table and/or the Capsule): the constructor 
you invoked for the `UniqueArray` and `UniqueSchema` will move ownership from 
the capsules to the `UniqueXXX`. The release callback will be called exactly 
once for each of the array and schema when the `UniqueXXX` is deleted.
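   
   The ownership transfer described above can be sketched with a simplified 
stand-in. This is not nanoarrow's implementation: `FakeArray`, 
`UniqueFakeArray`, `CountingRelease` and `DemoReleaseCount` are hypothetical 
names for illustration, and `FakeArray` keeps only the two members needed to 
show the idea. The owner copies the struct out of the source, marks the source 
as released, and calls the release callback once when it is destroyed.
   
   ```cpp
   #include <cassert>

   // Simplified stand-in for the C data interface struct (assumption: only
   // the members needed to illustrate ownership are included).
   struct FakeArray {
     void (*release)(FakeArray*);
     void* private_data;
   };

   // Hypothetical unique owner, loosely modelled on the behaviour described
   // in the comment above; not the real nanoarrow class.
   class UniqueFakeArray {
    public:
     UniqueFakeArray() : data_{nullptr, nullptr} {}
     // Move the contents out of *src and mark the source as released.
     explicit UniqueFakeArray(FakeArray* src) : data_(*src) {
       src->release = nullptr;
     }
     UniqueFakeArray(UniqueFakeArray&& other) : UniqueFakeArray(&other.data_) {}
     UniqueFakeArray(const UniqueFakeArray&) = delete;
     UniqueFakeArray& operator=(const UniqueFakeArray&) = delete;
     ~UniqueFakeArray() {
       if (data_.release != nullptr) data_.release(&data_);
     }
     FakeArray* get() { return &data_; }

    private:
     FakeArray data_;
   };

   // Release callback that counts its invocations through private_data.
   void CountingRelease(FakeArray* array) {
     ++*static_cast<int*>(array->private_data);
     array->release = nullptr;
   }

   int DemoReleaseCount() {
     int release_count = 0;
     // Plays the role of the struct a capsule points to.
     FakeArray producer{&CountingRelease, &release_count};
     {
       UniqueFakeArray owner(&producer);
       // The source no longer owns anything, so freeing it later is harmless.
       assert(producer.release == nullptr);
       assert(owner.get()->release != nullptr);
     }  // owner is destroyed here and the callback runs
     return release_count;
   }
   ```
   
   Calling `DemoReleaseCount()` returns the number of times the callback ran. 
Because the source is marked as released at the moment of the move, whatever 
cleans up the source afterwards has nothing left to free.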
   
   To the slightly larger question of how you get a `Table` into another Python 
package: I would recommend invoking `__arrow_c_stream__()`, which returns a 
capsule wrapping an `ArrowArrayStream`. I recommend this because then all of 
the looping happens in C++: if you have a table that for some reason has 
thousands of chunks, you won't pay any performance cost for a tight Python 
loop. (We did a tiny bit of work when writing the pyarrow/arrow-r bridges to 
verify that this is the case.) In nanoarrow 0.4.0 (about to be released), I 
added some helpers to do that looping.
   
   Your Python might look like:
   
   ```python
   parse_pyarrow_table(t.__arrow_c_stream__())
   ```
   
   And your C++ might look like:
   
   ```cpp
   m.def("parse_pyarrow_table",
         [](const pybind11::capsule& array_stream_capsule) {
           // Move ownership of the stream out of the capsule
           nanoarrow::UniqueArrayStream stream(
               static_cast<ArrowArrayStream*>(array_stream_capsule.get_pointer()));
           nanoarrow::UniqueSchema schema;
           nanoarrow::UniqueArray array;

           NANOARROW_THROW_NOT_OK(
               ArrowArrayStreamGetSchema(stream.get(), schema.get(), nullptr));

           while (true) {
             // Release the previous batch before reading the next one
             array.reset();
             NANOARROW_THROW_NOT_OK(
                 ArrowArrayStreamGetNext(stream.get(), array.get(), nullptr));
             // A released array marks the end of the stream
             if (array->release == nullptr) break;
             // Do something with array
           }
         });
   ```
   
   

