paleolimbot commented on issue #377:
URL:
https://github.com/apache/arrow-nanoarrow/issues/377#issuecomment-1919776518
To the immediate question: I believe the code you have above makes
`nanoarrow::UniqueArray array` and `nanoarrow::UniqueSchema schema` the "unique"
owners: as long as those C++ objects are alive, you can safely access the
fields of the pointed-to `ArrowArray` and `ArrowSchema`. This is true even if
the Python objects (the Table and/or the Capsule) are deleted: the constructor
you invoked for the `UniqueArray` and `UniqueSchema` moves ownership from
the capsules to the `UniqueXXX`. The release callback will be called exactly
once for each of the array and schema, when the `UniqueXXX` is destroyed.
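To make the "exactly once" point concrete, here is a minimal self-contained
sketch of the move-then-release pattern. It does not use nanoarrow: `ToyArray`
and `ToyUnique` are hypothetical stand-ins for `ArrowArray` and
`nanoarrow::UniqueArray`, and the only assumption carried over is that a null
`release` pointer marks a struct as already released or moved.

```cpp
// Hypothetical stand-in for ArrowArray: only the release callback and a
// counter used to observe how often it runs are modelled here.
struct ToyArray {
  int* release_count;          // where the callback records that it ran
  void (*release)(ToyArray*);  // nullptr means "already released or moved"
};

// Release callback: record the call, then mark the struct as released.
inline void ToyRelease(ToyArray* array) {
  ++*array->release_count;
  array->release = nullptr;
}

// Hypothetical stand-in for nanoarrow::UniqueArray.
class ToyUnique {
 public:
  ToyUnique() : data_{nullptr, nullptr} {}
  // Constructing from a pointer moves ownership: the source keeps its memory
  // but is left with a null release callback.
  explicit ToyUnique(ToyArray* src) : data_(*src) { src->release = nullptr; }
  ToyUnique(const ToyUnique&) = delete;
  ToyUnique& operator=(const ToyUnique&) = delete;
  ~ToyUnique() {
    if (data_.release != nullptr) data_.release(&data_);
  }
  ToyArray* get() { return &data_; }

 private:
  ToyArray data_;
};

// Returns how many times the release callback ran once both the owner and the
// simulated capsule are gone (negative values flag a broken invariant).
inline int CountReleasesAfterMove() {
  int count = 0;
  ToyArray from_capsule{&count, &ToyRelease};
  {
    ToyUnique owner(&from_capsule);
    if (from_capsule.release != nullptr) return -1;  // source must be emptied
    if (owner.get()->release == nullptr) return -2;  // owner must be valid
  }  // owner destroyed here, so the callback runs
  // Simulated capsule destructor: release only if something is still owned.
  if (from_capsule.release != nullptr) from_capsule.release(&from_capsule);
  return count;
}
```

Because the source struct is emptied at the moment of the move, the simulated
capsule destructor finds nothing left to release, which is why deleting the
Python objects afterwards is harmless.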
To the slightly larger question of how you get a `Table` into another Python
package: I would recommend invoking `__arrow_c_stream__()`. I recommend
this because then all of the looping happens in C++: if you have a table that
for some reason has thousands of chunks, you won't pay any performance cost for
a tight Python loop. (We did a tiny bit of work when writing the
pyarrow/arrow-r bridges to verify that this is the case.) In nanoarrow 0.4.0
(about to be released), I added some helpers to do that looping.
Your Python might look like:
```python
parse_pyarrow_table(t.__arrow_c_stream__())
```
And your C++ might look like:
```cpp
m.def("parse_pyarrow_table",
      [](const pybind11::capsule& array_stream_capsule) {
        // Move ownership of the stream out of the capsule
        nanoarrow::UniqueArrayStream stream(
            static_cast<ArrowArrayStream*>(array_stream_capsule.get_pointer()));
        nanoarrow::UniqueSchema schema;
        nanoarrow::UniqueArray array;

        NANOARROW_THROW_NOT_OK(
            ArrowArrayStreamGetSchema(stream.get(), schema.get(), nullptr));

        while (true) {
          // Release the previous chunk before asking for the next one
          array.reset();
          NANOARROW_THROW_NOT_OK(
              ArrowArrayStreamGetNext(stream.get(), array.get(), nullptr));
          // A released array signals the end of the stream
          if (array->release == nullptr) break;

          // Do something with array
        }
      });
```