The approach on the page that Aldrin linked works, but it requires that you
build with the same toolchain and Arrow version as pyarrow.  I would advise
trying the C data interface first.  With the C data interface you don't
have to couple yourself so tightly to the pyarrow build.  For example, your
C++ extension can pin itself to Arrow version 5 and people using pyarrow 11
will still be able to use your extension without problems.

Since this question comes up fairly often, I created a minimal example of
what this might look like.  The example builds a C++ Python module using
pybind11.  The C++ code relies on Arrow-C++ and interoperates with pyarrow.
You don't need to use Arrow-C++, though: you could use nanoarrow instead,
or copy the C data interface header directly into your project.  The
example can be found at [1].

[1]: https://github.com/westonpace/arrow-cdata-example
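
To give a rough idea of the "copy the headers" route, here is a minimal
sketch that is not taken from the example repository.  The two struct
definitions follow the Arrow C data interface specification; everything
else (AddOneInt64, ReleaseInt64Array) is a hypothetical name I made up for
illustration.  It assumes a non-nullable int64 input and rejects anything
else to keep the sketch short.

```cpp
#include <cstdint>
#include <cstdlib>
#include <cstring>

extern "C" {
// ABI structs as laid out in the Arrow C data interface specification.
struct ArrowSchema {
  const char* format;
  const char* name;
  const char* metadata;
  int64_t flags;
  int64_t n_children;
  struct ArrowSchema** children;
  struct ArrowSchema* dictionary;
  void (*release)(struct ArrowSchema*);
  void* private_data;
};

struct ArrowArray {
  int64_t length;
  int64_t null_count;
  int64_t offset;
  int64_t n_buffers;
  int64_t n_children;
  const void** buffers;
  struct ArrowArray** children;
  struct ArrowArray* dictionary;
  void (*release)(struct ArrowArray*);
  void* private_data;
};

// Release callback for arrays produced by AddOneInt64 (hypothetical helper).
// The consumer calls it once it is done with the array.
void ReleaseInt64Array(struct ArrowArray* array) {
  std::free(const_cast<void*>(array->buffers[1]));  // values buffer
  std::free(array->buffers);                        // buffer pointer table
  array->release = nullptr;  // marks the struct as released
}
}  // extern "C"

// Hypothetical "homebrew" function: writes in[i] + 1 into a new int64 array.
// Returns 0 on success and non-zero if the input is not supported.
int AddOneInt64(const ArrowSchema* schema, const ArrowArray* in,
                ArrowArray* out) {
  // "l" is the format string for int64 in the C data interface.
  if (std::strcmp(schema->format, "l") != 0) return 1;
  // Assumption for brevity: reject arrays that contain (or may contain) nulls.
  if (in->null_count != 0) return 2;

  const int64_t n = in->length;
  // buffers[0] is the validity bitmap, buffers[1] holds the values.
  const int64_t* src = static_cast<const int64_t*>(in->buffers[1]) + in->offset;

  auto* values = static_cast<int64_t*>(
      std::malloc(sizeof(int64_t) * static_cast<size_t>(n > 0 ? n : 1)));
  auto** buffers = static_cast<const void**>(std::malloc(2 * sizeof(void*)));
  if (values == nullptr || buffers == nullptr) {
    std::free(values);
    std::free(buffers);
    return 3;
  }
  for (int64_t i = 0; i < n; ++i) values[i] = src[i] + 1;

  buffers[0] = nullptr;  // no validity bitmap: every value is valid
  buffers[1] = values;

  out->length = n;
  out->null_count = 0;
  out->offset = 0;
  out->n_buffers = 2;
  out->n_children = 0;
  out->buffers = buffers;
  out->children = nullptr;
  out->dictionary = nullptr;
  out->release = ReleaseInt64Array;
  out->private_data = nullptr;
  return 0;
}
```

On the Python side, pyarrow can fill in and consume these structs (for
example through Array._export_to_c and Array._import_from_c), so a binding
layer such as pybind11 only needs to pass the struct addresses across.
Because only this struct layout is shared, the extension does not need to
link against the same Arrow build as pyarrow.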

On Tue, May 16, 2023 at 9:07 AM Aldrin <[email protected]> wrote:

> You can definitely use C++! I will see if I can find an example, but in
> the meantime there's also this page in the docs [1].
>
> [1]: https://arrow.apache.org/docs/python/integration/extending.html
>
>
> On Tue, May 16, 2023 at 06:32, Hinko Kocevar <[email protected]> wrote:
>
> Hi,
>
> I'm trying to understand if it is possible to have a C/C++ code (homebrew
> code) integrated into arrow such that a user of pyArrow would be able to
> utilize the homebrew functions (from python script).
>
> The idea is to pass an arrow array/table (or numpy array?) to the external
> code, let it work on the input(s) to produce an arrow output array and
> return it to the user. Again, the choice of programming language for user
> is Python. I've noticed c data interface and c stream interface as well as
> user compute functions in the docs. It is not clear to me if any of those
> support my use case and further more how do I get to utilize that in Python
> once implemented in C++.
>
> For example, something like https://numpy.org/doc/stable/user/c-info.html
> is what I would be after.
>
> Can this be done in (py)arrow, or should I just do it in numpy ?
>
> Thank you,
> Hinko
>
>
