[ 
https://issues.apache.org/jira/browse/ARROW-16340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17613028#comment-17613028
 ] 

Yue Ni edited comment on ARROW-16340 at 10/5/22 2:52 PM:
---------------------------------------------------------

[~alenka]  [~jorisvandenbossche] Thanks for the help.

> it will still be installed by the python package, only not in the standard 
> "include" directory

Where is the header expected to be installed? Does pyarrow's Python wheel 
install it somewhere, so that I have to add that path to my compiler's include 
path?

In my C++ project, I use vcpkg to manage dependencies. One of the modules is a 
Python binding for the C++ library, where I use pyarrow's C++ API, such as 
`arrow::py::wrap_table`, together with pybind11 to create the binding. Since I 
use vcpkg to manage dependencies, I expect all C++ dependencies to be available 
via vcpkg.

1) Previously, pyarrow's C++ code was part of the vcpkg arrow port, and I could 
use CMake's `find_library(arrow_python)` to find the library

2) and use `find_path(arrow/python/pyarrow.h)` to find the path to the include 
directory

What I find with the latest arrow version:
1) `libarrow_python.a` is not built, even if I set the ARROW_PYTHON CMake 
option to `ON`. 

2) `arrow/python/pyarrow.h` cannot be found in the `include` directory after 
building the C++ library (at least when using the vcpkg arrow port)

I went through most of the comments in the PR for this issue [1] and read the 
ARROW_PYTHON option issue [2] as well, and the current behavior seems to be the 
expected behavior. The `python` directory will NOT be built even if 
ARROW_PYTHON=ON.

I am not sure what the recommended approach is for using pyarrow from C++. 
According to the documentation [3], to automate the build, the steps seem to 
be:

1) install the pyarrow package

2) launch Python and run `pyarrow.get_include()` to get the `include` directory

3) add the `include` directory to the compiler's include search path (probably 
via CMake)

4) where is `libarrow_python` expected to be located, so that CMake can find it 
and link against it?

5) build

Is this the recommended approach? I am not quite sure about step #4; any 
comments on this?
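
A minimal sketch of steps 1 to 4, assuming pyarrow is installed in the Python 
environment used at configure time. `pyarrow.get_include()`, 
`pyarrow.get_library_dirs()` and `pyarrow.get_libraries()` are the pyarrow 
helpers described in [3]; the `cmake_args_from` helper and the `PYARROW_*` 
CMake variable names are hypothetical and only serve as illustration. Whether 
this is the recommended approach is exactly the open question above.

```python
"""Sketch: derive CMake cache arguments from an installed pyarrow."""


def cmake_args_from(pa):
    """Build -D arguments for CMake from a pyarrow-like module.

    PYARROW_INCLUDE_DIR, PYARROW_LIBRARY_DIRS and PYARROW_LIBRARIES are
    hypothetical variable names; the project's CMakeLists.txt would have
    to read them explicitly.
    """
    include_dir = pa.get_include()              # step 2: header location
    library_dirs = list(pa.get_library_dirs())  # step 4: where the libs live
    libraries = list(pa.get_libraries())        # step 4: library names to link
    return [
        "-DPYARROW_INCLUDE_DIR=" + include_dir,
        # CMake lists are semicolon-separated.
        "-DPYARROW_LIBRARY_DIRS=" + ";".join(library_dirs),
        "-DPYARROW_LIBRARIES=" + ";".join(libraries),
    ]


if __name__ == "__main__":
    try:
        import pyarrow  # step 1: requires `pip install pyarrow` beforehand
    except ImportError:
        print("pyarrow is not installed")
    else:
        print(" ".join(cmake_args_from(pyarrow)))
```

The printed arguments could be passed to `cmake` at configure time, with the 
project's CMakeLists.txt feeding the hypothetical `PYARROW_*` variables into 
`target_include_directories` and `target_link_directories`.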

I can think of another approach: creating a separate vcpkg port, such as 
`arrow_python`, that builds the `arrow_python` library explicitly, so that 
projects can depend on this port for this purpose. Is this a recommended 
approach following this change? Thanks.

 

[1] [https://github.com/apache/arrow/pull/13311]

[2] [https://issues.apache.org/jira/browse/ARROW-17868]

[3] Using pyarrow from C++ and Cython Code, 
[https://arrow.apache.org/docs/dev/python/integration/extending.html#c-api]

 



> [C++][Python] Move all Python related code into PyArrow
> -------------------------------------------------------
>
>                 Key: ARROW-16340
>                 URL: https://issues.apache.org/jira/browse/ARROW-16340
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++, Python
>            Reporter: Alenka Frim
>            Assignee: Alenka Frim
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 10.0.0
>
>          Time Spent: 33h 10m
>  Remaining Estimate: 0h
>
> Move {{src/arrow/python}} directory into {{pyarrow}} and arrange PyArrow to 
> build it.
> More details can be found on this thread:
> https://lists.apache.org/thread/jbxyldhqff4p9z53whhs95y4jcomdgd2



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
