[
https://issues.apache.org/jira/browse/ARROW-4930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16936383#comment-16936383
]
Suvayu Ali commented on ARROW-4930:
-----------------------------------
Hi [~apitrou], I have had limited success so far.
[I was working off of master, {{git describe}} says:
{{apache-arrow-0.14.0-584-g176adf5a0}}]
This is what I found:
1. {{setup.py}} assumes the library directory is {{$ARROW_HOME/lib}} when setting
{{PKG_CONFIG_PATH}} in the environment (line 253). I believe this is a bit of a
hack, which the author also mentions in ARROW-1090, the issue that tracked that
change. The resolution should be somewhere in the cmake scripts.
2. I successfully detected {{libarrow}} with the attached patch
[^FindArrow.cmake.patch].
3. However I then failed to detect {{libparquet}}. On further investigation I
found (AFAIU) that even though {{FindParquet.cmake}} sets {{ARROW_HOME}}, it is
not used. However, it does use {{PARQUET_HOME}}. Since my CMake-fu is a bit
weak, I worked up a similar patch [^FindParquet.cmake.patch] as before and set
{{export PARQUET_HOME=$ARROW_HOME}} in the terminal. This allowed the
compilation to succeed.
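As a rough illustration of item 1, the sketch below builds a {{PKG_CONFIG_PATH}} value from whichever library directories actually exist under {{ARROW_HOME}}, rather than assuming {{lib}}. The helper name {{pkgconfig_path_for}} and the candidate list ({{lib}}, {{lib64}}) are assumptions made for illustration; this is not what {{setup.py}} currently does.

```python
import os


def pkgconfig_path_for(arrow_home, existing="", libdirs=("lib", "lib64")):
    """Build a PKG_CONFIG_PATH value for an Arrow install prefix.

    Hypothetical helper: the candidate directory names in `libdirs`
    are an assumption, not taken from setup.py.
    """
    found = []
    for libdir in libdirs:
        candidate = os.path.join(arrow_home, libdir, "pkgconfig")
        # Keep only directories that are really present on disk.
        if os.path.isdir(candidate):
            found.append(candidate)
    # Preserve whatever the caller already had in PKG_CONFIG_PATH.
    if existing:
        found.append(existing)
    return os.pathsep.join(found)
```

It could be called as {{pkgconfig_path_for(os.environ["ARROW_HOME"], os.environ.get("PKG_CONFIG_PATH", ""))}} before invoking cmake.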
The compilation commands I used for C++ and Python are:
{code:java}
$ cmake -G Ninja -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
-DARROW_FLIGHT=ON -DARROW_GANDIVA=ON -DARROW_ORC=ON \
-DARROW_PARQUET=ON -DPYTHON_EXECUTABLE=/usr/bin/python3.7m \
-DARROW_PYTHON=ON -DARROW_PLASMA=ON \
-DARROW_BUILD_TESTS=ON -DLLVM_DIR=/usr/lib64/llvm7.0 ..
$ python3 setup.py build_ext --cmake-generator Ninja --inplace
{code}
I then tried to run the python tests with {{pytest-3 pyarrow}}. The summary was:
{quote}5 failed, 1411 passed, 59 skipped, 4 xfailed, 29 warnings in 28.30
seconds
{quote}
The failures all appear to be setup-related: not being able to import
modules, not being able to start plasma, etc.
I'll investigate this further, but my take is the cmake scripts don't actually
have _one way_ of detecting the libraries, making it very difficult to
configure it properly from setup.py.
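One candidate for a single source of truth is pkg-config itself. The sketch below asks it for a package's library directory, assuming the installed {{.pc}} file defines a {{libdir}} variable and that {{PKG_CONFIG_PATH}} already lets pkg-config find it. The function name {{pkgconfig_libdir}} is hypothetical.

```python
import shutil
import subprocess


def pkgconfig_libdir(package):
    """Return the libdir pkg-config reports for `package`, or None.

    Hypothetical helper; assumes the package's .pc file defines `libdir`.
    """
    # Bail out early if pkg-config is not installed at all.
    if shutil.which("pkg-config") is None:
        return None
    result = subprocess.run(
        ["pkg-config", "--variable=libdir", package],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        universal_newlines=True,
    )
    libdir = result.stdout.strip()
    # A non-zero exit status means the package was not found.
    if result.returncode != 0 or not libdir:
        return None
    return libdir
```

For example, {{pkgconfig_libdir("arrow")}} would return the directory to search for {{libarrow}}, or {{None}} when the package cannot be located.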
> [Python] Remove LIBDIR assumptions in Python build
> --------------------------------------------------
>
> Key: ARROW-4930
> URL: https://issues.apache.org/jira/browse/ARROW-4930
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Python
> Affects Versions: 0.12.1
> Reporter: Suvayu Ali
> Priority: Minor
> Labels: setup.py
> Fix For: 2.0.0
>
> Attachments: FindArrow.cmake.patch, FindParquet.cmake.patch
>
>
> This is in reference to (4) in
> [this|http://mail-archives.apache.org/mod_mbox/arrow-dev/201903.mbox/%3C0AF328A1-ED2A-457F-B72D-3B49C8614850%40xhochy.com%3E]
> mailing list discussion.
> Certain sections of setup.py assume a specific location of the C++ libraries.
> Removing this hard assumption will simplify PyArrow builds significantly. As
> far as I could tell these assumptions are made in the
> {{build_ext._run_cmake()}} method (wherever bundling of C++ libraries is
> handled).
> # The first occurrence is before invoking cmake (see line 237).
> # The second occurrence is when the C++ libraries are moved from their build
> directory to the Python tree (see line 347). The actual implementation is in
> the function {{_move_shared_libs_unix(..)}} (see line 468).
> Hope this helps.