vc77 opened a new issue, #47574:
URL: https://github.com/apache/arrow/issues/47574

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   Environment:
   
       OS: Arch Linux
   
       Compiler: GCC 15.2.1
   
       CMake: 3.9.18 (or newer)
   
       Python: Python 3.9.18 (managed by pyenv)
   
       Arrow Version: 22.0.0 (or git main)
   
   Summary:
   
   When building the Arrow C++ library with Python bindings from the arrow/cpp 
directory, the build can complete without producing libarrow_python.so and 
without reporting a fatal error. The cause is a combination of deprecated 
flags, confusing variable names, and silent dependency check failures, 
particularly when using pyenv. The result is a cascade of confusing linker 
errors when a downstream project tries to link against the incomplete Arrow 
installation.
   
   Steps to Reproduce:
   
       Set up a pyenv environment with a specific Python version (e.g., 3.9.18).
   
       Clone the Arrow repository: git clone https://github.com/apache/arrow.git
   
       Attempt to build the C++ library with Python support using the 
documented, but deprecated, flag:
   
       cd arrow/cpp && mkdir build && cd build
       cmake .. \
         -DCMAKE_INSTALL_PREFIX=~/arrow-install \
         -DPython3_EXECUTABLE=$(pyenv which python) \
         -DARROW_PYTHON=ON \
         -DARROW_BUILD_SHARED=ON
       make install
   
   Observed Behavior:
   
   The cmake command completes without a fatal error. The make install command 
also completes. However, the libarrow_python.so library is missing from the 
installation directory (~/arrow-install/lib). Verbose logs show that the 
PyArrow build was silently disabled because a Python3_NumPy check failed, even 
though NumPy itself was found.
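
Because the install step exits successfully either way, a quick post-install check makes the missing library visible before a downstream project hits linker errors. The following is a minimal sketch, not part of the Arrow build system: the function name `has_arrow_python_lib` is hypothetical, and it assumes the library would land in `lib` or `lib64` under the install prefix.

```shell
# Hypothetical helper (not provided by Arrow): succeed only if a
# libarrow_python shared library exists under <prefix>/lib or <prefix>/lib64.
has_arrow_python_lib() {
  prefix="$1"
  for dir in "$prefix/lib" "$prefix/lib64"; do
    # An unmatched glob stays literal, so the -e test simply fails for it.
    for f in "$dir"/libarrow_python.so* "$dir"/libarrow_python*.dylib; do
      [ -e "$f" ] && return 0
    done
  done
  return 1
}

# Example usage with the install prefix from the steps above:
#   has_arrow_python_lib "$HOME/arrow-install" \
#     || echo "libarrow_python is missing from the install" >&2
```

Running this right after `make install` turns the silent omission into an explicit message.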
   
   Expected Behavior:
   
   The cmake command should either:
   a) Succeed and produce a complete build, including libarrow_python.so.
   b) Fail with a clear, fatal error stating that the NumPy C development 
headers are a required dependency for the Python bindings.
   
   Analysis of Root Causes:
   
   After extensive debugging, we identified three core usability issues in the 
build system:
   
       Silent NumPy Dependency Failure: The build system correctly identifies 
that ARROW_PYTHON requires NumPy, but if the C headers (numpy/core/include) are 
not found, it prints a note and silently disables the entire component. It 
should instead treat this as a fatal configuration error.
   
       Confusing Deprecated Flags: The build system warns that 
-DARROW_PYTHON=ON is deprecated, yet it is the only flag that appears to 
trigger the Python build from the cpp/ directory. The modern-sounding 
-DARROW_PYARROW=ON is not recognized in this context, causing further confusion.
   
       Inconsistent Variable Naming: The flag to build shared libraries is 
-DARROW_BUILD_SHARED=ON. A common and intuitive alternative, 
-DARROW_BUILD_SHARED_LIBS=ON, is silently ignored, leading to builds that 
unexpectedly produce only static libraries. A fatal error for unrecognized 
variables would be more helpful.
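
Until the NumPy check becomes a hard error, the header dependency can be verified by hand before running cmake. This is a rough sketch under stated assumptions: the helper names `numpy_include_dir` and `has_numpy_headers` are hypothetical, and the check relies on `numpy.get_include()` returning the directory that contains `numpy/arrayobject.h`.

```shell
# Hypothetical helper: ask the given interpreter where its NumPy C headers live.
numpy_include_dir() {
  "$1" -c 'import numpy; print(numpy.get_include())'
}

# Hypothetical helper: succeed only if the NumPy C headers exist under
# the given include directory.
has_numpy_headers() {
  [ -f "$1/numpy/arrayobject.h" ]
}

# Example usage with the pyenv interpreter passed as Python3_EXECUTABLE:
#   py="$(pyenv which python)"
#   has_numpy_headers "$(numpy_include_dir "$py")" \
#     || echo "NumPy C headers not found for $py" >&2
```

If this check fails for the interpreter given to cmake, the Python component is likely to be disabled in the way described above.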
   
   ### Component(s)
   
   C++

