vc77 opened a new issue, #47574: URL: https://github.com/apache/arrow/issues/47574
### Describe the bug, including details regarding any error messages, version, and platform.

Environment:

- OS: Arch Linux
- Compiler: GCC 15.2.1
- CMake: 3.9.18 (or newer)
- Python: 3.9.18 (managed by pyenv)
- Arrow version: 22.0.0 (or git main)

Summary:

When attempting to build the Arrow C++ library with Python bindings from the arrow/cpp directory, the build process can silently fail to produce the libarrow_python.so library without a fatal error. This happens due to a combination of deprecated flags, confusing variable names, and silent dependency check failures, particularly when using pyenv. The result is a cascade of confusing downstream linker errors when another project tries to link against the incomplete Arrow installation.

Steps to Reproduce:

1. Set up a pyenv environment with a specific Python version (e.g., 3.9.18).
2. Clone the Arrow repository: `git clone https://github.com/apache/arrow.git`
3. Attempt to build the C++ library with Python support using the documented, but deprecated, flag:

        cd arrow/cpp && mkdir build && cd build
        cmake .. \
          -DCMAKE_INSTALL_PREFIX=~/arrow-install \
          -DPython3_EXECUTABLE=$(pyenv which python) \
          -DARROW_PYTHON=ON \
          -DARROW_BUILD_SHARED=ON
        make install

Observed Behavior:

The cmake command completes without a fatal error. The make install command also completes. However, the libarrow_python.so library is missing from the installation directory (~/arrow-install/lib). Verbose logs show that the PyArrow build was silently disabled because a Python3_NumPy check failed, even though NumPy itself was found.

Expected Behavior:

The cmake command should either:

- a) Succeed and produce a complete build, including libarrow_python.so.
- b) Fail with a clear, fatal error stating that the NumPy C development headers are a required dependency for the Python bindings.
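One way to surface the silent failure described above is to check for the NumPy C headers before configuring and for the installed library afterwards. The following is a minimal sketch, not part of Arrow's build system: the function names `check_numpy_headers` and `check_arrow_python_installed` are hypothetical, and it assumes that `numpy.get_include()` points at the headers the build would use and that the library lands in `lib` or `lib64` under the install prefix.

```bash
#!/usr/bin/env bash
# Hypothetical helpers; names and paths are illustrative, not part of Arrow.

# Print NumPy's C header directory, or fail if the headers are missing.
check_numpy_headers() {
    local python_bin="$1" include_dir
    include_dir="$("$python_bin" -c 'import numpy; print(numpy.get_include())')" || {
        echo "error: NumPy is not importable by $python_bin" >&2
        return 1
    }
    # arrayobject.h is one of the headers shipped in NumPy's include directory.
    if [ ! -f "$include_dir/numpy/arrayobject.h" ]; then
        echo "error: NumPy C headers not found under $include_dir" >&2
        return 1
    fi
    echo "$include_dir"
}

# Fail if no libarrow_python* file exists in <prefix>/lib or <prefix>/lib64.
check_arrow_python_installed() {
    local prefix="$1" dir candidate
    for dir in "$prefix/lib" "$prefix/lib64"; do
        for candidate in "$dir"/libarrow_python*; do
            # An unmatched glob stays literal, so test that the path exists.
            if [ -e "$candidate" ]; then
                return 0
            fi
        done
    done
    echo "error: libarrow_python not found under $prefix" >&2
    return 1
}
```

For example, `check_numpy_headers "$(pyenv which python)"` could run before `cmake`, and `check_arrow_python_installed "$HOME/arrow-install"` after `make install`; either returns a non-zero status instead of letting the problem surface later as linker errors.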
Analysis of Root Causes:

After extensive debugging, we identified three core usability issues in the build system:

1. Silent NumPy Dependency Failure: The build system correctly identifies that ARROW_PYTHON requires NumPy, but if the C headers (numpy/core/include) are not found, it prints a note and silently disables the entire component. It should instead treat this as a fatal configuration error.
2. Confusing Deprecated Flags: The build system warns that -DARROW_PYTHON=ON is deprecated, yet it is the only flag that seems to trigger the Python build from the cpp/ directory. The modern-sounding -DARROW_PYARROW=ON is not recognized in this context, causing further confusion.
3. Inconsistent Variable Naming: The flag to build shared libraries is -DARROW_BUILD_SHARED=ON. A common and intuitive alternative, -DARROW_BUILD_SHARED_LIBS=ON, is silently ignored, leading to builds that unexpectedly produce only static libraries. A fatal error for unrecognized variables would be more helpful.

### Component(s)

C++
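As a stopgap for the ignored-variable problem in the analysis, the configure output can be scanned for CMake's own warning about `-D` variables that the project never read. This is a rough sketch under assumptions: the function name `check_unused_cmake_vars` and the log file name are hypothetical, the configure output is assumed to have been saved to a file, and it relies on CMake printing its "Manually-specified variables were not used by the project" warning for the variable in question.

```bash
#!/usr/bin/env bash
# Hypothetical helper: fail when CMake reported -D variables the project never read.
# Assumes the configure output (stdout and stderr) was captured in a log file.
check_unused_cmake_vars() {
    local log_file="$1"
    local marker="Manually-specified variables were not used by the project"
    if grep -q "$marker" "$log_file"; then
        echo "error: some -D variables were ignored by CMake:" >&2
        # The ignored variable names follow the warning as indented lines.
        grep -A 10 "$marker" "$log_file" \
            | grep -E '^[[:space:]]+[A-Za-z_][A-Za-z0-9_]*$' >&2 || true
        return 1
    fi
    return 0
}
```

With the configure output saved to, say, `configure.log`, calling `check_unused_cmake_vars configure.log` before `make install` turns a misspelled flag such as -DARROW_BUILD_SHARED_LIBS=ON into a hard stop, provided CMake emitted the warning.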