Anna,

Not sure it will help, but below is the install_arrow.sh script I am using to build arrow + pyarrow in our containers, which are also based off of Ubuntu 18.04.

Matt

#!/bin/bash
# Taken from: https://arrow.apache.org/docs/developers/python.html#python-development
# minor edits
mkdir /repos
cd /repos
git clone https://github.com/apache/arrow.git
cd arrow
apt-get install -y libjemalloc-dev libboost-dev \
    libboost-filesystem-dev \
    libboost-system-dev \
    libboost-regex-dev \
    python3-dev \
    autoconf \
    flex \
    bison
pip3 install six numpy pandas cython pytest hypothesis
mkdir dist
export ARROW_HOME=/usr/local
export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
mkdir /repos/arrow/cpp/build
cd /repos/arrow/cpp/build
rm /usr/bin/python
ln -s /usr/bin/python3 /usr/bin/python
cmake -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
    -DCMAKE_INSTALL_LIBDIR=lib \
    -DARROW_FLIGHT=ON \
    -DARROW_GANDIVA=OFF \
    -DARROW_ORC=ON \
    -DARROW_PARQUET=ON \
    -DARROW_PYTHON=ON \
    -DARROW_PLASMA=ON \
    -DARROW_BUILD_TESTS=ON \
    -DPYTHON_DEFAULT_EXECUTABLE=$(which python3) \
    -DPYTHON_INCLUDE_PATH=/usr/include/python3.6m \
    -DPYTHON_LIBRARY=/usr/lib/x86_64-linux-gnu/libpython3.6m.so \
    -DPYTHON_INCLUDE_DIR=/usr/include/python3.6m \
    ..
make -j4
make install
# This installs to /repos/arrow/dist
cd /repos/arrow/python
export PYARROW_WITH_FLIGHT=1
export PYARROW_WITH_GANDIVA=0
export PYARROW_WITH_ORC=1
export PYARROW_WITH_PARQUET=1
python setup.py build_ext
python setup.py install

From: Anna Waldron <[email protected]>
Sent: Thursday, January 23, 2020 8:39 PM
To: [email protected]
Subject: Pyarrow build/install from source in ubuntu not working

Hi,

I am trying to build and install pyarrow from source in an ubuntu 18.04 docker image and getting the following error when attempting to import the module:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python3.6/dist-packages/pyarrow-0.14.0-py3.6-linux-x86_64.egg/pyarrow/__init__.py", line 49, in <module>
    from pyarrow.lib import cpu_count, set_cpu_count
ImportError: libarrow.so.14: cannot open shared object file: No such file or directory

Here is the Dockerfile I am using:

FROM ubuntu:18.04
RUN apt-get update
RUN apt-get install -y git
RUN mkdir /arrow
RUN git clone https://github.com/apache/arrow.git /arrow
WORKDIR /arrow/arrow
RUN git checkout apache-arrow-0.14.0
WORKDIR /
COPY install_arrow.sh /install_arrow.sh
RUN bash install_arrow.sh
RUN python3 -c 'import pyarrow'

and the install_arrow.sh script copied into the image:

export ARROW_BUILD_TYPE=release
export ARROW_HOME=/usr/local \
       PARQUET_HOME=/usr/local
export PYTHON_EXECUTABLE=/usr/bin/python3

# install requirements
export DEBIAN_FRONTEND="noninteractive"
apt-get update
apt-get install -y --no-install-recommends apt-utils
apt-get install -y git python3-minimal python3-pip autoconf libtool
apt-get install -y cmake \
    python3-dev \
    libjemalloc-dev libboost-dev \
    build-essential \
    libboost-filesystem-dev \
    libboost-regex-dev \
    libboost-system-dev \
    flex \
    bison
pip3 install --no-cache-dir six pytest numpy cython
mkdir -p /arrow/cpp/build \
  && cd /arrow/cpp/build \
  && cmake -DCMAKE_BUILD_TYPE=$ARROW_BUILD_TYPE \
       -DOPENSSL_ROOT_DIR=/usr/local/ssl \
       -DCMAKE_INSTALL_LIBDIR=lib \
       -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
       -DARROW_PARQUET=ON \
       -DARROW_PYTHON=ON \
       -DARROW_PLASMA=ON \
       -DARROW_BUILD_TESTS=OFF \
       -DPYTHON_EXECUTABLE=$PYTHON_EXECUTABLE \
       .. \
  && make -j$(nproc) \
  && make install \
  && cd /arrow/python \
  && python3 setup.py build_ext --build-type=$ARROW_BUILD_TYPE --with-parquet \
  && python3 setup.py install
LD_LIBRARY_PATH=/usr/local/lib

I'm using Docker 19.03.5 on Ubuntu 18.04.3 LTS to build the image.

Thanks in advance for any help.

Anna
