Anna,

Not sure it will help, but below is the install_arrow.sh script I use to
build Arrow and pyarrow in our containers, which are also based on Ubuntu
18.04.

Matt


#!/bin/bash

# Taken from:
# https://arrow.apache.org/docs/developers/python.html#python-development
# minor edits

mkdir /repos
cd /repos

git clone https://github.com/apache/arrow.git
cd arrow

apt-get install -y libjemalloc-dev libboost-dev \
                   libboost-filesystem-dev \
                   libboost-system-dev \
                   libboost-regex-dev \
                   python3-dev \
                   autoconf \
                   flex \
                   bison

pip3 install six numpy pandas cython pytest hypothesis

mkdir dist

export ARROW_HOME=/usr/local
export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH

mkdir /repos/arrow/cpp/build
cd /repos/arrow/cpp/build

# Point `python` at python3 (-f replaces any existing file or link)
ln -sf /usr/bin/python3 /usr/bin/python

cmake -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
      -DCMAKE_INSTALL_LIBDIR=lib \
      -DARROW_FLIGHT=ON \
      -DARROW_GANDIVA=OFF \
      -DARROW_ORC=ON \
      -DARROW_PARQUET=ON \
      -DARROW_PYTHON=ON \
      -DARROW_PLASMA=ON \
      -DARROW_BUILD_TESTS=ON \
      -DPYTHON_DEFAULT_EXECUTABLE=$(which python3) \
      -DPYTHON_INCLUDE_PATH=/usr/include/python3.6m \
      -DPYTHON_LIBRARY=/usr/lib/x86_64-linux-gnu/libpython3.6m.so \
      -DPYTHON_INCLUDE_DIR=/usr/include/python3.6m \
      ..
make -j4
make install # This installs under $ARROW_HOME (/usr/local)

cd /repos/arrow/python
export PYARROW_WITH_FLIGHT=1
export PYARROW_WITH_GANDIVA=0
export PYARROW_WITH_ORC=1
export PYARROW_WITH_PARQUET=1
python setup.py build_ext
python setup.py install



From: Anna Waldron <[email protected]>
Sent: Thursday, January 23, 2020 8:39 PM
To: [email protected]
Subject: Pyarrow build/install from source in ubuntu not working

Hi,

I am trying to build and install pyarrow from source in an Ubuntu 18.04 Docker
image and I get the following error when attempting to import the module:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python3.6/dist-packages/pyarrow-0.14.0-py3.6-linux-x86_64.egg/pyarrow/__init__.py", line 49, in <module>
    from pyarrow.lib import cpu_count, set_cpu_count
ImportError: libarrow.so.14: cannot open shared object file: No such file or directory
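
For anyone hitting the same error: an ImportError of this form means the dynamic loader could not locate libarrow.so.14 when pyarrow was imported. Below is a minimal sketch for checking whether the install directory is on LD_LIBRARY_PATH and whether the library files are actually there. It assumes the C++ libraries were installed under /usr/local/lib (as both scripts in this thread configure); the helper name dir_on_path is hypothetical and not part of any Arrow tooling.

```shell
#!/bin/sh
# Hypothetical helper: succeed if directory $1 appears in the
# colon-separated list $2 (for example, the value of LD_LIBRARY_PATH).
dir_on_path() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Assumption: libarrow was installed under /usr/local/lib.
libdir=/usr/local/lib

if dir_on_path "$libdir" "${LD_LIBRARY_PATH:-}"; then
    echo "$libdir is on LD_LIBRARY_PATH"
else
    echo "$libdir is not on LD_LIBRARY_PATH"
fi

# List any libarrow shared objects actually present in that directory.
ls "$libdir"/libarrow.so* 2>/dev/null || echo "no libarrow.so* found in $libdir"
```

If the files are present but the directory is not on the search path, the build itself probably succeeded and only the loader configuration is missing.
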

Here is the Dockerfile I am using:

FROM ubuntu:18.04
RUN apt-get update
RUN apt-get install -y git
RUN mkdir /arrow
RUN git clone https://github.com/apache/arrow.git /arrow
WORKDIR /arrow/arrow
RUN git checkout apache-arrow-0.14.0
WORKDIR /
COPY install_arrow.sh /install_arrow.sh
RUN bash install_arrow.sh

RUN python3 -c 'import pyarrow'

and the install_arrow.sh script copied into the image:

export ARROW_BUILD_TYPE=release
export ARROW_HOME=/usr/local \
       PARQUET_HOME=/usr/local
export PYTHON_EXECUTABLE=/usr/bin/python3

# install requirements
export DEBIAN_FRONTEND="noninteractive"
apt-get update
apt-get install -y --no-install-recommends apt-utils
apt-get install -y git python3-minimal python3-pip autoconf libtool
apt-get install -y cmake \
                   python3-dev \
                   libjemalloc-dev libboost-dev \
                   build-essential \
                   libboost-filesystem-dev \
                   libboost-regex-dev \
                   libboost-system-dev \
                   flex \
                   bison
pip3 install --no-cache-dir six pytest numpy cython

mkdir -p /arrow/cpp/build \
  && cd /arrow/cpp/build \
  && cmake -DCMAKE_BUILD_TYPE=$ARROW_BUILD_TYPE \
           -DOPENSSL_ROOT_DIR=/usr/local/ssl \
           -DCMAKE_INSTALL_LIBDIR=lib \
           -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
           -DARROW_PARQUET=ON \
           -DARROW_PYTHON=ON \
           -DARROW_PLASMA=ON \
           -DARROW_BUILD_TESTS=OFF \
           -DPYTHON_EXECUTABLE=$PYTHON_EXECUTABLE \
           .. \
  && make -j$(nproc) \
  && make install \
  && cd /arrow/python \
  && python3 setup.py build_ext --build-type=$ARROW_BUILD_TYPE --with-parquet \
  && python3 setup.py install

LD_LIBRARY_PATH=/usr/local/lib
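
A note on the last line of this script: a plain assignment only sets a shell variable inside the script. Unless the variable was already exported, it is not passed to child processes, and it does not carry over to a later RUN step in the Dockerfile, because each RUN starts a fresh shell. The sketch below illustrates the export behaviour with a made-up variable name (DEMO_LIB_PATH), so the real LD_LIBRARY_PATH is left untouched.

```shell
#!/bin/sh
# DEMO_LIB_PATH is a made-up name used only for illustration.
unset DEMO_LIB_PATH

# Plain assignment: visible in this shell only.
DEMO_LIB_PATH=/usr/local/lib
before=$(sh -c 'echo "${DEMO_LIB_PATH:-unset}"')

# After export, child processes inherit the value.
export DEMO_LIB_PATH
after=$(sh -c 'echo "${DEMO_LIB_PATH:-unset}"')

echo "child before export: $before"
echo "child after export:  $after"
```

In a Dockerfile, a value that should persist across build steps and into the running container is normally set with an ENV instruction, for example `ENV LD_LIBRARY_PATH=/usr/local/lib`. Running `ldconfig` after `make install` is another common approach, though whether it is enough depends on the loader configuration in the image.
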

I'm using Docker 19.03.5 on Ubuntu 18.04.3 LTS to build the image.

Thanks in advance for any help.

Anna

