Oh, one important detail: the directions I was following are at
https://arrow.apache.org/docs/developers/python.html .

Steve

On 2023/11/27 18:38:55 Akshara Sadheesh wrote:
> Thank you so much for your reply, Raúl! I did run the build using the
> build_venv.sh file. I think the issue was that I had not copied the
> libarrow.so files over from the `root/dist/lib` directory in my Docker
> container. I have now added them to `arrow/python/pyarrow`.
>
> After the build finished I copied over the libarrow.so files from 
> `root/dist/lib` in my container to my host machine and added the libarrow.so 
> files to the `arrow/python/pyarrow` folder. This got rid of the missing 
> libarrow.so files error.
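
The copy step described above could be scripted roughly as follows. This is a minimal sketch, not part of the Arrow build scripts: `copy_arrow_libs` is a hypothetical helper, and the `libarrow*.so*` pattern and the two directory arguments are assumptions based on the paths mentioned in the message. Versioned names such as `libarrow.so.1500` are typically symlinks in a CMake install tree, so the sketch copies the file contents rather than the links.

```python
import shutil
from pathlib import Path


def copy_arrow_libs(lib_dir, pyarrow_dir, pattern="libarrow*.so*"):
    """Copy Arrow shared libraries into the pyarrow package folder.

    Hypothetical helper: the glob pattern and both directories are
    assumptions, not something the Arrow build scripts define.
    """
    lib_dir, pyarrow_dir = Path(lib_dir), Path(pyarrow_dir)
    pyarrow_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(lib_dir.glob(pattern)):
        if not src.is_file():  # skips directories and dangling symlinks
            continue
        # shutil.copy2 follows symlinks by default, so the destination
        # is a regular file holding the library contents.
        shutil.copy2(src, pyarrow_dir / src.name)
        copied.append(src.name)
    return copied
```

It might be called as `copy_arrow_libs("root/dist/lib", "arrow/python/pyarrow")`. Because every matching name becomes a regular file, the library is duplicated once per symlink; if layer size matters, a narrower pattern (for example the exact file name from the import error) may be enough.
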
>
> I then added this new pyarrow folder to my Lambda layers folder; the
> deploy.sh script takes care of building the new environment using
> CodeBuild. I am using a managed Ubuntu Standard 6.0 image
> (https://github.com/aws/aws-codebuild-docker-images/blob/master/ubuntu/standard/6.0/Dockerfile).
> This uses glibc version 2.35. As far as possible, I would like to avoid
> changing the glibc version, as it is a managed image.
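
To confirm which glibc a given environment actually provides (the build container, the CodeBuild image, or the function that loads the layer), something like the following could be run there. This is a sketch under assumptions: `parse_glibc` and `runtime_glibc` are hypothetical helper names, and the lookup relies on the glibc-specific `CS_GNU_LIBC_VERSION` name for `os.confstr`, so it returns `None` on systems that do not use glibc.

```python
import os
import re


def parse_glibc(text):
    """Turn a string such as 'glibc 2.35' into the tuple (2, 35)."""
    match = re.search(r"(\d+)\.(\d+)", text)
    return (int(match.group(1)), int(match.group(2))) if match else None


def runtime_glibc():
    """Return the glibc version of the running process, or None."""
    try:
        # CS_GNU_LIBC_VERSION is a glibc-specific confstr name; elsewhere
        # the lookup can raise ValueError (unknown name) or OSError, and
        # os.confstr itself is missing on non-Unix platforms.
        text = os.confstr("CS_GNU_LIBC_VERSION")
    except (AttributeError, ValueError, OSError):
        return None
    return parse_glibc(text) if text else None


if __name__ == "__main__":
    print("runtime glibc:", runtime_glibc())
```

Comparing the tuple printed in each environment shows whether the place where pyarrow is built and the place where it is imported really share the same glibc.
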
>
> Issue:
>
> When I add the custom pyarrow to my Lambda layers and run the step
> function, I get this error:
>
> `GLIBC_2.32* not found (required by
> /opt/python/pyarrow/lib.cpython-310-x86_64-linux-gnu.so)`
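
An error of this kind means the extension module references glibc symbol versions newer than the glibc present where it is loaded; the `/opt/python/...` path suggests that is the Lambda environment rather than the build image. A rough way to see which versions a built file asks for is to scan it for `GLIBC_x.y` tags, much like `strings file | grep GLIBC_`. This is a heuristic sketch, not an ELF parser, and `required_glibc_versions` and `max_required_glibc` are hypothetical helper names.

```python
import re
from pathlib import Path

# Matches tags such as GLIBC_2.32 or GLIBC_2.2.5 in raw file bytes.
_GLIBC_TAG = re.compile(rb"GLIBC_(\d+)\.(\d+)(?:\.(\d+))?")


def required_glibc_versions(data):
    """Return the sorted, de-duplicated GLIBC_x.y tags found in raw bytes."""
    versions = set()
    for match in _GLIBC_TAG.finditer(data):
        versions.add(tuple(int(p) for p in match.groups() if p is not None))
    return sorted(versions)


def max_required_glibc(path):
    """Return the highest GLIBC tag found in the file, or None if none."""
    versions = required_glibc_versions(Path(path).read_bytes())
    return versions[-1] if versions else None
```

Running `max_required_glibc` on `lib.cpython-310-x86_64-linux-gnu.so` and on the copied `libarrow.so` files gives the minimum glibc the target environment would need; if that is higher than what the target reports, the build has to happen on an image with an older glibc.
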
>
> I keep bumping into a glibc version error. This error is present even after
> modifying the Dockerfile to use the same base image that the CodeBuild
> managed image uses, with glibc 2.35.
>
> This is the modified `arrow/python/examples/minimal_build/Dockerfile.ubuntu` 
> used:
>
> `
>
> FROM public.ecr.aws/ubuntu/ubuntu:22.04
>
> ENV DEBIAN_FRONTEND=noninteractive
>
> RUN apt-get update -y -q && \
> apt-get install -y -q --no-install-recommends \
> apt-transport-https \
> software-properties-common \
> wget && \
> apt-get install -y -q --no-install-recommends \
> build-essential \
> cmake \
> git \
> ninja-build \
> python3.10 \
> python3.10-dev \
> python3.10-venv \
> && \
> apt-get clean && rm -rf /var/lib/apt/lists*
>
> # Set Python 3.10 as the default Python version
> RUN update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.10 1
>
> RUN wget https://bootstrap.pypa.io/get-pip.py && \
> python3 get-pip.py && \
> rm get-pip.py
>
> `
>
> This is the `arrow/python/examples/minimal_build/build_venv.sh` used:
>
>
> `
>
> #!/usr/bin/env bash
> # Licensed to the Apache Software Foundation (ASF) under one
> # or more contributor license agreements. See the NOTICE file
> # distributed with this work for additional information
> # regarding copyright ownership. The ASF licenses this file
> # to you under the Apache License, Version 2.0 (the
> # "License"); you may not use this file except in compliance
> # with the License. You may obtain a copy of the License at
> #
> # http://www.apache.org/licenses/LICENSE-2.0
> #
> # Unless required by applicable law or agreed to in writing,
> # software distributed under the License is distributed on an
> # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
> # KIND, either express or implied. See the License for the
> # specific language governing permissions and limitations
> # under the License.
>
> set -e
>
> #----------------------------------------------------------------------
> # Change this to whatever makes sense for your system
>
> WORKDIR=${WORKDIR:-$HOME}
> MINICONDA=$WORKDIR/miniconda-for-arrow
> LIBRARY_INSTALL_DIR=$WORKDIR/local-libs
> CPP_BUILD_DIR=$WORKDIR/arrow-cpp-build
> ARROW_ROOT=/arrow
> export ARROW_HOME=$WORKDIR/dist
> export LD_LIBRARY_PATH=$ARROW_HOME/lib:$LD_LIBRARY_PATH
>
> python3 -m venv $WORKDIR/venv
> source $WORKDIR/venv/bin/activate
>
> git config --global --add safe.directory $ARROW_ROOT
>
> pip install -r $ARROW_ROOT/python/requirements-build.txt
>
> #----------------------------------------------------------------------
> # Build C++ library
>
> mkdir -p $CPP_BUILD_DIR
> pushd $CPP_BUILD_DIR
>
> cmake -GNinja \
> -DCMAKE_BUILD_TYPE=Release \
> -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
> -DCMAKE_INSTALL_LIBDIR=lib \
> -DCMAKE_UNITY_BUILD=ON \
> -DARROW_BUILD_STATIC=OFF \
> -DARROW_COMPUTE=ON \
> -DARROW_CSV=ON \
> -DARROW_FILESYSTEM=ON \
> -DARROW_JSON=ON \
> $ARROW_ROOT/cpp
>
> ninja install
>
> popd
>
> #----------------------------------------------------------------------
> # Build and test Python library
> pushd $ARROW_ROOT/python
>
> rm -rf build/ # remove any pesky pre-existing build directory
>
> export CMAKE_PREFIX_PATH=${ARROW_HOME}${CMAKE_PREFIX_PATH:+:${CMAKE_PREFIX_PATH}}
> export PYARROW_BUILD_TYPE=Release
> export PYARROW_CMAKE_GENERATOR=Ninja
>
> # You can run either "develop" or "build_ext --inplace". Your pick
>
> python setup.py build_ext --inplace
> # python setup.py develop
>
> # pip install -r $ARROW_ROOT/python/requirements-test.txt
>
> # py.test pyarrow
>
> `
>
>
>
> I would be very thankful for any help and advice that you can offer.
>
> Thank you very much,
>
> Shara
>
>
> On 2023/11/22 14:29:49 Raúl Cumplido wrote:
> > Hi Shara,
> >
> > The example dockerfile installs the base requirements for Ubuntu but
> > then we use the build_venv.sh (or build_conda.sh) to build the Arrow
> > CPP library and then pyarrow [1].
> >
> > From the error it seems you did not build Arrow CPP as libarrow.so
> > can't be found. Can you try following the recipe on the provided sh
> > file?
> >
> > Kind regards,
> > Raúl
> >
> > [1] 
> > https://github.com/apache/arrow/blob/main/python/examples/minimal_build/build_venv.sh
> >
> > On Tue, 21 Nov 2023 at 23:05, Akshara Sadheesh
> > (<sh...@gmail.com>) wrote:
> > >
> > > Hi,
> > >
> > > I have been trying to use the minimal_build for Python with the
> > > provided example Dockerfile.ubuntu for my Lambda layers, since they
> > > have a 250 MB limit. I am able to run the build and generate a pyarrow
> > > library. However, the library does not contain any shared .so files.
> > > When in use, it says:
> > >
> > > `"Unable to import module 'lambda_function': libarrow.so.1500: cannot
> > > open shared object file: No such file or directory"`
> > >
> > > I modified the Dockerfile to use Python 3.10 and the Ubuntu 22.04
> > > image, and set `--platform linux/x86_64` when building the image to
> > > ensure it is compatible with the Lambda architecture.
> > >
> > > I would be very grateful if you could help me with this,
> > >
> > > Thank you!
> > >
> > > Shara
> >

