Thank you so much for your reply, Raúl! I did run the build using the 
build_venv.sh file. I think the issue was that I had not copied the 
libarrow.so files out of the `root/dist/lib` directory in my Docker container.

After the build finished, I copied the libarrow.so files from 
`root/dist/lib` in my container to my host machine and added them to the 
`arrow/python/pyarrow` folder. This got rid of the missing libarrow.so files 
error.
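
For reference, the copy step could be sketched roughly as below. This is only an illustration of what I did by hand: the function name `copy_arrow_libs`, the container name `arrow-build` and the host paths are placeholders, not anything from the Arrow repository.

```shell
# Copy every libarrow shared object (and its versioned symlinks) from a
# source directory into the pyarrow package directory.
# copy_arrow_libs is a hypothetical helper, not part of Arrow.
copy_arrow_libs() {
    src_dir="$1"
    dest_dir="$2"
    mkdir -p "$dest_dir"
    # -a keeps symlinks such as libarrow.so.1500 -> libarrow.so.1500.0.0 intact
    cp -a "$src_dir"/libarrow*.so* "$dest_dir"/
}

# Hypothetical usage (container name and paths are assumptions):
#   docker cp arrow-build:/root/dist/lib ./dist-lib
#   copy_arrow_libs ./dist-lib ./arrow/python/pyarrow
```

Keeping the symlinks matters because the extension module asks for the versioned name (for example libarrow.so.1500), not the bare libarrow.so.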

I then added this new pyarrow folder to my Lambda layers folder; the deploy.sh 
script takes care of building the new environment using CodeBuild. I am 
using the managed Ubuntu Standard 6.0 image 
(https://github.com/aws/aws-codebuild-docker-images/blob/master/ubuntu/standard/6.0/Dockerfile), 
which uses glibc version 2.35. As far as possible, I would like to avoid 
changing the glibc version, since this is a managed image.

Issue:

When I add the custom pyarrow to my Lambda layers and run the step 
function, I get this error:

`GLIBC_2.32* not found (required by 
/opt/python/pyarrow/lib.cpython-310-x86_64-linux-gnu.so)`

I keep running into this glibc version error. It is still present after 
modifying the Dockerfile to use the same base image as the CodeBuild managed 
image, which has glibc 2.35.
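
One way to narrow this down might be to compare the glibc symbol versions the built extension asks for with the glibc in the environment where the layer is actually loaded. The sketch below is only a diagnostic idea: `max_glibc_version` is a hypothetical helper name, and it assumes GNU `grep` and `sort` (for `-o` and `-V`) plus `objdump` from binutils.

```shell
# Print the highest GLIBC_x.y symbol version mentioned in the text on stdin.
# Intended to be fed the output of `objdump -T <shared-object>`.
# max_glibc_version is a hypothetical helper, not part of any tool.
max_glibc_version() {
    # GLIBCXX_* and GLIBC_PRIVATE are deliberately not matched by this pattern
    grep -o 'GLIBC_[0-9][0-9.]*' | sort -u -V | tail -n 1
}

# Hypothetical usage against the built extension module:
#   objdump -T pyarrow/lib.cpython-310-x86_64-linux-gnu.so | max_glibc_version
# The glibc available in a given environment can be printed with:
#   ldd --version | head -n 1
```

If the version reported for the extension is higher than what `ldd --version` prints in the environment that raises the error, that mismatch would explain the message regardless of which image the build ran in.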

This is the modified `arrow/python/examples/minimal_build/Dockerfile.ubuntu` 
used:

```
FROM public.ecr.aws/ubuntu/ubuntu:22.04

ENV DEBIAN_FRONTEND=noninteractive

RUN apt-get update -y -q && \
    apt-get install -y -q --no-install-recommends \
        apt-transport-https \
        software-properties-common \
        wget && \
    apt-get install -y -q --no-install-recommends \
      build-essential \
      cmake \
      git \
      ninja-build \
      python3.10 \
      python3.10-dev \
      python3.10-venv \
      && \
      apt-get clean && rm -rf /var/lib/apt/lists*

# Set Python 3.10 as the default Python version
RUN update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.10 1

RUN wget https://bootstrap.pypa.io/get-pip.py && \
    python3 get-pip.py && \
    rm get-pip.py
```

This is the `arrow/python/examples/minimal_build/build_venv.sh` used:


```
#!/usr/bin/env bash
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied.  See the License for the
# specific language governing permissions and limitations
# under the License.

set -e

#----------------------------------------------------------------------
# Change this to whatever makes sense for your system

WORKDIR=${WORKDIR:-$HOME}
MINICONDA=$WORKDIR/miniconda-for-arrow
LIBRARY_INSTALL_DIR=$WORKDIR/local-libs
CPP_BUILD_DIR=$WORKDIR/arrow-cpp-build
ARROW_ROOT=/arrow
export ARROW_HOME=$WORKDIR/dist
export LD_LIBRARY_PATH=$ARROW_HOME/lib:$LD_LIBRARY_PATH

python3 -m venv $WORKDIR/venv
source $WORKDIR/venv/bin/activate

git config --global --add safe.directory $ARROW_ROOT

pip install -r $ARROW_ROOT/python/requirements-build.txt

#----------------------------------------------------------------------
# Build C++ library

mkdir -p $CPP_BUILD_DIR
pushd $CPP_BUILD_DIR

cmake -GNinja \
      -DCMAKE_BUILD_TYPE=Release \
      -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
      -DCMAKE_INSTALL_LIBDIR=lib \
      -DCMAKE_UNITY_BUILD=ON \
      -DARROW_BUILD_STATIC=OFF \
      -DARROW_COMPUTE=ON \
      -DARROW_CSV=ON \
      -DARROW_FILESYSTEM=ON \
      -DARROW_JSON=ON \
      $ARROW_ROOT/cpp

ninja install

popd

#----------------------------------------------------------------------
# Build and test Python library
pushd $ARROW_ROOT/python

rm -rf build/  # remove any pesky pre-existing build directory

export CMAKE_PREFIX_PATH=${ARROW_HOME}${CMAKE_PREFIX_PATH:+:${CMAKE_PREFIX_PATH}}
export PYARROW_BUILD_TYPE=Release
export PYARROW_CMAKE_GENERATOR=Ninja

# You can run either "develop" or "build_ext --inplace". Your pick

python setup.py build_ext --inplace
# python setup.py develop

# pip install -r $ARROW_ROOT/python/requirements-test.txt

# py.test pyarrow
```



I would be very thankful for any help and advice that you can offer.

Thank you very much,

Shara


On 2023/11/22 14:29:49 Raúl Cumplido wrote:
> Hi Shara,
> 
> The example dockerfile installs the base requirements for Ubuntu but
> then we use the build_venv.sh (or build_conda.sh) to build the Arrow
> CPP library and then pyarrow [1].
> 
> From the error it seems you did not build Arrow CPP as libarrow.so
> can't be found. Can you try following the recipe on the provided sh
> file?
> 
> Kind regards,
> Raúl
> 
> [1] 
> https://github.com/apache/arrow/blob/main/python/examples/minimal_build/build_venv.sh
> 
> El mar, 21 nov 2023 a las 23:05, Akshara Sadheesh
> (<sh...@gmail.com>) escribió:
> >
> > Hi,
> >
> > I have been trying to use the minimal_build for python with the
> > provided examples Dockerfile.ubuntu for my lambda layers since it has
> > a 250 MB limit. I am able to run the build and generate a pyarrow
> > library. However, the library does not contain any shared .so files.
> > When in use, it says:
> >
> > `"Unable to import module 'lambda_function': libarrow.so.1500: cannot
> > open shared object file: No such file or directory"`
> >
> > I modified the Dockerfile to use python 3.10, ubuntu image to 22.04
> > and set the `--platform linux/x86_64` when building the image to
> > ensure it is compatible with the lambda architecture.
> >
> > I would be very grateful if you could help me with this,
> >
> > Thank you!
> >
> > Shara
> 
