[ https://issues.apache.org/jira/browse/ARROW-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16352985#comment-16352985 ]

ASF GitHub Bot commented on ARROW-2086:
---------------------------------------

wesm closed pull request #1555: ARROW-2086: [Python] Shrink size of arrow_manylinux1_x86_64_base docker image
URL: https://github.com/apache/arrow/pull/1555

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

diff --git a/python/manylinux1/Dockerfile-x86_64 b/python/manylinux1/Dockerfile-x86_64
index 9c00e7ea2..98b559535 100644
--- a/python/manylinux1/Dockerfile-x86_64
+++ b/python/manylinux1/Dockerfile-x86_64
@@ -14,7 +14,7 @@
 # KIND, either express or implied.  See the License for the
 # specific language governing permissions and limitations
 # under the License.
-FROM quay.io/xhochy/arrow_manylinux1_x86_64_base:ARROW-2048
+FROM quay.io/xhochy/arrow_manylinux1_x86_64_base:ARROW-2086
 
 ADD arrow /arrow
 WORKDIR /arrow/cpp
diff --git a/python/manylinux1/Dockerfile-x86_64_base b/python/manylinux1/Dockerfile-x86_64_base
index ec7893080..b7687533a 100644
--- a/python/manylinux1/Dockerfile-x86_64_base
+++ b/python/manylinux1/Dockerfile-x86_64_base
@@ -17,7 +17,7 @@
 FROM quay.io/pypa/manylinux1_x86_64:latest
 
 # Install dependencies
-RUN yum install -y flex zlib-devel
+RUN yum install -y flex zlib-devel && yum clean all
 
 ADD scripts/build_openssl.sh /
 RUN /build_openssl.sh
diff --git a/python/manylinux1/scripts/build_boost.sh b/python/manylinux1/scripts/build_boost.sh
index 4650cde95..1a6ffd7eb 100755
--- a/python/manylinux1/scripts/build_boost.sh
+++ b/python/manylinux1/scripts/build_boost.sh
@@ -23,6 +23,16 @@ wget --no-check-certificate https://dl.bintray.com/boostorg/release/${BOOST_VERS
 tar xf boost_${BOOST_VERSION_UNDERSCORE}.tar.gz
 pushd /boost_${BOOST_VERSION_UNDERSCORE}
 ./bootstrap.sh
-./bjam cxxflags=-fPIC cflags=-fPIC --prefix=/usr --with-filesystem --with-date_time --with-system --with-regex install
+./bjam cxxflags=-fPIC cflags=-fPIC variant=release link=static --prefix=/usr --with-filesystem --with-date_time --with-system --with-regex install
 popd
 rm -rf boost_${BOOST_VERSION_UNDERSCORE}.tar.gz boost_${BOOST_VERSION_UNDERSCORE}
+# Boost always install header-only parts but they also take up quite some space.
+# We don't need them in array, so don't persist them in the docker layer.
+# phoenix 18.1 MiB
+rm -r /usr/include/boost/phoenix
+# fusion 16.7 MiB
+rm -r /usr/include/boost/fusion
+# spirit 8.2 MiB
+rm -r /usr/include/boost/spirit
+# geometry 6.0 MiB
+rm -r /usr/include/boost/geometry
diff --git a/python/manylinux1/scripts/build_virtualenvs.sh b/python/manylinux1/scripts/build_virtualenvs.sh
index ddedcf61f..e64157065 100755
--- a/python/manylinux1/scripts/build_virtualenvs.sh
+++ b/python/manylinux1/scripts/build_virtualenvs.sh
@@ -44,3 +44,11 @@ for PYTHON in ${PYTHON_VERSIONS}; do
     pip install pytest 'numpy==1.12.1' 'pandas==0.20.1'
     deactivate
 done
+
+# Remove pip cache again. It's useful during the virtualenv creation but we
+# don't want it persisted in the docker layer, ~264MiB
+rm -rf /root/.cache
+# Remove pandas' tests module as it includes a lot of data, ~27MiB per Python
+# venv, i.e. 216MiB in total
+rm -rf /opt/_internal/*/lib/*/site-packages/pandas/tests
+rm -rf /venv-test-*/lib/*/site-packages/pandas/tests
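
The per-directory sizes quoted in the comments of the diff above (for
example "phoenix 18.1 MiB") are the kind of figure that du reports. The
PR does not say how they were measured; what follows is only a minimal
sketch, assuming a POSIX shell with du, sort and head available. The
function name largest_subdirs is hypothetical and is not part of the
Arrow scripts.

```shell
#!/bin/sh
# Hypothetical helper, not part of the PR: print the largest immediate
# subdirectories of a directory, sizes in KiB, biggest first.
#   $1  directory to inspect
#   $2  maximum number of entries to print (default: 10)
largest_subdirs() {
    dir="$1"
    count="${2:-10}"
    # The trailing slash in the glob restricts matches to directories.
    # du -s prints one summary line per argument, -k reports KiB.
    du -sk "$dir"/*/ | sort -rn | head -n "$count"
}
```

Run inside the base image after the Boost install step, a call such as
"largest_subdirs /usr/include/boost 4" would be one way to spot removal
candidates like the four header directories deleted above.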

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


> [Python] Shrink size of arrow_manylinux1_x86_64_base docker image
> -----------------------------------------------------------------
>
>                 Key: ARROW-2086
>                 URL: https://issues.apache.org/jira/browse/ARROW-2086
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Python
>            Reporter: Uwe L. Korn
>            Assignee: Uwe L. Korn
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.9.0
>
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
