Eric Yang created HDDS-1648:
-------------------------------

             Summary: Reduce Ozone docker image bloat
                 Key: HDDS-1648
                 URL: https://issues.apache.org/jira/browse/HDDS-1648
             Project: Hadoop Distributed Data Store
          Issue Type: Sub-task
            Reporter: Eric Yang


A Docker image can be leaner if multiple steps are grouped together and run by a 
shell script.  For example, all the install commands can be wrapped in a setup 
shell script for hadoop-runner.

{code}
#!/bin/bash

rpm -Uvh https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
yum install -y sudo python2-pip wget nmap-ncat jq java-11-openjdk
pip install robotframework
wget -O /usr/local/bin/dumb-init \
    https://github.com/Yelp/dumb-init/releases/download/v1.2.0/dumb-init_1.2.0_amd64
chmod +x /usr/local/bin/dumb-init
mkdir -p /etc/security/keytabs && chmod -R a+wr /etc/security/keytabs 
wget -O /opt/byteman.jar \
    https://repo.maven.apache.org/maven2/org/jboss/byteman/byteman/4.0.4/byteman-4.0.4.jar
chmod o+r /opt/byteman.jar
mkdir -p /opt/profiler && \
    cd /opt/profiler && \
    curl -L https://github.com/jvm-profiling-tools/async-profiler/releases/download/v1.5/async-profiler-1.5-linux-x64.tar.gz | tar xvz
yum install -y krb5-workstation
mkdir -p /etc/hadoop && mkdir -p /var/log/hadoop && \
    chmod 1777 /etc/hadoop && chmod 1777 /var/log/hadoop
{code}

The Dockerfile is then simplified to:
{code}
FROM centos
ADD setup.sh /
RUN /setup.sh
ADD scripts /opt/
ADD scripts/krb5.conf /etc/
WORKDIR /opt/hadoop
ENV HADOOP_LOG_DIR=/var/log/hadoop
ENV HADOOP_CONF_DIR=/etc/hadoop
ENTRYPOINT ["/usr/local/bin/dumb-init", "--", "/opt/starter.sh"]
{code}

This arrangement can drastically improve the rebuild performance of the Docker 
image.  The resulting image is 150MB smaller than the current hadoop-runner 
image on GitHub.  Fewer intermediate layers also reduce the reference count, 
which improves space usage.
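
Since the whole setup runs from one RUN instruction, the build only sees the 
exit status of setup.sh, which by default is that of its last command.  A 
minimal sketch of making each step fail the build, assuming a hypothetical 
run_step helper that is not part of the script proposed here:

```shell
#!/bin/bash
# Abort on the first failing command, unset variable, or failed pipeline stage.
set -euo pipefail

# run_step NAME CMD [ARGS...]  (hypothetical helper, for illustration only)
# Prints the step name, runs the command, and reports which step failed.
run_step() {
    local name="$1"
    shift
    echo "==> $name"
    "$@" || { echo "step failed: $name" >&2; return 1; }
}
```

A step in setup.sh could then be written as 
`run_step "install kerberos client" yum install -y krb5-workstation`.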

We can also have two scripts, one to install binaries and another to 
configure the image.  This can reduce the build time even further if the 
third-party binaries rarely change.
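
A minimal sketch of the configuration half of such a split, assuming a 
hypothetical script name configure.sh and a hypothetical configure_image 
function.  The root prefix argument is also an assumption, added so the 
function can be tried outside a container.  The directories and modes are 
the ones used in the setup script; the package and download steps would 
live in a separate install script that is added and run earlier in the 
Dockerfile.

```shell
#!/bin/bash
# configure.sh (hypothetical name): image configuration only, no downloads.
set -euo pipefail

# configure_image ROOT  (hypothetical function, for illustration only)
# Creates the keytab, config and log directories under ROOT.
# ROOT is an assumption of this sketch; inside the image it would be "/".
configure_image() {
    local root="${1%/}"   # strip a trailing slash so "/" becomes ""

    mkdir -p "$root/etc/security/keytabs"
    chmod -R a+wr "$root/etc/security/keytabs"

    mkdir -p "$root/etc/hadoop" "$root/var/log/hadoop"
    chmod 1777 "$root/etc/hadoop" "$root/var/log/hadoop"
}
```

Ending the script with `configure_image /` would apply it during the build. 
Because this script contains no downloads, changing it should not require 
rerunning the install step, provided the install script has its own ADD and 
RUN lines before it.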



