Eric Yang created HDDS-1648:
-------------------------------

             Summary: Reduce Ozone docker image bloat
                 Key: HDDS-1648
                 URL: https://issues.apache.org/jira/browse/HDDS-1648
             Project: Hadoop Distributed Data Store
          Issue Type: Sub-task
            Reporter: Eric Yang

The Docker image can be leaner if multiple build steps are grouped together and run by a single shell script. For example, all of the install commands for hadoop-runner can be wrapped in a setup shell script:

{code}
#!/bin/bash
# Enable the EPEL repository and install the base packages.
rpm -Uvh https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
yum install -y sudo python2-pip wget nmap-ncat jq java-11-openjdk
pip install robotframework
# Third-party binaries: dumb-init, byteman and async-profiler.
wget -O /usr/local/bin/dumb-init https://github.com/Yelp/dumb-init/releases/download/v1.2.0/dumb-init_1.2.0_amd64
chmod +x /usr/local/bin/dumb-init
mkdir -p /etc/security/keytabs && chmod -R a+wr /etc/security/keytabs
wget -O /opt/byteman.jar https://repo.maven.apache.org/maven2/org/jboss/byteman/byteman/4.0.4/byteman-4.0.4.jar
chmod o+r /opt/byteman.jar
mkdir -p /opt/profiler && \
    cd /opt/profiler && \
    curl -L https://github.com/jvm-profiling-tools/async-profiler/releases/download/v1.5/async-profiler-1.5-linux-x64.tar.gz | tar xvz
# Kerberos client tools and the Hadoop config and log directories.
yum install -y krb5-workstation
mkdir -p /etc/hadoop && mkdir -p /var/log/hadoop && chmod 1777 /etc/hadoop && chmod 1777 /var/log/hadoop
{code}

The Dockerfile is then simplified to:

{code}
FROM centos
ADD setup.sh /
RUN /setup.sh
ADD scripts /opt/
ADD scripts/krb5.conf /etc/
WORKDIR /opt/hadoop
ENV HADOOP_LOG_DIR=/var/log/hadoop
ENV HADOOP_CONF_DIR=/etc/hadoop
ENTRYPOINT ["/usr/local/bin/dumb-init", "--", "/opt/starter.sh"]
{code}

This arrangement can drastically improve the rebuild performance of the Docker image. The resulting image is 150MB smaller than the current hadoop-runner image on GitHub. Having fewer intermediate layers also reduces the number of layer references, which improves space usage.

We can also use two scripts, one to install the binaries and another to configure the image. This can reduce the build time even further if the third-party binaries rarely change, because the cached layer produced by the install script can be reused when only the configuration changes.

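As a rough sketch of the two-script variant, the Dockerfile could look like the following. The file names install.sh and configure.sh are hypothetical and not part of the current hadoop-runner sources: install.sh is assumed to hold the yum, pip, wget and curl steps from setup.sh, and configure.sh the remaining mkdir and chmod steps.

{code}
FROM centos

# Hypothetical install.sh: package installs and third-party downloads.
# This layer is rebuilt only when install.sh itself changes.
ADD install.sh /
RUN /install.sh

# Hypothetical configure.sh: directory creation and permissions.
# Editing it reuses the cached install layer above.
ADD configure.sh /
RUN /configure.sh

ADD scripts /opt/
ADD scripts/krb5.conf /etc/
WORKDIR /opt/hadoop
ENV HADOOP_LOG_DIR=/var/log/hadoop
ENV HADOOP_CONF_DIR=/etc/hadoop
ENTRYPOINT ["/usr/local/bin/dumb-init", "--", "/opt/starter.sh"]
{code}

The install script is added first so that a change to configure.sh does not invalidate the cache for the slower download and package steps. The cost is one extra pair of layers compared to the single setup.sh approach.
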