[GitHub] [drill] jnturton commented on a change in pull request #2491: DRILL-8156: Declare and chown a /data VOLUME in the Drill Dockerfile
jnturton commented on a change in pull request #2491:
URL: https://github.com/apache/drill/pull/2491#discussion_r825277731

## File path: Dockerfile ##
@@ -49,25 +49,33 @@ RUN mvn -Dmaven.artifact.threads=5 -T1C clean install -DskipTests
 # Get project version and copy built binaries into /opt/drill directory
 RUN VERSION=$(mvn -q -Dexec.executable=echo -Dexec.args='${project.version}' --non-recursive exec:exec) \
  && mkdir /opt/drill \
- && mv distribution/target/apache-drill-${VERSION}/apache-drill-${VERSION}/* /opt/drill
+ && mv distribution/target/apache-drill-${VERSION}/apache-drill-${VERSION}/* /opt/drill \
+ && chmod -R +r /opt/drill
 # Target image
 # Set the BASE_IMAGE build arg when you invoke docker build.
 FROM $BASE_IMAGE
-ENV DRILL_HOME=/opt/drill DRILL_USER=drilluser
+# Starts Drill in embedded mode and connects to Sqlline
+ENTRYPOINT $DRILL_HOME/bin/drill-embedded
-RUN mkdir $DRILL_HOME
+ENV DRILL_HOME=/opt/drill
+ENV DRILL_USER=drilluser
+ENV DRILL_USER_HOME=/var/lib/drill
+ENV DRILL_LOG_DIR=$DRILL_USER_HOME/log
+ENV DATA_VOL=/data
-RUN groupadd -g 999 $DRILL_USER \
- && useradd -r -u 999 -g $DRILL_USER $DRILL_USER -m -d /var/lib/drill \
- && chown -R $DRILL_USER: $DRILL_HOME
+RUN mkdir $DRILL_HOME $DATA_VOL
-USER $DRILL_USER
+RUN groupadd -g 999 $DRILL_USER \
+ && useradd -r -u 999 -g $DRILL_USER $DRILL_USER -m -d $DRILL_USER_HOME \
+ && chown $DRILL_USER: $DATA_VOL
-COPY --from=build --chown=$DRILL_USER /opt/drill $DRILL_HOME
+# A Docker volume where users may store persistent data, e.g. persistent Drill
+# config by specifying a Drill BOOT option of sys.store.provider.local.path: "/data".
+VOLUME $DATA_VOL
-# Starts Drill in embedded mode and connects to Sqlline
-ENTRYPOINT $DRILL_HOME/bin/drill-embedded

Review comment:
@vvysotskyi no it was just one of the things that it was possible to move above the COPY :) I don't think that this particular move helped with image size at all so if we prefer ENTRYPOINT at the end then I can move it back.

Containers launched from this image do still correctly start up drill-embedded even with the ENTRYPOINT higher up.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
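
On why the ENTRYPOINT placement discussed in this comment does not affect startup: a shell-form `ENTRYPOINT` is recorded as image metadata and handed to `/bin/sh -c` when the container starts, so `$DRILL_HOME` is expanded at run time from the container's environment rather than at build time. A minimal sketch, assuming `BASE_IMAGE` is supplied with `--build-arg` and leaving out everything else from the real Dockerfile:

```dockerfile
# Assumption: BASE_IMAGE is passed with --build-arg when invoking docker build.
ARG BASE_IMAGE
FROM $BASE_IMAGE

# Shell-form ENTRYPOINT declared before the variable it uses. The builder does
# not substitute $DRILL_HOME here; the shell resolves it at container start.
ENTRYPOINT $DRILL_HOME/bin/drill-embedded

# Defined later in the file, but part of the final image environment, so it is
# visible to the shell that runs the ENTRYPOINT.
ENV DRILL_HOME=/opt/drill
```

Moving `ENTRYPOINT` back to the end of the file would therefore be a readability choice only.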
[GitHub] [drill] jnturton commented on a change in pull request #2491: DRILL-8156: Declare and chown a /data VOLUME in the Drill Dockerfile
jnturton commented on a change in pull request #2491:
URL: https://github.com/apache/drill/pull/2491#discussion_r825278013

## File path: Dockerfile ##
@@ -49,25 +49,33 @@ RUN mvn -Dmaven.artifact.threads=5 -T1C clean install -DskipTests
 # Get project version and copy built binaries into /opt/drill directory
 RUN VERSION=$(mvn -q -Dexec.executable=echo -Dexec.args='${project.version}' --non-recursive exec:exec) \
  && mkdir /opt/drill \
- && mv distribution/target/apache-drill-${VERSION}/apache-drill-${VERSION}/* /opt/drill
+ && mv distribution/target/apache-drill-${VERSION}/apache-drill-${VERSION}/* /opt/drill \
+ && chmod -R +r /opt/drill
 # Target image
 # Set the BASE_IMAGE build arg when you invoke docker build.
 FROM $BASE_IMAGE
-ENV DRILL_HOME=/opt/drill DRILL_USER=drilluser
+# Starts Drill in embedded mode and connects to Sqlline
+ENTRYPOINT $DRILL_HOME/bin/drill-embedded
-RUN mkdir $DRILL_HOME
+ENV DRILL_HOME=/opt/drill
+ENV DRILL_USER=drilluser
+ENV DRILL_USER_HOME=/var/lib/drill
+ENV DRILL_LOG_DIR=$DRILL_USER_HOME/log
+ENV DATA_VOL=/data
-RUN groupadd -g 999 $DRILL_USER \
- && useradd -r -u 999 -g $DRILL_USER $DRILL_USER -m -d /var/lib/drill \
- && chown -R $DRILL_USER: $DRILL_HOME
+RUN mkdir $DRILL_HOME $DATA_VOL
-USER $DRILL_USER
+RUN groupadd -g 999 $DRILL_USER \

Review comment:
Yes, thanks.
[GitHub] [drill] jnturton commented on a change in pull request #2491: DRILL-8156: Declare and chown a /data VOLUME in the Drill Dockerfile
jnturton commented on a change in pull request #2491:
URL: https://github.com/apache/drill/pull/2491#discussion_r823514983

## File path: Dockerfile ##
@@ -56,17 +56,26 @@ RUN VERSION=$(mvn -q -Dexec.executable=echo -Dexec.args='${project.version}' --n
 # Set the BASE_IMAGE build arg when you invoke docker build.
 FROM $BASE_IMAGE
-ENV DRILL_HOME=/opt/drill DRILL_USER=drilluser
+ENV DRILL_HOME=/opt/drill
+ENV DRILL_USER=drilluser
+ENV DRILL_USER_HOME=/var/lib/drill
+ENV DRILL_LOG_DIR=$DRILL_USER_HOME/log
+ENV DATA_VOL=/data
-RUN mkdir $DRILL_HOME
+RUN mkdir $DRILL_HOME $DATA_VOL
+
+COPY --from=build /opt/drill $DRILL_HOME

Review comment:
@vvysotskyi in the final commit I found and fixed the size blowup. The `RUN chmod` command was responsible for duplicating the entire Drill installation just to set file attributes, pretty lame CoW system that Docker has there. Anyway, now the `RUN chmod` is done in the intermediate container.
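
The fix described in this comment can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual Dockerfile: the `mkdir` in the build stage is a stand-in for the real Maven build, and reusing `BASE_IMAGE` for the build stage is an assumption made only to keep the fragment self-contained.

```dockerfile
# Assumption: BASE_IMAGE is passed with --build-arg when invoking docker build.
ARG BASE_IMAGE

# Intermediate (build) stage. Its layers do not become part of the final image.
FROM $BASE_IMAGE AS build
# Stand-in for the real build, which leaves the distribution in /opt/drill.
RUN mkdir /opt/drill
# Set file attributes here, before the files are copied into the target image.
RUN chmod -R +r /opt/drill

# Target image.
FROM $BASE_IMAGE
ENV DRILL_HOME=/opt/drill
# The files arrive with their final mode bits, so no later RUN chmod has to
# rewrite them into an additional layer of the target image.
COPY --from=build /opt/drill $DRILL_HOME
```

Running `docker history` on images built both ways is one way to check whether the extra layer and its size have gone.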
[GitHub] [drill] jnturton commented on a change in pull request #2491: DRILL-8156: Declare and chown a /data VOLUME in the Drill Dockerfile
jnturton commented on a change in pull request #2491:
URL: https://github.com/apache/drill/pull/2491#discussion_r823489048

## File path: Dockerfile ##
@@ -56,17 +56,26 @@ RUN VERSION=$(mvn -q -Dexec.executable=echo -Dexec.args='${project.version}' --n
 # Set the BASE_IMAGE build arg when you invoke docker build.
 FROM $BASE_IMAGE
-ENV DRILL_HOME=/opt/drill DRILL_USER=drilluser
+ENV DRILL_HOME=/opt/drill
+ENV DRILL_USER=drilluser
+ENV DRILL_USER_HOME=/var/lib/drill
+ENV DRILL_LOG_DIR=$DRILL_USER_HOME/log
+ENV DATA_VOL=/data
-RUN mkdir $DRILL_HOME
+RUN mkdir $DRILL_HOME $DATA_VOL
+
+COPY --from=build /opt/drill $DRILL_HOME

Review comment:
@vvysotskyi I don't think Docker is meant to duplicate data across layers this way. I think that each layer is supposed to be stored as a delta from the previous layer (even though it may be reported as having the cumulative size of the layers up to that point). So the layer ordering should not affect the size of the final image.

Nevertheless, I have moved everything that I could above the COPY in the Dockerfile, and I do still worry about a size blowup because when I list images I see 1.47GB for the image from this Dockerfile, while pulling apache/drill:1.20.0-openjdk-8 gives me an image smaller than 1GB.

```
apache/drill   snapshot-openjdk-8   57306e5337db   3 minutes ago   1.47GB
apache/drill   1.20.0-openjdk-8     7479402ba1b3   6 days ago      983MB
```
[GitHub] [drill] jnturton commented on a change in pull request #2491: DRILL-8156: Declare and chown a /data VOLUME in the Drill Dockerfile
jnturton commented on a change in pull request #2491:
URL: https://github.com/apache/drill/pull/2491#discussion_r823462959

## File path: Dockerfile ##
@@ -56,17 +56,26 @@ RUN VERSION=$(mvn -q -Dexec.executable=echo -Dexec.args='${project.version}' --n
 # Set the BASE_IMAGE build arg when you invoke docker build.
 FROM $BASE_IMAGE
-ENV DRILL_HOME=/opt/drill DRILL_USER=drilluser
+ENV DRILL_HOME=/opt/drill
+ENV DRILL_USER=drilluser
+ENV DRILL_USER_HOME=/var/lib/drill
+ENV DRILL_LOG_DIR=$DRILL_USER_HOME/log
+ENV DATA_VOL=/data
-RUN mkdir $DRILL_HOME
+RUN mkdir $DRILL_HOME $DATA_VOL
+
+COPY --from=build /opt/drill $DRILL_HOME
 RUN groupadd -g 999 $DRILL_USER \
- && useradd -r -u 999 -g $DRILL_USER $DRILL_USER -m -d /var/lib/drill \
- && chown -R $DRILL_USER: $DRILL_HOME
+ && useradd -r -u 999 -g $DRILL_USER $DRILL_USER -m -d $DRILL_USER_HOME \
+ && chown $DRILL_USER: $DATA_VOL \
+ && chmod -R +r $DRILL_HOME

Review comment:
@vvysotskyi it introduces a dependency on something called BuildKit. Here is the output from my attempt to use this flag:

```
COPY --from=build --chmod=0755 /opt/drill $DRILL_HOME
the --chmod option requires BuildKit. Refer to https://docs.docker.com/go/buildkit/ to learn how to build images with BuildKit enabled
```

Do you still think it's worth it?
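
For reference, a minimal sketch of what the BuildKit-dependent variant might look like. It assumes the build is run with BuildKit enabled (for example by setting `DOCKER_BUILDKIT=1` in the environment of `docker build`), that `BASE_IMAGE` is supplied with `--build-arg`, and it uses a placeholder build stage instead of the real Maven build. Note that `--chmod=0755` applies the same mode to every copied file, which is broader than the `chmod -R +r` in the diff.

```dockerfile
# syntax=docker/dockerfile:1
# The parser directive above must be the first line; it selects a BuildKit
# Dockerfile frontend.

# Assumption: BASE_IMAGE is passed with --build-arg when invoking docker build.
ARG BASE_IMAGE

FROM $BASE_IMAGE AS build
# Stand-in for the real build, which leaves the distribution in /opt/drill.
RUN mkdir /opt/drill

FROM $BASE_IMAGE
ENV DRILL_HOME=/opt/drill
# Mode bits are set while copying, so no separate RUN chmod layer is created.
# The legacy builder rejects --chmod with the error quoted in the comment.
COPY --from=build --chmod=0755 /opt/drill $DRILL_HOME
```

The trade-off raised in the comment stays the same: anyone building the image would need a BuildKit-capable Docker installation.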