snazy commented on code in PR #2957: URL: https://github.com/apache/polaris/pull/2957#discussion_r2485668182
########## plugins/spark/v3.5/regtests/Dockerfile: ########## @@ -18,31 +18,29 @@ # FROM docker.io/apache/spark:3.5.6-java17 -ARG POLARIS_HOST=polaris -ENV POLARIS_HOST=$POLARIS_HOST -ENV SPARK_HOME=/opt/spark -ENV CURRENT_SCALA_VERSION='2.12' -ENV LANGUAGE='en_US:en' + +ARG POLARIS_HOST=polaris \ + CURRENT_SCALA_VERSION=2.12 Review Comment: Similar here, only 2.12 jars in the image. ########## site/docker/Dockerfile: ########## @@ -21,23 +21,14 @@ FROM ubuntu:24.04 AS hugo ENV LANGUAGE='en_US:en' -RUN apt-get update -RUN apt-get install --yes golang hugo asciidoctor npm curl -RUN apt-get clean -# http-server is used when building the static site to manually check it locally -# (via `site/bin/create-static-site.sh --local` at http://localhost:8080/) -RUN npm install --global http-server - -# these dependencies are needed to build the static site -#RUN npm install --global autoprefixer postcss postcss-cli http-server - -RUN mkdir /polaris -RUN mkdir /polaris/site -RUN mkdir /polaris/site/resources +RUN apt-get update && \ + apt-get install -y --no-install-recommends golang hugo asciidoctor npm curl git && \ Review Comment: Why's `git` needed now? ########## plugins/spark/v3.5/getting-started/notebooks/Dockerfile: ########## @@ -19,8 +19,10 @@ FROM docker.io/apache/spark:3.5.6-java17 -ENV PYTHONPATH="${SPARK_HOME}/python/:${SPARK_HOME}/python/lib/py4j-0.10.9.7-src.zip:/home/spark/venv/lib/python3.10/site-packages" -ENV PYSPARK_PYTHON=/home/spark/venv/bin/python +ARG CURRENT_SCALA_VERSION=2.12 Review Comment: What's the purpose of this argument? I only see 2.12 jars in the image. Should this be a static ENV instead? ########## plugins/spark/v3.5/getting-started/notebooks/SparkPolaris.ipynb: ########## @@ -265,7 +265,6 @@ "from pyspark.sql import SparkSession\n", "\n", "spark = (SparkSession.builder\n", - " .config(\"spark.jars\", \"../polaris_libs/polaris-spark-3.5_2.12-1.2.0-incubating-SNAPSHOT-bundle.jar\") # TODO: add a way to automatically discover the Jar\n", Review Comment: Huh? Isn't that jar needed? ########## site/docker/Dockerfile: ########## @@ -21,23 +21,14 @@ FROM ubuntu:24.04 AS hugo ENV LANGUAGE='en_US:en' -RUN apt-get update -RUN apt-get install --yes golang hugo asciidoctor npm curl -RUN apt-get clean -# http-server is used when building the static site to manually check it locally -# (via `site/bin/create-static-site.sh --local` at http://localhost:8080/) -RUN npm install --global http-server - -# these dependencies are needed to build the static site -#RUN npm install --global autoprefixer postcss postcss-cli http-server - -RUN mkdir /polaris -RUN mkdir /polaris/site -RUN mkdir /polaris/site/resources +RUN apt-get update && \ + apt-get install -y --no-install-recommends golang hugo asciidoctor npm curl git && \ + npm install --global http-server && \ + rm -rf /var/lib/apt/lists/* && \ Review Comment: Super-super-super-nit: move the `rm` one line up near to the `apt install`. ########## plugins/spark/v3.5/getting-started/notebooks/SparkPolaris.ipynb: ########## @@ -265,7 +265,6 @@ "from pyspark.sql import SparkSession\n", "\n", "spark = (SparkSession.builder\n", - " .config(\"spark.jars\", \"../polaris_libs/polaris-spark-3.5_2.12-1.2.0-incubating-SNAPSHOT-bundle.jar\") # TODO: add a way to automatically discover the Jar\n", Review Comment: Current version on `main` is `1.3.0-incubating-SNAPSHOT`, so I'm not sure whether this removal breaks more ;) ########## site/docker/Dockerfile: ########## @@ -21,23 +21,14 @@ FROM ubuntu:24.04 AS hugo ENV LANGUAGE='en_US:en' -RUN apt-get update -RUN apt-get install --yes golang hugo asciidoctor npm curl -RUN apt-get clean -# http-server is used when building the static site to manually check it locally -# (via `site/bin/create-static-site.sh --local` at http://localhost:8080/) -RUN npm install --global http-server - -# these dependencies are needed to build the static site -#RUN npm install --global autoprefixer postcss postcss-cli http-server - -RUN mkdir /polaris -RUN mkdir /polaris/site -RUN mkdir /polaris/site/resources +RUN apt-get update && \ + apt-get install -y --no-install-recommends golang hugo asciidoctor npm curl git && \ Review Comment: Mind keeping the long-arg version `--yes`? ########## plugins/spark/v3.5/regtests/Dockerfile: ########## @@ -18,31 +18,29 @@ # FROM docker.io/apache/spark:3.5.6-java17 -ARG POLARIS_HOST=polaris -ENV POLARIS_HOST=$POLARIS_HOST -ENV SPARK_HOME=/opt/spark -ENV CURRENT_SCALA_VERSION='2.12' -ENV LANGUAGE='en_US:en' + +ARG POLARIS_HOST=polaris \ + CURRENT_SCALA_VERSION=2.12 + +ENV POLARIS_HOST=${POLARIS_HOST} \ + CURRENT_SCALA_VERSION=${CURRENT_SCALA_VERSION} USER root -RUN apt update -RUN apt-get install -y diffutils wget curl -RUN mkdir -p /home/spark && \ - chown -R spark /home/spark && \ - mkdir -p /tmp/polaris-regtests && \ - chown -R spark /tmp/polaris-regtests -RUN mkdir /opt/spark/conf && chmod -R 777 /opt/spark/conf -USER spark +RUN apt-get update && \ + apt-get install -y --no-install-recommends diffutils wget curl && \ + rm -rf /var/lib/apt/lists/* && \ + mkdir -p /home/spark /tmp/polaris-regtests /opt/spark/conf && \ + chown -R spark:spark /home/spark /tmp/polaris-regtests && \ + chmod -R 777 /opt/spark/conf WORKDIR /home/spark/polaris -COPY --chown=spark ./v3.5 /home/spark/polaris/v3.5 +COPY --chown=spark:spark ./v3.5 /home/spark/polaris/v3.5 # /home/spark/regtests might not be writable in all situations, see https://github.com/apache/polaris/pull/205 Review Comment: ```suggestion # /home/spark/.../regtests might not be writable in all situations, see https://github.com/apache/polaris/pull/205 ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
