nchammas commented on code in PR #43953:
URL: https://github.com/apache/spark/pull/43953#discussion_r1426980392
##########
dev/infra/Dockerfile:
##########
@@ -139,3 +139,60 @@ RUN python3.12 -m pip install 'grpcio==1.59.3' 'grpcio-status==1.59.3' 'protobuf
 # TODO(SPARK-46078) Use official one instead of nightly build when it's ready
 RUN python3.12 -m pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/cpu
 RUN python3.12 -m pip install torcheval
+
+
+# Refer to https://github.com/ContinuumIO/docker-images/blob/main/miniconda3/debian/Dockerfile
+RUN wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh -q && \
+    bash miniconda.sh -b -p /opt/miniconda3 && \
+    rm miniconda.sh && \
+    ln -s /opt/miniconda3/etc/profile.d/conda.sh /etc/profile.d/conda.sh && \
+    ln -s /opt/miniconda3/bin/conda /usr/local/bin/conda && \
+    find /opt/miniconda3/ -follow -type f -name '*.a' -delete && \
+    find /opt/miniconda3/ -follow -type f -name '*.js.map' -delete && \
+    conda clean -afy
+
+# Additional Python deps for linter and documentation, delete this section if another Python version is used
+# Since there maybe conflicts between envs, here uses conda to manage it.
+# TODO(SPARK-32407): Sphinx 3.1+ does not correctly index nested classes.
+# See also https://github.com/sphinx-doc/sphinx/issues/7551.
+# Jinja2 3.0.0+ causes error when building with Sphinx.
+# See also https://issues.apache.org/jira/browse/SPARK-35375.
+RUN conda create -n doc python=3.9
+
+RUN conda run -n doc pip install \

Review Comment:

> As to why not use requirement file in CI, I guess a problem maybe, the modification in requirement file won't automatically refresh the cached testing image?

That shouldn't be the case. Assuming you `COPY` the requirements file into the image, changing the file will [invalidate the cache][1]:

> The first encountered `COPY` instruction will invalidate the cache for all following instructions from the Dockerfile if the contents of `<src>` have changed. This includes invalidating the cache for `RUN` instructions.
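
To make that concrete, here is a rough, untested sketch of the pattern. The file name `requirements-doc.txt` and the `/tmp` destination are hypothetical placeholders, not files from this PR, and the sketch assumes the file sits inside the Docker build context:

```dockerfile
# Hypothetical sketch: the file name and paths below are assumptions, not part of this PR.
RUN conda create -y -n doc python=3.9

# Copy only the requirements file, so that unrelated changes elsewhere
# in the build context do not invalidate this layer.
COPY requirements-doc.txt /tmp/requirements-doc.txt

# If the copied file changes, the COPY layer's cache is invalidated,
# and so is the cache for every instruction after it, including this RUN.
RUN conda run -n doc pip install -r /tmp/requirements-doc.txt
```

With a layout like this, editing the requirements file should be enough to make the install step rerun on the next image build, while an unchanged file lets the cached layer be reused.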
[Also][2]:

> For the `ADD` and `COPY` instructions, the modification time and size file metadata is used to determine whether cache is valid. During cache lookup, cache is invalidated if the file metadata has changed for any of the files involved.

[1]: https://docs.docker.com/engine/reference/builder/#copy
[2]: https://docs.docker.com/develop/develop-images/guidelines/#leverage-build-cache

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
