nchammas commented on code in PR #43953:
URL: https://github.com/apache/spark/pull/43953#discussion_r1426980392


##########
dev/infra/Dockerfile:
##########
@@ -139,3 +139,60 @@ RUN python3.12 -m pip install 'grpcio==1.59.3' 'grpcio-status==1.59.3' 'protobuf
 # TODO(SPARK-46078) Use official one instead of nightly build when it's ready
 RUN python3.12 -m pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/cpu
 RUN python3.12 -m pip install torcheval
+
+
+# Refer to https://github.com/ContinuumIO/docker-images/blob/main/miniconda3/debian/Dockerfile
+RUN wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh -q && \
+    bash miniconda.sh -b -p /opt/miniconda3 && \
+    rm miniconda.sh && \
+    ln -s /opt/miniconda3/etc/profile.d/conda.sh /etc/profile.d/conda.sh && \
+    ln -s /opt/miniconda3/bin/conda /usr/local/bin/conda && \
+    find /opt/miniconda3/ -follow -type f -name '*.a' -delete && \
+    find /opt/miniconda3/ -follow -type f -name '*.js.map' -delete && \
+    conda clean -afy
+
+# Additional Python deps for linter and documentation, delete this section if another Python version is used
+# Since there maybe conflicts between envs, here uses conda to manage it.
+# TODO(SPARK-32407): Sphinx 3.1+ does not correctly index nested classes.
+#   See also https://github.com/sphinx-doc/sphinx/issues/7551.
+# Jinja2 3.0.0+ causes error when building with Sphinx.
+#   See also https://issues.apache.org/jira/browse/SPARK-35375.
+RUN conda create -n doc python=3.9
+
+RUN conda run -n doc pip install \

Review Comment:
   > As to why not use requirement file in CI, I guess a problem maybe, the 
modification in requirement file won't automatically refresh the cached testing 
image?
   
   That shouldn't be the case. Assuming you `COPY` the requirements file into 
the image, changing the file will [invalidate the cache][1]:
   
   > The first encountered `COPY` instruction will invalidate the cache for all 
following instructions from the Dockerfile if the contents of `<src>` have 
changed. This includes invalidating the cache for `RUN` instructions. 
   
   [Also][2]:
   
   > For the `ADD` and `COPY` instructions, the modification time and size file 
metadata is used to determine whether cache is valid. During cache lookup, 
cache is invalidated if the file metadata has changed for any of the files 
involved.
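   
   As a rough sketch of what that could look like in this Dockerfile (the file name `requirements-docs.txt` and the `/tmp` destination are placeholders assumed for illustration, not paths from this PR, and the file would need to be inside the Docker build context):
   
   ```dockerfile
   # Hypothetical file name and destination; adjust to wherever the
   # requirements file actually lives relative to the build context.
   COPY requirements-docs.txt /tmp/requirements-docs.txt
   
   # This layer comes after the COPY, so per the docs quoted above its
   # cache is invalidated whenever the copied file changes.
   RUN conda run -n doc pip install -r /tmp/requirements-docs.txt
   ```
   
   With a layout like this, editing the requirements file should cause the `COPY` step and the `RUN` step after it to be rebuilt, while earlier layers such as the Miniconda install can still come from the cache.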
   
   [1]: https://docs.docker.com/engine/reference/builder/#copy
   [2]: https://docs.docker.com/develop/develop-images/guidelines/#leverage-build-cache



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

