HyukjinKwon commented on a change in pull request #27534: [SPARK-30879][DOCS] Refine workflow for building docs
URL: https://github.com/apache/spark/pull/27534#discussion_r400686586
##########
File path: dev/create-release/spark-rm/Dockerfile
##########
@@ -50,36 +46,43 @@ RUN apt-get clean && apt-get update && $APT_INSTALL gnupg ca-certificates && \
   rm -rf /var/lib/apt/lists/* && \
   apt-get clean && \
   apt-get update && \
-  $APT_INSTALL software-properties-common && \
-  apt-add-repository -y ppa:brightbox/ruby-ng && \
-  apt-get update && \
   # Install openjdk 8.
   $APT_INSTALL openjdk-8-jdk && \
   update-alternatives --set java /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java && \
   # Install build / source control tools
   $APT_INSTALL curl wget git maven ivy subversion make gcc lsof libffi-dev \
-    pandoc pandoc-citeproc libssl-dev libcurl4-openssl-dev libxml2-dev && \
+    pandoc pandoc-citeproc libssl-dev libcurl4-openssl-dev libxml2-dev
+
+ENV PATH "$PATH:/root/.pyenv/bin:/root/.pyenv/shims"
+RUN curl -L https://github.com/pyenv/pyenv-installer/raw/dd3f7d0914c5b4a416ca71ffabdf2954f2021596/bin/pyenv-installer | bash

Review comment:
Okay, I suspect it was not tested due to the limitation described in the PR description:

```bash
Generating SQL API Markdown files.
20/03/31 06:41:42 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Traceback (most recent call last):
  File "/opt/spark-rm/output/spark/sql/gen-sql-api-docs.py", line 21, in <module>
    from pyspark.java_gateway import launch_gateway
  File "/opt/spark-rm/output/spark/python/lib/pyspark.zip/pyspark/__init__.py", line 51, in <module>
  File "/opt/spark-rm/output/spark/python/lib/pyspark.zip/pyspark/context.py", line 22, in <module>
ImportError: No module named threading
log4j:WARN No appenders could be found for logger (org.apache.spark.util.ShutdownHookManager).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
```

Seems the installed Python is weird.
[`threading`](https://docs.python.org/3.7/library/threading.html) is a standard-library module that has existed from Python 2 through Python 3, but it seems to be missing from the Python installed here. Let me revert this to make RC preparation easier.
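
One way to catch this kind of broken interpreter early might be a small import check run with the same Python that the image puts on `PATH`. The following is only a sketch, not something from this PR: the function name `find_missing_modules` and the list of modules checked are assumptions chosen for illustration.

```python
import importlib
import sys


def find_missing_modules(names):
    """Return the names from `names` that the current interpreter cannot import."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # Also covers ModuleNotFoundError, which subclasses ImportError.
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Hypothetical list of standard-library modules to verify; adjust as needed.
    required = ["threading", "subprocess", "json"]
    print("interpreter:", sys.executable, sys.version.split()[0])
    missing = find_missing_modules(required)
    if missing:
        # A non-zero exit status would make a Docker build step fail here.
        sys.exit("missing standard-library modules: " + ", ".join(missing))
    print("all required modules import cleanly")
```

Running a script like this as a build step right after the Python installation would surface a missing `threading` before the docs generation starts, rather than in the middle of `gen-sql-api-docs.py`.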
