HyukjinKwon commented on code in PR #40664:
URL: https://github.com/apache/spark/pull/40664#discussion_r1157883873
##########
dev/infra/Dockerfile:
##########
@@ -64,8 +64,8 @@ RUN Rscript -e "devtools::install_version('roxygen2', version='7.2.0', repos='ht
 # See more in SPARK-39735
 ENV R_LIBS_SITE "/usr/local/lib/R/site-library:${R_LIBS_SITE}:/usr/lib/R/library"
-RUN pypy3 -m pip install numpy 'pandas<=1.5.3' scipy coverage matplotlib
-RUN python3.9 -m pip install numpy pyarrow 'pandas<=1.5.3' scipy unittest-xml-reporting plotly>=4.8 scikit-learn 'mlflow>=1.0' coverage matplotlib openpyxl 'memory-profiler==0.60.0' 'scikit-learn==1.1.*'
+RUN pypy3 -m pip install numpy 'pandas=>2.0.0' scipy coverage matplotlib
Review Comment:
With `=>` and no upper bound, pandas will be upgraded automatically whenever a
new version is released, and CI will break if that release contains a breaking
change. Let's go with https://github.com/apache/spark/pull/40658 instead.
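
To illustrate the concern, here is a minimal sketch of how a lower-bound-only constraint differs from an upper-bounded or exact one. It assumes the specifier in the diff is meant to be pip's `>=` operator (`=>` is not one of the PEP 440 comparison operators). The helper names `parse` and `allowed` are hypothetical, the comparison is a simplified stand-in for pip's real resolution logic, and versions `2.1.0` and `3.0.0` are made-up future releases.

```python
def parse(version):
    """Turn a simple 'X.Y.Z' string into a tuple of ints for comparison."""
    return tuple(int(part) for part in version.split("."))


def allowed(version, op, bound):
    """Check a version against one constraint (simplified, not full PEP 440)."""
    v, b = parse(version), parse(bound)
    if op == ">=":
        return v >= b
    if op == "<=":
        return v <= b
    if op == "==":
        return v == b
    raise ValueError(f"unsupported operator: {op}")


# 2.1.0 and 3.0.0 are hypothetical future releases used for illustration.
candidates = ["1.5.3", "2.0.0", "2.1.0", "3.0.0"]

lower_bound_only = [v for v in candidates if allowed(v, ">=", "2.0.0")]
upper_bounded = [v for v in candidates if allowed(v, "<=", "1.5.3")]
exact_pin = [v for v in candidates if allowed(v, "==", "2.0.0")]

print(lower_bound_only)  # ['2.0.0', '2.1.0', '3.0.0']
print(upper_bounded)     # ['1.5.3']
print(exact_pin)         # ['2.0.0']
```

With only a lower bound, every later release stays eligible, so a rebuilt image could pick up a new major version without any change to the Dockerfile. An upper bound or an exact pin keeps the set fixed until someone edits it deliberately.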
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]