dongjoon-hyun edited a comment on pull request #30059: URL: https://github.com/apache/spark/pull/30059#issuecomment-709624882
Ur, why do you think like that, @viirya? Our plan is [here](https://github.com/apache/spark/pull/30059#issuecomment-709555763).

> As the pre-built image cannot be easily changed like modifying workflow yml file, the project contributors might be a bit harder to change Github Action PySpark job.

1. If you are referring to the `repo` or `branch`: the project contributors can still change the PySpark job inside the Apache Spark repo in the future. It's the same cost as changing `branch-2.4` or `branch-3.0`, and it's easier than changing `spark-website`.

2. If you are referring to the `Dockerfile`: we already have 7 `Dockerfile`s that are open to contributors, and they consist of simple Linux commands.

```
$ find . -name Dockerfile
./resource-managers/kubernetes/docker/src/main/dockerfiles/spark/Dockerfile
./resource-managers/kubernetes/docker/src/main/dockerfiles/spark/bindings/python/Dockerfile
./resource-managers/kubernetes/docker/src/main/dockerfiles/spark/bindings/R/Dockerfile
./dev/create-release/spark-rm/Dockerfile
./external/docker/spark-test/master/Dockerfile
./external/docker/spark-test/worker/Dockerfile
./external/docker/spark-test/base/Dockerfile
```

BTW, I want to focus on the improvement we can get. As you know, I already tried various approaches in the following PR, but nothing is better than this one. This PR reduces the job time from `3 hours 14 minutes` to `2 hours 23 minutes`.
- https://github.com/apache/spark/pull/30012
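For context on what "using a pre-built image" means in a GitHub Actions workflow: a job can run inside an existing Docker image via the `container:` key, so the per-run dependency installation is baked into the image instead of repeated on every CI run. The sketch below is purely illustrative — the image name, job name, and test command are assumptions, not the actual contents of this PR:

```yaml
# Hypothetical sketch only: job names, image name, and commands are illustrative.
name: pyspark-tests
on: [push]

jobs:
  pyspark:
    runs-on: ubuntu-latest
    # Run the whole job inside a pre-built image that already contains the
    # Python/R toolchain, instead of installing dependencies in every run.
    container:
      image: ghcr.io/example/spark-ci-base:latest   # assumed image name
    steps:
      - uses: actions/checkout@v2
      - name: Run PySpark tests
        # Assumed invocation; the real workflow may call a different script.
        run: ./python/run-tests --modules=pyspark-core
```

Updating the job would then mean editing the `Dockerfile` behind the image and rebuilding it, which is the maintenance cost being discussed above.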
