nchammas opened a new pull request, #44865: URL: https://github.com/apache/spark/pull/44865
### What changes were proposed in this pull request?

As [suggested here][1], this change improves the documentation build so that it builds Spark at most once, regardless of which API docs are requested in the build.

[1]: https://github.com/apache/spark/pull/44791#discussion_r1459233153

### Why are the changes needed?

There is no need to build Spark multiple times when generating docs. In particular, building Scala and Python docs, or Scala and SQL docs, causes Spark to be built twice. Fixing this problem saves us a couple of minutes.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

I built the docs as follows on `master` as well as on this branch:

```sh
time SKIP_RDOC=1 SKIP_PYTHONDOC=1 bundle exec jekyll build
```

The time results are as follows:

```
Before this change
------------------
real    6m48.815s
user    23m17.943s
sys     1m29.578s

After this change
-----------------
real    4m10.672s
user    14m10.130s
sys     1m0.773s
```

Additionally, I diffed the generated `_site/` dir across `master` and this branch and confirmed they are essentially identical except for some general SQL examples files.

### Was this patch authored or co-authored using generative AI tooling?

No.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
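
For reference, the `_site/` comparison mentioned under "How was this patch tested?" could be reproduced with something like the sketch below. It assumes a copy of the generated `_site/` directory was saved after building each branch. The `compare_sites` helper and the `/tmp/...` paths are hypothetical names used for illustration, not part of the Spark build, and the `-q` flag assumes GNU or BSD `diff`.

```sh
#!/bin/sh
# Hypothetical helper: report files that differ between two generated site trees.
# Prints one line per differing or missing file.
# Exit status is 0 when the trees are identical and 1 when they differ.
compare_sites() {
  before_dir="$1"   # e.g. a copy of _site/ built on master
  after_dir="$2"    # e.g. a copy of _site/ built on the PR branch
  diff -rq "$before_dir" "$after_dir"
}

# Example invocation (paths are assumptions, adjust to where the copies live):
# compare_sites /tmp/site-master /tmp/site-branch
```

An empty listing with exit status 0 would mean the two builds produced byte-identical output. Any listed paths are the files worth inspecting by hand.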
