HyukjinKwon opened a new pull request #29188:
URL: https://github.com/apache/spark/pull/29188


   ### What changes were proposed in this pull request?
   
   This PR proposes to redesign the PySpark documentation.
   
   I made a demo site to make it easier to review: 
https://hyukjin-spark.readthedocs.io/en/stable/reference/index.html.
   
   Here is the initial draft for the final PySpark docs shape: 
https://hyukjin-spark.readthedocs.io/en/latest/index.html.
   
   In more detail, this PR proposes:
   1. Use the 
[pydata_sphinx_theme](https://github.com/pandas-dev/pydata-sphinx-theme), which 
[pandas](https://pandas.pydata.org/docs/) and 
[Koalas](https://koalas.readthedocs.io/en/latest/) also use. The CSS 
overrides are ported from Koalas. The colours in the CSS were chosen by 
designers for use in Spark.
   2. Use the Sphinx option to separate `source` and `build` directories as the 
documentation pages will likely grow. 
   3. Port the current API documentation into the new style. It mimics Koalas and 
pandas to use the theme most effectively.
   
       One disadvantage of this approach is that APIs or classes have to be 
listed explicitly; however, I think this isn't a big issue in PySpark since 
we're conservative about adding APIs. I also intentionally listed only classes, 
not functions, in ML and MLlib to make the lists relatively easy to manage.
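
The three points above could map onto a Sphinx `conf.py` roughly as follows. This is a minimal sketch, not the configuration from this PR: the extension list, the `_static` directory and the `css/pyspark.css` file name are assumptions chosen for illustration. Only the theme name comes from the description above.

```python
# conf.py (sketch): with separated directories this file lives under `source`.
# Assumption: everything except the theme name is illustrative, not from this PR.

project = "PySpark"

extensions = [
    "sphinx.ext.autodoc",      # pull docstrings from the Python modules
    "sphinx.ext.autosummary",  # render summary tables for explicitly listed APIs
]

# Generate stub pages for the entries listed in autosummary directives.
autosummary_generate = True

# Theme shared with pandas and Koalas; requires the pydata-sphinx-theme package.
html_theme = "pydata_sphinx_theme"

# Hypothetical location of the CSS overrides ported from Koalas.
html_static_path = ["_static"]
html_css_files = ["css/pyspark.css"]
```

With `source` and `build` separated, a build would typically be run as `sphinx-build -b html source build/html`. Each API or class then needs an entry in an `autosummary` listing, which is the manual upkeep mentioned in point 3.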
   
   ### Why are the changes needed?
   
   I often hear complaints from users that the current PySpark documentation 
(https://spark.apache.org/docs/latest/api/python/index.html) is pretty messy 
to read compared to other projects such as 
[pandas](https://pandas.pydata.org/docs/) and 
[Koalas](https://koalas.readthedocs.io/en/latest/).
   
   It would be nicer if we could organise it better, instead of just listing 
all classes, methods and attributes, so that it is easier to navigate.
   
   Also, the current documentation dates back to almost the very first version 
of PySpark. Maybe it's time to update it.
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes, PySpark API documentation will be redesigned.
   
   ### How was this patch tested?
   
   Manually tested; the demo site was built to show the result.
   




