This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new 400db88d00e5 [SPARK-46103][PYTHON][INFRA][BUILD][DOCS] Enhancing 
PySpark documentation
400db88d00e5 is described below

commit 400db88d00e50750513d733be697b6b2dd9043d3
Author: Haejoon Lee <[email protected]>
AuthorDate: Mon Nov 27 08:49:18 2023 +0900

    [SPARK-46103][PYTHON][INFRA][BUILD][DOCS] Enhancing PySpark documentation
    
    ### What changes were proposed in this pull request?
    
    This PR enhances the PySpark documentation by adopting features from newer
Sphinx releases, with the goal of improving the overall user experience and
readability. To that end, it pins `Sphinx` to 4.2.0 and `pydata_sphinx_theme`
to 0.13 and drops the `jinja2<3.0.0` upper bound, which enables recent
`pydata_sphinx_theme` features such as light/dark mode toggling.
    
    ### Why are the changes needed?
    
    Currently, the PySpark documentation is unable to utilize many of the 
advanced features available in recent `Sphinx` versions due to older package 
versions. This limitation hinders the documentation's visual appeal and 
usability, particularly when compared to other projects like Pandas which have 
already adopted these enhancements. For example:
    
    ## Pandas API reference (better layout / switching light & dark mode 
available)
    
    ### Dark mode
    <img width="1409" alt="Screenshot 2023-11-26 at 5 43 29 AM" 
src="https://github.com/apache/spark/assets/44108233/0f97ce4a-c1ec-47fb-9295-445c2d557393">
    
    ### Light mode
    <img width="1403" alt="Screenshot 2023-11-26 at 5 45 01 AM" 
src="https://github.com/apache/spark/assets/44108233/715f74a8-9e49-4c05-80ef-5531d2e68220">
    
    ## PySpark API reference (less readable compared to pandas / no light & dark
mode)
    <img width="1312" alt="Screenshot 2023-11-26 at 5 43 48 AM" 
src="https://github.com/apache/spark/assets/44108233/722d2b61-e231-4387-a5ab-dcd447045d94">
    
    By updating the `Sphinx` and `Jinja2` versions, we can significantly 
improve the documentation's layout, design, and interactive features, thereby 
enhancing the end-user experience.
    
    ### Does this PR introduce _any_ user-facing change?
    
    No API changes, but users will see a more modern, user-friendly interface
in the PySpark documentation. New features such as light/dark mode and
improved page layouts are shown below:
    
    ## Before
    <img width="1312" alt="Screenshot 2023-11-26 at 5 43 48 AM" 
src="https://github.com/apache/spark/assets/44108233/722d2b61-e231-4387-a5ab-dcd447045d94">
    
    ## After
    ### Dark mode
    <img width="1388" alt="Screenshot 2023-11-26 at 6 17 13 AM" 
src="https://github.com/apache/spark/assets/44108233/b5ed6cfd-9a65-4c03-a067-b40e89cc8c48">
    
    ### Light mode
    <img width="1392" alt="Screenshot 2023-11-26 at 6 16 47 AM" 
src="https://github.com/apache/spark/assets/44108233/24b723a7-5b00-4565-81d9-9c87154c115f">
    
    ### How was this patch tested?
    
    Built the docs manually in a local environment and tested several
combinations of `Jinja2`, `Sphinx` and `pydata_sphinx_theme` versions to find
the one that renders the documentation best.
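
The version combinations mentioned above can also be checked mechanically
before a local build. The following is only a minimal sketch, assuming the two
exact pins from this commit (`sphinx==4.2.0`, `pydata_sphinx_theme==0.13`).
The helper names `installed_versions` and `check_pins` are hypothetical and
not part of the Spark repository, and the version comparison is a
simplification of pip's matching rules.

```python
from importlib import metadata

# Exact pins taken from this commit's dev/requirements.txt changes.
PINS = {"sphinx": "4.2.0", "pydata_sphinx_theme": "0.13"}


def installed_versions(names):
    """Return {name: version or None} for the given distribution names."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def _normalize(version):
    """Drop trailing ".0" segments so "0.13" and "0.13.0" compare equal.

    Simplified stand-in for full version matching (assumption).
    """
    parts = version.split(".")
    while len(parts) > 1 and parts[-1] == "0":
        parts.pop()
    return parts


def check_pins(installed, pins=PINS):
    """Return (name, expected, actual) for every pin that is not satisfied."""
    mismatches = []
    for name, expected in pins.items():
        actual = installed.get(name)
        if actual is None or _normalize(actual) != _normalize(expected):
            mismatches.append((name, expected, actual))
    return mismatches
```

Calling `check_pins(installed_versions(PINS))` in the docs build environment
would return an empty list when both pins are satisfied.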
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    No.
    
    Closes #44012 from itholic/upgrade_sphinx.
    
    Authored-by: Haejoon Lee <[email protected]>
    Signed-off-by: Hyukjin Kwon <[email protected]>
---
 .github/workflows/build_and_test.yml               |   2 +-
 dev/requirements.txt                               |   6 +--
 python/docs/source/_static/spark-logo-dark.png     | Bin 0 -> 23555 bytes
 python/docs/source/_static/spark-logo-light.png    | Bin 0 -> 18773 bytes
 .../_templates/autosummary/accessor_attribute.rst  |   6 +++
 .../_templates/autosummary/accessor_method.rst     |   6 +++
 .../_templates/autosummary/class_with_docs.rst     |   4 +-
 .../source/_templates/autosummary/plot_class.rst   |  53 +++++++++++++++++++++
 python/docs/source/conf.py                         |   6 ++-
 .../docs/source/reference/pyspark.pandas/frame.rst |   8 +++-
 .../source/reference/pyspark.pandas/indexing.rst   |  12 +++++
 python/docs/source/reference/pyspark.pandas/io.rst |   5 ++
 .../source/reference/pyspark.pandas/series.rst     |  22 ++++++++-
 .../source/reference/pyspark.sql/spark_session.rst |  14 ++++++
 14 files changed, 136 insertions(+), 8 deletions(-)

diff --git a/.github/workflows/build_and_test.yml 
b/.github/workflows/build_and_test.yml
index 5033ab00601a..a4c9ec304258 100644
--- a/.github/workflows/build_and_test.yml
+++ b/.github/workflows/build_and_test.yml
@@ -751,7 +751,7 @@ jobs:
         #   See also https://issues.apache.org/jira/browse/SPARK-35375.
         # Pin the MarkupSafe to 2.0.1 to resolve the CI error.
         #   See also https://issues.apache.org/jira/browse/SPARK-38279.
-        python3.9 -m pip install 'sphinx<3.1.0' mkdocs pydata_sphinx_theme 
sphinx-copybutton nbsphinx numpydoc 'jinja2<3.0.0' 'markupsafe==2.0.1' 
'pyzmq<24.0.0'
+        python3.9 -m pip install 'sphinx==4.2.0' mkdocs 
'pydata_sphinx_theme==0.13' sphinx-copybutton nbsphinx numpydoc jinja2 
'markupsafe==2.0.1' 'pyzmq<24.0.0'
         python3.9 -m pip install ipython_genutils # See SPARK-38517
         python3.9 -m pip install sphinx_plotly_directive 'numpy>=1.20.0' 
pyarrow pandas 'plotly>=4.8'
         python3.9 -m pip install 'docutils<0.18.0' # See SPARK-39421
diff --git a/dev/requirements.txt b/dev/requirements.txt
index 7de55ec24968..a7af0907c726 100644
--- a/dev/requirements.txt
+++ b/dev/requirements.txt
@@ -31,12 +31,12 @@ pandas-stubs<1.2.0.54
 mkdocs
 
 # Documentation (Python)
-pydata_sphinx_theme
+pydata_sphinx_theme==0.13
 ipython
 nbsphinx
 numpydoc
-jinja2<3.0.0
-sphinx<3.1.0
+jinja2
+sphinx==4.2.0
 sphinx-plotly-directive
 sphinx-copybutton
 docutils<0.18.0
diff --git a/python/docs/source/_static/spark-logo-dark.png 
b/python/docs/source/_static/spark-logo-dark.png
new file mode 100644
index 000000000000..7460faec37fc
Binary files /dev/null and b/python/docs/source/_static/spark-logo-dark.png 
differ
diff --git a/python/docs/source/_static/spark-logo-light.png 
b/python/docs/source/_static/spark-logo-light.png
new file mode 100644
index 000000000000..41938560822c
Binary files /dev/null and b/python/docs/source/_static/spark-logo-light.png 
differ
diff --git a/python/docs/source/_templates/autosummary/accessor_attribute.rst 
b/python/docs/source/_templates/autosummary/accessor_attribute.rst
new file mode 100644
index 000000000000..28a94614b98f
--- /dev/null
+++ b/python/docs/source/_templates/autosummary/accessor_attribute.rst
@@ -0,0 +1,6 @@
+{{ fullname }}
+{{ underline }}
+
+.. currentmodule:: {{ module + "." + objname.split(".")[0] }}
+
+.. autoattribute:: {{ ".".join(objname.split(".")[1:]) }}
diff --git a/python/docs/source/_templates/autosummary/accessor_method.rst 
b/python/docs/source/_templates/autosummary/accessor_method.rst
new file mode 100644
index 000000000000..dce014d7b5da
--- /dev/null
+++ b/python/docs/source/_templates/autosummary/accessor_method.rst
@@ -0,0 +1,6 @@
+{{ fullname }}
+{{ underline }}
+
+.. currentmodule:: {{ module + "." + objname.split(".")[0] }}
+
+.. automethod:: {{ ".".join(objname.split(".")[1:]) }}
diff --git a/python/docs/source/_templates/autosummary/class_with_docs.rst 
b/python/docs/source/_templates/autosummary/class_with_docs.rst
index 7c37b83c0e90..1141fa68a256 100644
--- a/python/docs/source/_templates/autosummary/class_with_docs.rst
+++ b/python/docs/source/_templates/autosummary/class_with_docs.rst
@@ -47,7 +47,9 @@
 
     .. autosummary::
     {% for item in attributes %}
-       ~{{ name }}.{{ item }}
+        {% if not (item == 'uid') %}
+           ~{{ name }}.{{ item }}
+        {% endif %}
     {%- endfor %}
 
     {% endif %}
diff --git a/python/docs/source/_templates/autosummary/plot_class.rst 
b/python/docs/source/_templates/autosummary/plot_class.rst
new file mode 100644
index 000000000000..5e6a73bd0ecc
--- /dev/null
+++ b/python/docs/source/_templates/autosummary/plot_class.rst
@@ -0,0 +1,53 @@
+..  Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+..    http://www.apache.org/licenses/LICENSE-2.0
+
+..  Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+{{ fullname }}
+{{ underline }}
+
+.. currentmodule:: {{ module + "." + objname.split(".")[0] }}
+
+.. automethod:: {{ ".".join(objname.split(".")[1:]) }}
+
+{% if '__init__' in methods %}
+  {% set caught_result = methods.remove('__init__') %}
+{% endif %}
+
+{% block methods %}
+{% if methods %}
+
+   .. rubric:: Methods
+
+   .. autosummary::
+      {% for item in methods %}
+         ~{{ name.split(".")[1] }}.{{ item }}
+      {%- endfor %}
+
+{% endif %}
+{% endblock %}
+
+{% block attributes_summary %}
+{% if attributes %}
+
+   .. rubric:: Attributes
+   
+   .. autosummary::
+      {% for item in attributes %}
+         ~{{ name.split(".")[1] }}.{{ item }}
+      {%- endfor %}
+
+{% endif %}
+{% endblock %}
diff --git a/python/docs/source/conf.py b/python/docs/source/conf.py
index b9884d55b3a1..81083c007b34 100644
--- a/python/docs/source/conf.py
+++ b/python/docs/source/conf.py
@@ -194,7 +194,11 @@ html_context = {
 # further.  For a list of options available for each theme, see the
 # documentation.
 html_theme_options = {
-    "navbar_end": ["version-switcher"]
+    "navbar_end": ["version-switcher", "theme-switcher"],
+    "logo": {
+        "image_light": "_static/spark-logo-light.png",
+        "image_dark": "_static/spark-logo-dark.png",
+    }
 }
 
 # Add any paths that contain custom themes here, relative to this directory.
diff --git a/python/docs/source/reference/pyspark.pandas/frame.rst 
b/python/docs/source/reference/pyspark.pandas/frame.rst
index 911999b56be5..12cf6e7db12f 100644
--- a/python/docs/source/reference/pyspark.pandas/frame.rst
+++ b/python/docs/source/reference/pyspark.pandas/frame.rst
@@ -299,6 +299,7 @@ in Spark. These can be accessed by 
``DataFrame.spark.<function/property>``.
 
 .. autosummary::
    :toctree: api/
+   :template: autosummary/accessor_method.rst
 
    DataFrame.spark.frame
    DataFrame.spark.cache
@@ -319,8 +320,8 @@ specific plotting methods of the form 
``DataFrame.plot.<kind>``.
 
 .. autosummary::
    :toctree: api/
+   :template: autosummary/accessor_method.rst
 
-   DataFrame.plot
    DataFrame.plot.area
    DataFrame.plot.barh
    DataFrame.plot.bar
@@ -330,6 +331,10 @@ specific plotting methods of the form 
``DataFrame.plot.<kind>``.
    DataFrame.plot.pie
    DataFrame.plot.scatter
    DataFrame.plot.density
+
+.. autosummary::
+   :toctree: api/
+
    DataFrame.hist
    DataFrame.boxplot
    DataFrame.kde
@@ -341,6 +346,7 @@ These can be accessed by 
``DataFrame.pandas_on_spark.<function/property>``.
 
 .. autosummary::
    :toctree: api/
+   :template: autosummary/accessor_method.rst
 
    DataFrame.pandas_on_spark.apply_batch
    DataFrame.pandas_on_spark.transform_batch
diff --git a/python/docs/source/reference/pyspark.pandas/indexing.rst 
b/python/docs/source/reference/pyspark.pandas/indexing.rst
index 7ec4387bb679..301e849ffe28 100644
--- a/python/docs/source/reference/pyspark.pandas/indexing.rst
+++ b/python/docs/source/reference/pyspark.pandas/indexing.rst
@@ -129,8 +129,14 @@ in Spark. These can be accessed by 
``Index.spark.<function/property>``.
 
 .. autosummary::
    :toctree: api/
+   :template: autosummary/accessor_attribute.rst
 
    Index.spark.column
+
+.. autosummary::
+   :toctree: api/
+   :template: autosummary/accessor_method.rst
+
    Index.spark.transform
 
 Sorting
@@ -308,9 +314,15 @@ in Spark. These can be accessed by 
``MultiIndex.spark.<function/property>``.
 
 .. autosummary::
    :toctree: api/
+   :template: autosummary/accessor_attribute.rst
 
    MultiIndex.spark.data_type
    MultiIndex.spark.column
+
+.. autosummary::
+   :toctree: api/
+   :template: autosummary/accessor_method.rst
+
    MultiIndex.spark.transform
 
 MultiIndex Sorting
diff --git a/python/docs/source/reference/pyspark.pandas/io.rst 
b/python/docs/source/reference/pyspark.pandas/io.rst
index 118dd49a4ada..fd41a03699ca 100644
--- a/python/docs/source/reference/pyspark.pandas/io.rst
+++ b/python/docs/source/reference/pyspark.pandas/io.rst
@@ -69,6 +69,11 @@ Generic Spark I/O
    :toctree: api/
 
    read_spark_io
+
+.. autosummary::
+   :toctree: api/
+   :template: autosummary/accessor_method.rst
+
    DataFrame.spark.to_spark_io
 
 Flat File / CSV
diff --git a/python/docs/source/reference/pyspark.pandas/series.rst 
b/python/docs/source/reference/pyspark.pandas/series.rst
index 01fb5aa87fb1..88d1861c6ccf 100644
--- a/python/docs/source/reference/pyspark.pandas/series.rst
+++ b/python/docs/source/reference/pyspark.pandas/series.rst
@@ -270,8 +270,14 @@ in Spark. These can be accessed by 
``Series.spark.<function/property>``.
 
 .. autosummary::
    :toctree: api/
+   :template: autosummary/accessor_attribute.rst
 
    Series.spark.column
+
+.. autosummary::
+   :toctree: api/
+   :template: autosummary/accessor_method.rst
+
    Series.spark.transform
    Series.spark.apply
 
@@ -304,6 +310,7 @@ Datetime Properties
 
 .. autosummary::
    :toctree: api/
+   :template: autosummary/accessor_attribute.rst
 
    Series.dt.date
    Series.dt.year
@@ -333,6 +340,7 @@ Datetime Methods
 
 .. autosummary::
    :toctree: api/
+   :template: autosummary/accessor_method.rst
 
    Series.dt.normalize
    Series.dt.strftime
@@ -353,6 +361,7 @@ like ``Series.str.<function/property>``.
 
 .. autosummary::
    :toctree: api/
+   :template: autosummary/accessor_method.rst
 
    Series.str.capitalize
    Series.str.cat
@@ -416,10 +425,16 @@ the ``Series.cat`` accessor.
 
 .. autosummary::
    :toctree: api/
+   :template: autosummary/accessor_attribute.rst
 
    Series.cat.categories
    Series.cat.ordered
    Series.cat.codes
+
+.. autosummary::
+   :toctree: api/
+   :template: autosummary/accessor_method.rst
+
    Series.cat.rename_categories
    Series.cat.reorder_categories
    Series.cat.add_categories
@@ -438,8 +453,8 @@ specific plotting methods of the form 
``Series.plot.<kind>``.
 
 .. autosummary::
    :toctree: api/
+   :template: autosummary/accessor_method.rst
 
-   Series.plot
    Series.plot.area
    Series.plot.bar
    Series.plot.barh
@@ -449,6 +464,10 @@ specific plotting methods of the form 
``Series.plot.<kind>``.
    Series.plot.line
    Series.plot.pie
    Series.plot.kde
+
+.. autosummary::
+   :toctree: api/
+
    Series.hist
 
 Serialization / IO / Conversion
@@ -476,6 +495,7 @@ These can be accessed by 
``Series.pandas_on_spark.<function/property>``.
 
 .. autosummary::
    :toctree: api/
+   :template: autosummary/accessor_method.rst
 
    Series.pandas_on_spark.transform_batch
 
diff --git a/python/docs/source/reference/pyspark.sql/spark_session.rst 
b/python/docs/source/reference/pyspark.sql/spark_session.rst
index f25dbab5f6b9..f242e4439cf4 100644
--- a/python/docs/source/reference/pyspark.sql/spark_session.rst
+++ b/python/docs/source/reference/pyspark.sql/spark_session.rst
@@ -29,12 +29,21 @@ See also :class:`SparkSession`.
     :toctree: api/
 
     SparkSession.active
+
+.. autosummary::
+    :toctree: api/
+    :template: autosummary/accessor_method.rst
+
     SparkSession.builder.appName
     SparkSession.builder.config
     SparkSession.builder.enableHiveSupport
     SparkSession.builder.getOrCreate
     SparkSession.builder.master
     SparkSession.builder.remote
+
+.. autosummary::
+    :toctree: api/
+
     SparkSession.catalog
     SparkSession.conf
     SparkSession.createDataFrame
@@ -58,8 +67,13 @@ Spark Connect Only
 
 .. autosummary::
     :toctree: api/
+    :template: autosummary/accessor_method.rst
 
     SparkSession.builder.create
+
+.. autosummary::
+    :toctree: api/
+
     SparkSession.addArtifact
     SparkSession.addArtifacts
     SparkSession.copyFromLocalToFs

