[GitHub] [spark] HyukjinKwon commented on a diff in pull request #36749: [SPARK-39295][DOCS][PYTHON][3.3] Improve documentation of pandas API supported list

GitBox Thu, 02 Jun 2022 05:03:55 -0700


HyukjinKwon commented on code in PR #36749:
URL: https://github.com/apache/spark/pull/36749#discussion_r887871266



##########
python/docs/source/user_guide/pandas_on_spark/supported_pandas_api.rst:
##########
@@ -17,34 +17,32 @@
 
 
 =====================
-Supported pandas APIs
+Supported pandas API
 =====================
 
 .. currentmodule:: pyspark.pandas
 
 The following table shows the pandas APIs that implemented or non-implemented from pandas API on
-Spark.
+Spark. Some pandas API do not implement full parameters, so the third column shows missing
+parameters for each API.
 
-Some pandas APIs do not implement full parameters, so the third column shows missing parameters for
-each API.
+* 'Y' in the second column means it's implemented including its whole parameter.
+* 'N' means it's not implemented yet.
+* 'P' means it's partially implemented with the missing of some parameters.
 
-'Y' in the second column means it's implemented including its whole parameter.
-'N' means it's not implemented yet.
-'P' means it's partially implemented with the missing of some parameters.
+All API in the list below computes the data with distributed execution except the ones that require
+the local execution by design. For example, `DataFrame.to_numpy() <https://spark.apache.org/docs/
+latest/api/python/reference/pyspark.pandas/api/pyspark.pandas.DataFrame.to_numpy.html>`__
+requires to collect the data to the driver side.
 
 If there is non-implemented pandas API or parameter you want, you can create an `Apache Spark
-JIRA <https://issues.apache.org/jira/projects/SPARK/summary>`__ to request or to contribute by your
-own.
+JIRA <https://issues.apache.org/jira/projects/SPARK/summary>`__ to request or to contribute by
+your own.
 
-The API list is updated based on the `pandas 1.3 official API
-reference <https://pandas.pydata.org/pandas-docs/version/1.3/reference/index.html>`__.
+The API list is updated based on the `pandas 1.3 official API reference
+<https://pandas.pydata.org/pandas-docs/version/1.3/reference/index.html>`__.
 
-All implemented APIs listed here are distributed except the ones that requires the local
-computation by design. For example, `DataFrame.to_numpy() <https://spark.apache.org
-/docs/latest/api/python/reference/pyspark.pandas/api/pyspark.pandas.DataFrame.
-to_numpy.html>`__ requires to collect the data to the driver side.
-
-Supported DataFrame APIs
+DataFrame API
 ------------------------

Review Comment:
   ```suggestion
   -------------
   ```
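
For context: the suggestion trims the underline so it matches the length of the new title, `DataFrame API`. reStructuredText only requires the underline to be at least as long as the title text, so the leftover 24-dash line would still render; matching lengths is a readability convention. The helper below is a hypothetical, minimal sketch of that check (the function name is not part of Spark or docutils):

```python
def underline_matches(title: str, underline: str) -> bool:
    """Return True when the underline repeats a single character
    and is exactly as long as the title (hypothetical helper)."""
    return len(set(underline)) == 1 and len(underline) == len(title)


new_title = "DataFrame API"
old_underline = "-" * len("Supported DataFrame APIs")  # left over from the old title
suggested_underline = "-" * len(new_title)

print(underline_matches(new_title, old_underline))        # False: too long for the new title
print(underline_matches(new_title, suggested_underline))  # True
```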
   


