beobest2 commented on PR #36895: URL: https://github.com/apache/spark/pull/36895#issuecomment-1158305889
@HyukjinKwon This now automatically generates documentation only for functions that are newly declared or overridden in each class itself. Most of the broken links have been fixed, but some functions are still not linked. They fall into the cases below. For Cases A, B, and C, it seems that we can add the documents in pyspark.pandas. For Case D, it seems necessary either to separately check the list of functions that are declared in the code but not supported by pyspark.pandas, or to remove the declarations from the code.

- Case A: pyspark.pandas has only one document for a function that also exists under a different name (pandas document exists), e.g. divide (= div), multiply (= mul), subtract (= sub)
```
DataFrame
- divide
- multiply
- subtract

Series
- divide
- multiply
- subtract
```
- Case B: pyspark.pandas document does not exist (pandas document exists)
```
Index
- get_level_values
- holds_integer
- is_type_compatible

MultiIndex
- get_level_values

Expanding
- kurt
- skew
- std
- var

Rolling
- kurt
- skew
- std
- var
```
- Case C: Documentation does not exist (even in pandas)
```
MultiIndex
- drop_duplicates

GroupBy
- expanding
- pad
- rolling
```
- Case D: Not supported by pandas, but included because it is declared in the code, e.g. https://github.com/apache/spark/blob/master/python/pyspark/pandas/groupby.py#L3539
```
Index
- sort

SeriesGroupBy
- agg
- aggregate
```
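For reference, a minimal sketch of what "newly declared or overridden in its own class" can mean in Python terms. This is not the code in the PR: `own_public_members`, `Base`, and `Child` are hypothetical names used only for illustration, and the sketch assumes that looking at the class's own `__dict__` (via `vars`) is an acceptable way to exclude inherited members. It also shows why aliases such as divide (= div) appear as separate names, as in Case A.

```python
import inspect


def own_public_members(cls):
    """Return the sorted public names defined directly on cls.

    vars(cls) only contains attributes declared in the class body itself,
    so inherited members that are not overridden are excluded.
    """
    return sorted(
        name
        for name, value in vars(cls).items()
        if not name.startswith("_")
        and (
            inspect.isfunction(value)
            or isinstance(value, (property, classmethod, staticmethod))
        )
    )


# Hypothetical stand-ins, not the real pyspark.pandas classes.
class Base:
    def div(self, other):
        return NotImplemented

    # Alias: the same function object under a second name (as in Case A).
    divide = div

    def sort(self):
        return NotImplemented


class Child(Base):
    def div(self, other):  # overridden, so it is listed for Child
        return NotImplemented

    def kurt(self):  # newly declared, so it is listed for Child
        return NotImplemented


print(own_public_members(Base))   # ['div', 'divide', 'sort']
print(own_public_members(Child))  # ['div', 'kurt']
```

Here `sort` and `divide` are not listed for `Child` because they are only inherited. An alias still gets its own entry in the class where it is declared, so it needs either its own document or a link to the document of the original name.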
