[ 
https://issues.apache.org/jira/browse/SPARK-38961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17533699#comment-17533699
 ] 

Hyunwoo Park commented on SPARK-38961:
--------------------------------------

How about this approach, which uses inspect to enumerate the APIs?

{code:python}

from inspect import getmembers, isclass, isfunction
import pandas as pd
from pyspark import pandas as ps

# Automatically list the pyspark.pandas APIs, class by class
ps_classes = [name for name, _ in getmembers(ps, isclass)]
for ps_class in ps_classes:
    for method, _ in getmembers(getattr(ps, ps_class), isfunction):
        print(f"{ps_class}.{method}")

# A list of missing APIs can also be generated automatically:
# start from the classes that exist in both pandas and pyspark.pandas
common_classes = (
    {name for name, _ in getmembers(pd, isclass)}
    & {name for name, _ in getmembers(ps, isclass)}
)
print(common_classes)
# {'Series', 'DataFrame', 'MultiIndex', 'DatetimeIndex', 'NamedAgg', 'Index',
#  'Int64Index', 'TimedeltaIndex', 'CategoricalIndex', 'Float64Index'}

for _class in common_classes:
    not_implemented = set(
        map(lambda x: x[0], getmembers(getattr(pd, _class), isfunction))
    ) - set(
        map(lambda x: x[0], getmembers(getattr(ps, _class), isfunction))
    )

    print(f"class: {_class}")
    print(f"not_implemented: {not_implemented}")

{code}
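
One caveat with the snippet above: isfunction only matches plain functions, so properties (for example DataFrame.shape) and classmethods (which getmembers returns as bound methods) are skipped, while private helpers are included. A minimal sketch of one way to cover those cases follows. The helper names public_api and missing_api and the toy classes are hypothetical, used here only so the example runs without pandas or pyspark installed; this is an illustration, not the approach adopted in Spark.

```python
import inspect


def public_api(cls):
    """Return the public method and property names visible on cls."""
    names = set()
    for name, member in inspect.getmembers(cls):
        if name.startswith("_"):
            continue  # skip private and dunder members
        # On a class, getmembers yields plain functions for instance methods,
        # bound methods for classmethods, and property objects for properties.
        if inspect.isroutine(member) or isinstance(member, property):
            names.add(name)
    return names


def missing_api(reference_cls, candidate_cls):
    """Return sorted names that are public on reference_cls but absent from candidate_cls."""
    return sorted(public_api(reference_cls) - public_api(candidate_cls))


# Hypothetical stand-ins for a pandas class and its pyspark.pandas counterpart.
class Reference:
    def head(self): ...

    def tail(self): ...

    @property
    def shape(self): ...

    @classmethod
    def from_dict(cls, data): ...

    def _private(self): ...


class Candidate:
    def head(self): ...

    @property
    def shape(self): ...


print(missing_api(Reference, Candidate))  # prints ['from_dict', 'tail']
```

With pandas and pyspark installed, the same helper could presumably be called as missing_api(pd.DataFrame, ps.DataFrame) for each class in common_classes; that call is not verified here.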

> Enhance to automatically generate the pandas API support list
> -------------------------------------------------------------
>
>                 Key: SPARK-38961
>                 URL: https://issues.apache.org/jira/browse/SPARK-38961
>             Project: Spark
>          Issue Type: Test
>          Components: PySpark
>    Affects Versions: 3.4.0
>            Reporter: Haejoon Lee
>            Priority: Major
>
> Currently, the supported pandas API list is manually maintained, so it would 
> be better to make the list automatically generated to reduce the maintenance 
> cost.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

