[ https://issues.apache.org/jira/browse/SPARK-34544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17292989#comment-17292989 ]
Maciej Szymkiewicz commented on SPARK-34544:
--------------------------------------------

Overall:

* {{DataFrameLike}} should be replaced with {{pandas.core.frame.DataFrame}} once we have a stable source of annotations from pandas-dev. That was always the intention.
* While I am not enthusiastic about keeping up with Pandas API changes, we could update the protocol by re-exporting Pandas annotations with stubgen.
* Alternatively, we can try to use third-party annotations and provide a setup guide for the users.
* I'd be against removing {{DataFrameLike}} without having a working alternative in place. It won't make the end-user experience better.

> pyspark toPandas() should return pd.DataFrame
> ---------------------------------------------
>
>                 Key: SPARK-34544
>                 URL: https://issues.apache.org/jira/browse/SPARK-34544
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 3.1.1
>            Reporter: Rafal Wojdyla
>            Assignee: Maciej Szymkiewicz
>            Priority: Major
>
> Right now {{toPandas()}} returns {{DataFrameLike}}, which is an incomplete
> "view" of pandas {{DataFrame}}. This leads to cases where mypy reports that
> certain pandas methods are not present in {{DataFrameLike}}, even though those
> methods are valid methods on pandas {{DataFrame}}, which is the actual type
> of the object. Working around this requires type-ignore comments or asserts.
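The workaround mentioned in the issue description (asserts or casts around a return value typed as a narrow protocol) can be sketched roughly as follows. This is only an illustration under stated assumptions: the {{DataFrameLike}} protocol here is a hypothetical, deliberately incomplete stand-in and not the actual PySpark definition, and {{FullDataFrame}}, {{count_rows}} and {{to_pandas_like}} are made-up names standing in for {{pandas.DataFrame}}, a method missing from the protocol, and {{toPandas()}} respectively.

```python
from typing import Any, Dict, List, Protocol, cast


class DataFrameLike(Protocol):
    """Hypothetical, deliberately incomplete protocol (not the PySpark one)."""

    def head(self, n: int = ...) -> Any: ...


class FullDataFrame:
    """Hypothetical stand-in for pandas.DataFrame with more methods than the protocol."""

    def __init__(self, rows: List[Dict[str, int]]) -> None:
        self.rows = rows

    def head(self, n: int = 5) -> "FullDataFrame":
        return FullDataFrame(self.rows[:n])

    def count_rows(self) -> int:
        # Valid on the concrete class, but not declared on DataFrameLike.
        return len(self.rows)


def to_pandas_like() -> DataFrameLike:
    """Stand-in for an API annotated with the narrow protocol
    while actually returning the full concrete object."""
    return FullDataFrame([{"a": 1}, {"a": 2}, {"a": 3}])


result = to_pandas_like()
# result.count_rows()  # a static type checker would flag this line:
#                      # count_rows is not part of DataFrameLike.

# Workaround 1: narrow the static type with an assert.
assert isinstance(result, FullDataFrame)
n_rows = result.count_rows()

# Workaround 2: tell the type checker the concrete type with a cast.
n_rows_cast = cast(FullDataFrame, to_pandas_like()).count_rows()
```

Both workarounds leave runtime behaviour unchanged and only affect what the type checker sees, which is why annotating the return type with the concrete class would remove the need for them.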