[ 
https://issues.apache.org/jira/browse/SPARK-34544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17298016#comment-17298016
 ] 

Rafal Wojdyla commented on SPARK-34544:
---------------------------------------

[~zero323] yea, I see, in nature that suggestion is similar to having asserts, 
or having our own `toPandas(sparkDF)` method with return annotation ` -> 
pd.DataFrame` (which would still be effectively Any, but at least future 
proof). Which is btw the workaround I'm currently most in favour of.

>  I can definitely check viability of 3rd stubs by then.

That would be amazing! Thank you.

> pyspark toPandas() should return pd.DataFrame
> ---------------------------------------------
>
>                 Key: SPARK-34544
>                 URL: https://issues.apache.org/jira/browse/SPARK-34544
>             Project: Spark
>          Issue Type: Sub-task
>          Components: PySpark
>    Affects Versions: 3.1.1
>            Reporter: Rafal Wojdyla
>            Assignee: Maciej Szymkiewicz
>            Priority: Major
>
> Right now {{toPandas()}} returns {{DataFrameLike}}, which is an incomplete 
> "view" of pandas {{DataFrame}}. Which leads to cases like mypy reporting that 
> certain pandas methods are not present in {{DataFrameLike}}, even tho those 
> methods are valid methods on pandas {{DataFrame}}, which is the actual type 
> of the object. This requires type ignore comments or asserts.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to