[GitHub] [spark] linar-jether commented on pull request #29719: [SPARK-32846][SQL][PYTHON] Support createDataFrame from an RDD of pd.DataFrames

GitBox Thu, 10 Sep 2020 22:09:18 -0700


linar-jether commented on pull request #29719:
URL: https://github.com/apache/spark/pull/29719#issuecomment-690876526



   Thank you @HyukjinKwon. The issue is that `.mapInPandas` only applies to 
DataFrames, which means its input is limited to Spark-supported types. 
   So it does not cover use cases such as `RDD[python_object] -> 
obj_to_pandas_df() -> create Spark DF`, or reading an RDD of pickle files and 
converting them to a Spark DF.
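
   For illustration, the first use case can be approximated today with a 
row-based workaround. This is only a sketch, not the approach in this PR: 
`obj_to_pandas_df`, `pandas_df_to_rows`, `to_spark_df` and `COLUMNS` are 
hypothetical names, and each RDD element is assumed to be a dict of 
equal-length lists. It converts rows one by one instead of passing whole 
pandas DataFrames, so it is probably slower than a direct conversion would be.

```python
# Sketch only: every name below is hypothetical and chosen for illustration;
# none of them is part of Spark or of this PR.
COLUMNS = ["name", "value"]


def obj_to_pandas_df(obj):
    """Turn one arbitrary Python object into a pandas DataFrame.

    Assumes, for illustration, that obj is a dict of equal-length lists
    keyed by the names in COLUMNS.
    """
    import pandas as pd  # imported lazily so it resolves on the executors

    return pd.DataFrame(obj, columns=COLUMNS)


def pandas_df_to_rows(pdf):
    """Flatten a pandas DataFrame into plain tuples, one per row."""
    return list(pdf.itertuples(index=False, name=None))


def to_spark_df(spark, rdd):
    """Build a Spark DataFrame from an RDD of arbitrary Python objects.

    spark is an existing SparkSession; rdd is an existing RDD whose
    elements obj_to_pandas_df knows how to convert.
    """
    rows = rdd.map(obj_to_pandas_df).flatMap(pandas_df_to_rows)
    # Column types are inferred from the data; only names are supplied.
    return spark.createDataFrame(rows, schema=COLUMNS)
```

   With an active session this would be called as `to_spark_df(spark, 
sc.parallelize(objs))`, where `objs` is a list of such dicts.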
   
   I believe this change can enable more seamless integration with Python 
packages that do not natively support Spark.


