I'm looking for a mapPartition(pandas_udf) for a PySpark DataFrame. What I would like to write is:

```
@pandas_udf(df.schema, PandasUDFType.MAP)
def do_nothing(pandas_df):
    return pandas_df

new_df = df.mapPartition(do_nothing)
```

pandas_udf only supports SCALAR or GROUPED_MAP. Why not support a plain MAP as well?

On Thu, Mar 7, 2019 at 2:57 PM Sean Owen <sro...@gmail.com> wrote:

> Are you looking for @pandas_udf in Python? Or just mapPartition? Those
> exist already
>
> On Thu, Mar 7, 2019, 1:43 PM peng yu <yupb...@gmail.com> wrote:
>
>> There is a nice map_partition function in R, `dapply`, so that the user can
>> pass a row to a udf.
>>
>> I'm wondering why we don't have that in Python?
>>
>> I'm trying to have a map_partition function with pandas_udf supported
>>
>> thanks!
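
Coming back to the question at the top of this message: the closest thing I can see with the existing GROUPED_MAP type is to group by the physical partition id, so that each group handed to the pandas function corresponds to one partition. This is only a rough sketch, not an existing API: the helper name `map_partitions_with_pandas` is made up for illustration, and it assumes a Spark version with GROUPED_MAP pandas UDFs and PyArrow installed. Unlike a real mapPartition, the groupBy adds a shuffle, and each partition is materialized as a single pandas DataFrame in memory.

```python
def do_nothing(pandas_df):
    # Receives one pandas.DataFrame per group and must return a pandas.DataFrame.
    return pandas_df


def map_partitions_with_pandas(df, func, schema):
    """Hypothetical helper: apply `func` once per input partition of `df`.

    Emulates a per-partition pandas UDF by grouping on spark_partition_id().
    """
    # Imported here so the plain pandas function above can be defined and
    # unit-tested without a Spark installation.
    from pyspark.sql.functions import (
        PandasUDFType,
        pandas_udf,
        spark_partition_id,
    )

    grouped_udf = pandas_udf(func, schema, PandasUDFType.GROUPED_MAP)
    # Rows that share a partition id end up in the same group.
    return df.groupBy(spark_partition_id()).apply(grouped_udf)
```

Usage would then look like `new_df = map_partitions_with_pandas(df, do_nothing, df.schema)`, which is why a MAP type without the extra shuffle would still be nicer.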