GitHub user icexelloss commented on a diff in the pull request:
https://github.com/apache/spark/pull/19630#discussion_r151447943
--- Diff: python/pyspark/worker.py ---
@@ -89,6 +90,26 @@ def verify_result_length(*a):
     return lambda *a: (verify_result_length(*a), arrow_return_type)
 
+def wrap_pandas_group_map_udf(f, return_type):
+    def wrapped(*series):
+        import pandas as pd
+
+        result = f(pd.concat(series, axis=1))
--- End diff --
Each series carries its own `name` attribute, and `pd.concat` uses those
names as the column names of the resulting DataFrame.
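
A minimal standalone sketch of that behavior, outside of Spark. The series names `id` and `v` and the sample values are made up for illustration; only `pd.Series` and `pd.concat` come from pandas itself:

```python
import pandas as pd

# Hypothetical input columns; the names are illustrative only.
ids = pd.Series([1, 1, 2], name="id")
vals = pd.Series([0.5, 1.5, 2.5], name="v")

# Concatenating along axis=1 builds a DataFrame whose column labels
# are taken from each series' `name` attribute.
named = pd.concat([ids, vals], axis=1)
named_columns = list(named.columns)

# For contrast: when none of the series has a name, pandas falls back
# to integer column labels.
unnamed = pd.concat([pd.Series([1, 2]), pd.Series([3, 4])], axis=1)
unnamed_columns = list(unnamed.columns)

print(named_columns)
print(unnamed_columns)
```

So as long as the series passed to `wrapped` already have their names set, no explicit renaming step should be needed after the `pd.concat` call.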
---