EnricoMi commented on code in PR #39902:
URL: https://github.com/apache/spark/pull/39902#discussion_r1102777270
##########
python/pyspark/worker.py:
##########
@@ -208,6 +208,41 @@ def wrapped(left_key_series, left_value_series, right_key_series, right_value_se
return lambda kl, vl, kr, vr: [(wrapped(kl, vl, kr, vr),
to_arrow_type(return_type))]
+def wrap_multi_cogrouped_map_pandas_udf(f, return_type, runner_conf, argspec):
+ def wrapped(key_series, value_series):
+ import pandas as pd
+
+ dfs = [pd.concat(series, axis=1) for series in value_series]
+
+ if runner_conf.get("pass_key") == "true":
Review Comment:
I think `pass_key` is not needed here, as we have `dfs` and `argspec.args`:
when `len(argspec.args) == len(dfs)`, no key is expected; when
`len(argspec.args) == len(dfs) + 1`, the key is expected.
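
The suggestion above could be sketched as follows. This is a hypothetical standalone helper, not the actual patch: it only illustrates inferring key-passing from the UDF's argument count via `inspect.getfullargspec`, instead of reading a `pass_key` runner conf.

```python
from inspect import getfullargspec


def wrap_multi_cogrouped_map(f):
    # Hypothetical sketch of the reviewer's idea: compare the UDF's
    # declared argument count against the number of grouped DataFrames.
    argspec = getfullargspec(f)

    def wrapped(key, dfs):
        if len(argspec.args) == len(dfs) + 1:
            # One extra positional argument: the UDF wants the group key.
            return f(key, *dfs)
        # Argument count matches the DataFrames: no key expected.
        return f(*dfs)

    return wrapped
```

With this approach a UDF `lambda left, right: ...` is called without the key, while `lambda key, left, right: ...` receives it, and no extra runner configuration needs to be threaded through.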
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]