EnricoMi commented on code in PR #39902:
URL: https://github.com/apache/spark/pull/39902#discussion_r1103004299
##########
python/pyspark/worker.py:
##########
@@ -208,6 +208,41 @@ def wrapped(left_key_series, left_value_series, right_key_series, right_value_se
         return lambda kl, vl, kr, vr: [(wrapped(kl, vl, kr, vr), to_arrow_type(return_type))]
+def wrap_multi_cogrouped_map_pandas_udf(f, return_type, runner_conf, argspec):
+ def wrapped(key_series, value_series):
+ import pandas as pd
+
+ dfs = [pd.concat(series, axis=1) for series in value_series]
+
+ if runner_conf.get("pass_key") == "true":
Review Comment:
With "var-args" do you mean
```python
def func(*pdfs: pd.DataFrame) -> pd.DataFrame: ...
```
```python
def func_with_key(key, *pdfs: pd.DataFrame) -> pd.DataFrame: ...
```
Then `len(argspec.args)` will be `0` for `func` and `1` for
`func_with_key`, and `argspec.args` will not be `None` in either case. So
restricting the var-args case to the two signatures above (i.e. not allowing
`def func(pdf: pd.DataFrame, *pdfs: pd.DataFrame) -> pd.DataFrame`) should
make `pass_key` redundant.
Can you give an example of what you mean by "explicitness"?
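For reference, the `argspec` behavior relied on above can be checked directly with `inspect.getfullargspec` (a minimal sketch; the `pd.DataFrame` annotations are omitted so it runs without pandas):
```python
import inspect

def func(*pdfs): ...
def func_with_key(key, *pdfs): ...

# `args` only lists named positional parameters, not the *var-args parameter:
assert len(inspect.getfullargspec(func).args) == 0
assert len(inspect.getfullargspec(func_with_key).args) == 1
# the var-args parameter is reported separately, via `varargs`:
assert inspect.getfullargspec(func).varargs == "pdfs"
assert inspect.getfullargspec(func_with_key).varargs == "pdfs"
```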
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]