GitHub user icexelloss commented on a diff in the pull request:
https://github.com/apache/spark/pull/20295#discussion_r172249968
--- Diff: python/pyspark/worker.py ---
@@ -149,18 +156,30 @@ def read_udfs(pickleSer, infile, eval_type):
     num_udfs = read_int(infile)
     udfs = {}
     call_udf = []
-    for i in range(num_udfs):
+    mapper_str = ""
+    if eval_type == PythonEvalType.SQL_GROUPED_MAP_PANDAS_UDF:
+        # Create function like this:
+        #   lambda a: f([a[0]], [a[0], a[1]])
+        assert num_udfs == 1
--- End diff --
Added. Hopefully it's clear enough?
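
For readers following the thread, here is a minimal sketch of how a mapper of the shape shown in the comment could be built from a string and evaluated. It is an illustration only, not the code in this PR: `build_mapper`, `key_offsets`, and `value_offsets` are hypothetical names, and the sketch assumes the first list holds grouping-key columns and the second holds the columns passed to the UDF.

```python
def build_mapper(f, key_offsets, value_offsets):
    """Build the source text of a mapper lambda and evaluate it.

    Hypothetical helper for illustration; the offsets index into the
    list of input columns `a` that the lambda receives.
    """
    keys = ", ".join("a[%d]" % o for o in key_offsets)
    vals = ", ".join("a[%d]" % o for o in value_offsets)
    mapper_str = "lambda a: f([%s], [%s])" % (keys, vals)
    # `f` is resolved through the globals dict handed to eval().
    return mapper_str, eval(mapper_str, {"f": f})


# Stand-in for the user function: just echo the two argument lists.
def echo(keys, values):
    return keys, values


mapper_str, mapper = build_mapper(echo, [0], [0, 1])
print(mapper_str)
print(mapper(["k", "v"]))
```

With offsets `[0]` and `[0, 1]` the generated source matches the lambda in the comment, so column 0 appears both as the key and as part of the values.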