[
https://issues.apache.org/jira/browse/SPARK-54598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ruifeng Zheng reassigned SPARK-54598:
-------------------------------------
Assignee: Yicong Huang
> Refactor UDF fetching logic out from invocation
> -----------------------------------------------
>
> Key: SPARK-54598
> URL: https://issues.apache.org/jira/browse/SPARK-54598
> Project: Spark
> Issue Type: Task
> Components: PySpark
> Affects Versions: 4.2.0
> Reporter: Yicong Huang
> Assignee: Yicong Huang
> Priority: Major
> Labels: pull-request-available
>
> The current implementation has redundant UDF reading logic scattered
> throughout `read_udfs()`:
> **Single UDF pattern** (repeated in multiple branches):
> {code:python}
> arg_offsets, f = read_single_udf(
>     pickleSer, infile, eval_type, runner_conf, udf_index=0, profiler=profiler
> )
> parsed_offsets = extract_key_value_indexes(arg_offsets)  # when needed
> {code}
> **Multiple UDFs pattern** (repeated in multiple branches):
> {code:python}
> udfs = []
> for i in range(num_udfs):
>     udfs.append(
>         read_single_udf(
>             pickleSer, infile, eval_type, runner_conf, udf_index=i,
>             profiler=profiler
>         )
>     )
> {code}
>
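One possible shape for the refactor, offered only as a sketch: the two repeated patterns above could collapse into a pair of helpers that fetch every UDF up front, so the invocation branches only consume the result. The names `fetch_udfs` and `fetch_single_udf` are hypothetical and do not come from the issue or the Spark codebase, and `read_single_udf` below is a stand-in stub so the example runs on its own. The actual change in the pull request may look quite different.

```python
from typing import Any, Callable, List, Tuple

UdfEntry = Tuple[List[int], Callable[..., Any]]


def read_single_udf(pickleSer, infile, eval_type, runner_conf, udf_index, profiler=None):
    # Stand-in stub (assumption): the real function lives in PySpark's worker
    # module and deserializes a UDF from `infile`. Here it just returns
    # placeholder argument offsets and a function that echoes its arguments.
    return [udf_index], (lambda *args: args)


def fetch_udfs(pickleSer, infile, eval_type, runner_conf, num_udfs, profiler=None) -> List[UdfEntry]:
    # Hypothetical helper: read all UDFs in one place, before any invocation logic.
    return [
        read_single_udf(
            pickleSer, infile, eval_type, runner_conf, udf_index=i, profiler=profiler
        )
        for i in range(num_udfs)
    ]


def fetch_single_udf(pickleSer, infile, eval_type, runner_conf, profiler=None) -> UdfEntry:
    # Hypothetical helper: the single-UDF case is the multi-UDF case with num_udfs=1.
    return fetch_udfs(pickleSer, infile, eval_type, runner_conf, 1, profiler=profiler)[0]


# Usage with placeholder arguments (the stub ignores them).
udfs = fetch_udfs(None, None, 0, None, num_udfs=3)
arg_offsets, f = fetch_single_udf(None, None, 0, None)
```

With helpers along these lines, each branch of `read_udfs()` would call one fetch function instead of repeating the loop, and steps such as `extract_key_value_indexes(arg_offsets)` would stay in the branches that need them.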