Tim Armstrong created SPARK-55073:
-------------------------------------
Summary: EvalPythonUDTFExec captures SparkPlan and sends it to
executor
Key: SPARK-55073
URL: https://issues.apache.org/jira/browse/SPARK-55073
Project: Spark
Issue Type: Bug
Components: PySpark
Affects Versions: 4.0.0
Reporter: Tim Armstrong
The `semanticEquals` call here captures the SparkPlan and sends it to the
executor:
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/python/EvalPythonUDTFExec.scala
I can easily fix this; I just need to hoist this logic out of the closure:
```
// flatten all the arguments
val allInputs = new ArrayBuffer[Expression]
val dataTypes = new ArrayBuffer[DataType]
val argMetas = udtf.children.zip(
  udtf.tableArguments.getOrElse(Seq.fill(udtf.children.length)(false))
).map { case (e: Expression, isTableArg: Boolean) =>
  val (key, value) = e match {
    case NamedArgumentExpression(key, value) =>
      (Some(key), value)
    case _ =>
      (None, e)
  }
  if (allInputs.exists(_.semanticEquals(value))) {
    ArgumentMetadata(allInputs.indexWhere(_.semanticEquals(value)), key,
      isTableArg)
  } else {
    allInputs += value
    dataTypes += value.dataType
    ArgumentMetadata(allInputs.length - 1, key, isTableArg)
  }
}.toArray
```
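To illustrate the capture and the proposed hoisting outside of Spark, here is a minimal sketch. `FakePlanNode`, `indexOfInput`, `capturingClosure`, `hoistedClosure`, and `ClosureCaptureDemo` are hypothetical names made up for this example and are not part of Spark. The stand-in class is deliberately not `Serializable` so that the capture shows up as a serialization failure. A real SparkPlan is serializable, so there the capture would succeed silently and the plan would simply be shipped along with the task. The sketch also assumes Scala 2.12 or later, where function literals are serializable.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical stand-in for a plan node. It is deliberately NOT Serializable,
// so any closure that captures it fails Java serialization.
class FakePlanNode(val inputs: Seq[String]) {
  // Instance method: calling it from inside a lambda makes the lambda capture `this`.
  def indexOfInput(name: String): Int = inputs.indexOf(name)

  // Problematic shape: the lambda body references an instance member,
  // so the whole FakePlanNode is part of the closure.
  def capturingClosure: String => Int =
    (name: String) => indexOfInput(name)

  // Hoisted shape: the lookup table is computed before the lambda is created,
  // and the lambda only captures the local immutable Map.
  def hoistedClosure: String => Int = {
    val index: Map[String, Int] = inputs.zipWithIndex.toMap
    (name: String) => index.getOrElse(name, -1)
  }
}

object ClosureCaptureDemo {
  // Returns true if `obj` (and everything it references) can be Java-serialized.
  def isSerializable(obj: AnyRef): Boolean = {
    val out = new ObjectOutputStream(new ByteArrayOutputStream())
    try {
      out.writeObject(obj)
      true
    } catch {
      case _: NotSerializableException => false
    } finally {
      out.close()
    }
  }
}
```

With `val node = new FakePlanNode(Seq("a", "b"))`, `ClosureCaptureDemo.isSerializable(node.capturingClosure)` should report a failure because the closure drags the node along, while the same check on `node.hoistedClosure` should pass because only the precomputed map travels with it. The actual fix in `EvalPythonUDTFExec` would follow the same idea: compute the argument metadata once, outside the function that runs on executors, and let the closure reference only the resulting values.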