Tim Armstrong created SPARK-55073:
-------------------------------------

             Summary: EvalPythonUDTFExec captures SparkPlan and sends it to executor
                 Key: SPARK-55073
                 URL: https://issues.apache.org/jira/browse/SPARK-55073
             Project: Spark
          Issue Type: Bug
          Components: PySpark
    Affects Versions: 4.0.0
            Reporter: Tim Armstrong


The `semanticEquals` call here captures the SparkPlan and sends it to the executor:

https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/python/EvalPythonUDTFExec.scala

I can easily fix this; I just need to hoist this logic out of the closure:

```

      // flatten all the arguments
      val allInputs = new ArrayBuffer[Expression]
      val dataTypes = new ArrayBuffer[DataType]
      val argMetas = udtf.children.zip(
        udtf.tableArguments.getOrElse(Seq.fill(udtf.children.length)(false))
      ).map { case (e: Expression, isTableArg: Boolean) =>
        val (key, value) = e match {
          case NamedArgumentExpression(key, value) =>
            (Some(key), value)
          case _ =>
            (None, e)
        }
        if (allInputs.exists(_.semanticEquals(value))) {
          ArgumentMetadata(allInputs.indexWhere(_.semanticEquals(value)), key, isTableArg)
        } else {
          allInputs += value
          dataTypes += value.dataType
          ArgumentMetadata(allInputs.length - 1, key, isTableArg)
        }
      }.toArray
```
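
To illustrate what hoisting the logic out of the closure means, here is a minimal, Spark-free sketch. `FakeExec`, `ArgMeta`, `computeArgMetas`, and `ClosureCheck` are hypothetical names invented for this example; they are not the actual `EvalPythonUDTFExec` code or the eventual patch. The stand-in class is deliberately not `Serializable`, so that capturing `this` shows up as a serialization failure. A real SparkPlan would instead be serialized and shipped silently, which is the problem described above.

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}
import scala.collection.mutable.ArrayBuffer
import scala.util.Try

// Hypothetical stand-in for ArgumentMetadata.
final case class ArgMeta(offset: Int, name: Option[String], isTableArg: Boolean)

// Hypothetical stand-in for a physical plan node; deliberately NOT Serializable.
// Each argument is (optional name, value, isTableArg).
class FakeExec(val args: Seq[(Option[String], String, Boolean)]) {

  // Deduplicate argument values and record, per argument, the offset of its input.
  def computeArgMetas(): Array[ArgMeta] = {
    val allInputs = new ArrayBuffer[String]
    args.map { case (name, value, isTableArg) =>
      val existing = allInputs.indexOf(value)
      if (existing >= 0) {
        ArgMeta(existing, name, isTableArg)
      } else {
        allInputs += value
        ArgMeta(allInputs.length - 1, name, isTableArg)
      }
    }.toArray
  }

  // Problematic shape: the closure body calls an instance method,
  // so the closure holds a reference to `this` (the whole node).
  def capturingTask: () => Int = () => computeArgMetas().length

  // Hoisted shape: the metadata is computed eagerly outside the closure,
  // which then captures only the local array.
  def hoistedTask: () => Int = {
    val argMetas = computeArgMetas()
    () => argMetas.length
  }
}

object ClosureCheck {
  // Returns true if Java serialization of `obj` succeeds.
  def serializes(obj: AnyRef): Boolean = Try {
    val out = new ObjectOutputStream(new ByteArrayOutputStream())
    try out.writeObject(obj) finally out.close()
  }.isSuccess
}
```

Under these assumptions, `ClosureCheck.serializes(exec.hoistedTask)` should succeed, because only the array of case-class instances travels with the closure, whereas `ClosureCheck.serializes(exec.capturingTask)` should fail because the closure drags the non-serializable `FakeExec` along.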



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
