Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19683#discussion_r158436309

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/GenerateExec.scala ---
    @@ -59,13 +61,21 @@ case class GenerateExec(
         generator: Generator,
         join: Boolean,
         outer: Boolean,
    +    omitGeneratorChild: Boolean,
         generatorOutput: Seq[Attribute],
         child: SparkPlan)
       extends UnaryExecNode with CodegenSupport {
    +  private def projectedChildOutput = generator match {
    +    case g: UnaryExpression if omitGeneratorChild =>
    --- End diff --

    Why limit this to `UnaryExpression`? Suppose we add an array concat function in the future: for `explode(array_concat(col1, col2))`, we should be able to omit both `col1` and `col2`.

    I'd like to add an `omitGeneratorReferences` parameter, so that this can be simplified to
    ```
    private def requiredChildOutput = if (omitGeneratorReferences) {
      val generatorReferences = generator.references
      child.output.filterNot(generatorReferences.contains)
    } else {
      child.output
    }
    ```
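
To illustrate the suggestion above, here is a minimal, self-contained sketch that does not depend on Spark. `Attr`, `Expr`, `Col`, `ArrayConcat`, `Explode`, and the `RequiredChildOutput` object are hypothetical stand-ins for the Catalyst types, and `ArrayConcat` models the array concat function that the comment only imagines. It only shows that filtering by `references` handles a generator with several input columns, which a `UnaryExpression` match would not cover.

```scala
// Hypothetical stand-ins for Catalyst's Attribute and Expression; not Spark's real classes.
final case class Attr(name: String)

sealed trait Expr {
  // All attributes this expression reads, directly or through its children.
  def references: Set[Attr]
}

final case class Col(attr: Attr) extends Expr {
  def references: Set[Attr] = Set(attr)
}

// Hypothetical multi-child expression, standing in for a future array concat function.
final case class ArrayConcat(children: Seq[Expr]) extends Expr {
  def references: Set[Attr] = children.flatMap(_.references).toSet
}

// Simplified generator with a single child expression.
final case class Explode(child: Expr) extends Expr {
  def references: Set[Attr] = child.references
}

object RequiredChildOutput {
  // Mirrors the shape of the suggested requiredChildOutput: drop every child
  // attribute the generator references, regardless of how many there are.
  def requiredChildOutput(
      childOutput: Seq[Attr],
      generator: Expr,
      omitGeneratorReferences: Boolean): Seq[Attr] =
    if (omitGeneratorReferences) {
      val generatorReferences = generator.references
      childOutput.filterNot(generatorReferences.contains)
    } else {
      childOutput
    }
}
```

With a child output of `id`, `col1`, `col2` and a generator shaped like `explode(array_concat(col1, col2))`, this sketch keeps only `id` when the flag is set and the full child output otherwise.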