rdblue commented on a change in pull request #28027:
URL: https://github.com/apache/spark/pull/28027#discussion_r512961848
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala
##########
@@ -880,6 +880,12 @@ case class SubqueryAlias(
val qualifierList = identifier.qualifier :+ alias
child.output.map(_.withQualifier(qualifierList))
}
+
+ override def metadataOutput: Seq[Attribute] = {
+ val qualifierList = identifier.qualifier :+ alias
+ child.metadataOutput.map(_.withQualifier(qualifierList))
+ }
Review comment:
They are _eventually_ part of the output, but they can't be at first
because `*` expansion uses all of `output`. If we added them immediately,
`select *` would return the metadata columns.
Instead, we expose the metadata columns through `metadataOutput` and update
column resolution to look up columns there as well. The result is that we can
resolve everything just like normal, including `*`, but the metadata columns
are still missing from `output`. Then the new analyzer rule adds the columns to
the output if they are resolved but missing. Since the parent node is already
resolved at that point, we know that this is safe and happens after `*` expansion.
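
To make the ordering concrete, here is a minimal toy sketch of the idea, not the code in this PR. `Attr`, `Relation`, `Alias`, and `Resolver` are hypothetical stand-ins for the Catalyst classes, and the `_file` metadata column is an assumed example name. It only illustrates that `*` expansion reads `output`, name resolution also consults `metadataOutput`, and a later step finds resolved references that are missing from `output`.

```scala
// Toy model only: hypothetical stand-ins, not Spark's Catalyst classes.
final case class Attr(name: String, qualifier: Seq[String] = Nil) {
  def withQualifier(q: Seq[String]): Attr = copy(qualifier = q)
}

sealed trait Plan {
  def output: Seq[Attr]         // regular columns, used by `*` expansion
  def metadataOutput: Seq[Attr] // metadata columns, hidden from `*`
}

final case class Relation(output: Seq[Attr], metadataOutput: Seq[Attr]) extends Plan

// Mirrors the shape of the diff: both outputs get the alias as qualifier.
final case class Alias(alias: String, child: Plan) extends Plan {
  def output: Seq[Attr] = child.output.map(_.withQualifier(Seq(alias)))
  def metadataOutput: Seq[Attr] = child.metadataOutput.map(_.withQualifier(Seq(alias)))
}

object Resolver {
  // `*` expansion only sees `output`, so metadata columns never leak into it.
  def expandStar(child: Plan): Seq[Attr] = child.output

  // Name resolution checks `output` first, then falls back to `metadataOutput`.
  def resolve(name: String, child: Plan): Option[Attr] =
    (child.output ++ child.metadataOutput).find(_.name == name)

  // Stand-in for the later rule: resolved references that come from
  // `metadataOutput` but are not yet part of `output`.
  def missingFromOutput(refs: Seq[Attr], child: Plan): Seq[Attr] =
    refs.filter(r => child.metadataOutput.contains(r) && !child.output.contains(r))
}

object Demo {
  def main(args: Array[String]): Unit = {
    val rel = Relation(Seq(Attr("id"), Attr("data")), Seq(Attr("_file")))
    val t = Alias("t", rel)
    println(Resolver.expandStar(t).map(_.name))               // star sees regular columns only
    val file = Resolver.resolve("_file", t)                   // still resolves by name
    println(Resolver.missingFromOutput(file.toSeq, t).map(_.name))
  }
}
```

In this sketch the "add to output" step is reduced to computing which attributes would need to be added; in the real analyzer that would be a plan rewrite, which is omitted here.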