bart-samwel commented on a change in pull request #28027:
URL: https://github.com/apache/spark/pull/28027#discussion_r517582852
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala
##########
@@ -48,6 +48,15 @@ case class DataSourceV2Relation(
import DataSourceV2Implicits._
+ override lazy val metadataOutput: Seq[AttributeReference] = table match {
+ case hasMeta: SupportsMetadataColumns =>
+ val attrs = hasMeta.metadataColumns
+ val outputNames = outputSet.map(_.name).toSet
+ attrs.filterNot(col => outputNames.contains(col.name)).toAttributes
Review comment:
An alternative would be to give the metadata columns a recognizable name
pattern that we can forbid in ordinary column names. E.g. Snowflake uses
`metadata$foo` for these columns, which is much easier to distinguish than
just a leading underscore. We could then reject normal identifiers that match
the pattern (basically check for a `metadata$` prefix) to prevent the
round-tripping-via-file issues, while accesses would otherwise still look like
normal column references.
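
To make the prefix check concrete, here is a rough sketch of what the validation could look like. Everything in it is hypothetical: `MetadataColumnNames`, `ReservedPrefix`, `isReserved` and `validate` are illustrative names, not existing Spark APIs, and the case-insensitive comparison is an assumption rather than something settled in this thread.

```scala
// Hypothetical sketch; none of these names exist in Spark.
object MetadataColumnNames {
  // Assumed reserved prefix, borrowed from the Snowflake-style `metadata$foo` naming.
  val ReservedPrefix = "metadata$"

  // True if the name falls in the reserved metadata namespace.
  // Comparing case-insensitively is an assumption made for this sketch.
  def isReserved(name: String): Boolean =
    name.toLowerCase(java.util.Locale.ROOT).startsWith(ReservedPrefix)

  // Rejects a list of ordinary column names if any of them uses the reserved prefix.
  def validate(columnNames: Seq[String]): Either[String, Seq[String]] = {
    val bad = columnNames.filter(isReserved)
    if (bad.isEmpty) Right(columnNames)
    else Left(s"Column names may not start with '$ReservedPrefix': ${bad.mkString(", ")}")
  }
}
```

With a check like this applied wherever a table schema is created or written, a file could never contain a data column named e.g. `metadata$file_name`, so reading it back could not shadow the metadata column of the same name.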