[GitHub] [spark] bart-samwel commented on a change in pull request #28027: [SPARK-31255][SQL] Add SupportsMetadataColumns to DSv2

GitBox Wed, 04 Nov 2020 11:33:20 -0800


bart-samwel commented on a change in pull request #28027:
URL: https://github.com/apache/spark/pull/28027#discussion_r517582852




##########
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala
##########
@@ -48,6 +48,15 @@ case class DataSourceV2Relation(
 
   import DataSourceV2Implicits._
 
+  override lazy val metadataOutput: Seq[AttributeReference] = table match {
+    case hasMeta: SupportsMetadataColumns =>
+      val attrs = hasMeta.metadataColumns
+      val outputNames = outputSet.map(_.name).toSet
+      attrs.filterNot(col => outputNames.contains(col.name)).toAttributes

Review comment:
       An alternative would be to make the column names use a recognizable 
pattern that we can forbid as column names. E.g. Snowflake uses `metadata$foo` 
for these columns, which is a lot more distinguishable than just a leading 
underscore. Hence, we could forbid using these as normal identifiers (basically 
check for a `metadata$` prefix) to prevent the round-tripping-via-file issues, 
but other than that the accesses would still look like normal column references.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[GitHub] [spark] bart-samwel commented on a change in pull request #28027: [SPARK-31255][SQL] Add SupportsMetadataColumns to DSv2

Reply via email to