rdblue commented on a change in pull request #26957: [SPARK-30314] Add identifier and catalog information to DataSourceV2Relation
URL: https://github.com/apache/spark/pull/26957#discussion_r366634300
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala
 ##########
 @@ -32,12 +32,19 @@ import org.apache.spark.util.Utils
  * A logical plan representing a data source v2 table.
  *
  * @param table   The table that this relation represents.
+ * @param output the output attributes of this relation
+ * @param catalogIdentifier the string identifier for the catalog. None if no catalog is specified
+ * @param identifiers the identifiers for the v2 relation. For multipath dataframe, there could be
+ *                    more than one identifier or Nil if a V2 relation is instantiated using
+ *                    options
  * @param options The options for this table operation. It's used to create fresh [[ScanBuilder]]
  *                and [[WriteBuilder]].
  */
 case class DataSourceV2Relation(
     table: Table,
     output: Seq[AttributeReference],
+    catalogIdentifier: Option[String],
+    identifiers: Seq[Identifier],
 
 Review comment:
   I don't think it is a good idea to have multiple identifiers here. DSv2 
doesn't yet cover how file-based tables should work and I think we need a 
design document for them. Adding multiple identifiers here in support of 
something that has undefined behavior seems premature.
   
   Design and behavior of path-based identifiers aside, a table should use one 
and only one identifier. When path-based tables are supported, I expect them to 
use a single `Identifier` with possibly more than one path embedded in it, like 
we do with the `paths` key.
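   
   To illustrate the single-identifier shape described above, here is a minimal sketch. It is not the design for path-based tables, which is still undefined. `SimpleIdentifier`, `PathIdentifier`, `RelationSketch`, and the comma-joined encoding are all hypothetical stand-ins chosen for illustration; none of them are Spark APIs.

```scala
// Hypothetical stand-in for an identifier type; NOT Spark's Identifier interface.
final case class SimpleIdentifier(namespace: Seq[String], name: String)

object PathIdentifier {
  // Assumed separator for this sketch only. Real paths may contain commas,
  // so an actual encoding would need escaping or a structured format.
  private val Sep = ","

  // Embed one or more paths in a single identifier.
  def fromPaths(paths: Seq[String]): SimpleIdentifier = {
    require(paths.nonEmpty, "at least one path is required")
    SimpleIdentifier(namespace = Seq.empty, name = paths.mkString(Sep))
  }

  // Recover the embedded paths from the identifier.
  def paths(id: SimpleIdentifier): List[String] = id.name.split(Sep).toList
}

// The relation carries at most one identifier instead of a Seq of them.
final case class RelationSketch(
    catalogIdentifier: Option[String],
    identifier: Option[SimpleIdentifier])
```

   With this shape, a relation created from options alone would use `identifier = None`, and a multi-path read would still be represented by exactly one identifier.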
