yuchenhuo commented on a change in pull request #26957: [SPARK-30314] Add
identifier and catalog information to DataSourceV2Relation
URL: https://github.com/apache/spark/pull/26957#discussion_r366659477
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala
##########
@@ -32,12 +32,19 @@ import org.apache.spark.util.Utils
* A logical plan representing a data source v2 table.
*
* @param table The table that this relation represents.
+ * @param output the output attributes of this relation
+ * @param catalogIdentifier the string identifier for the catalog. None if no catalog is specified
+ * @param identifiers the identifiers for the v2 relation. For multipath dataframe, there could be
+ * more than one identifier or Nil if a V2 relation is instantiated using
+ * options
* @param options The options for this table operation. It's used to create fresh [[ScanBuilder]]
* and [[WriteBuilder]].
*/
case class DataSourceV2Relation(
table: Table,
output: Seq[AttributeReference],
+ catalogIdentifier: Option[String],
+ identifiers: Seq[Identifier],
Review comment:
I see. If the specification is that there should be exactly one identifier,
shall I define it as `identifier: Identifier` instead of
`identifier: Option[Identifier]`? The tricky part is still
`load(paths: String*)`: if we choose not to use `Option`, I might need some
placeholder or `null` there. What do you think? cc @brkyvz
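
To make the trade-off concrete, here is a minimal sketch of the `Option` variant. It uses simplified stand-in types, not the real Spark classes: `RelationSketch`, `RelationSketches`, `fromCatalog`, `fromPaths`, and `describe` are hypothetical names for illustration only, and the `Identifier` here is a plain case class rather than Spark's connector interface.

```scala
// Hypothetical stand-ins for illustration; these are NOT the real Spark classes.
final case class Identifier(namespace: Seq[String], name: String)

final case class RelationSketch(
    tableName: String,
    catalogIdentifier: Option[String],
    identifier: Option[Identifier])

object RelationSketches {
  // Table resolved through a catalog: catalog name and identifier are both known.
  def fromCatalog(catalog: String, ident: Identifier): RelationSketch =
    RelationSketch(ident.name, Some(catalog), Some(ident))

  // Path-based load (the `load(paths: String*)` case): there is no single
  // identifier, so the field is None rather than a placeholder or null.
  def fromPaths(paths: Seq[String]): RelationSketch =
    RelationSketch(paths.mkString(","), None, None)

  // Callers must handle the missing-identifier case explicitly.
  def describe(r: RelationSketch): String = r.identifier match {
    case Some(id) =>
      ((r.catalogIdentifier.toList ++ id.namespace) :+ id.name).mkString(".")
    case None =>
      s"<unidentified: ${r.tableName}>"
  }
}
```

With `Option`, the compiler forces every consumer to deal with the path-based case; with a placeholder or `null`, that case would only be caught by convention.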