yuchenhuo commented on issue #26957: [SPARK-30314] Add identifier and catalog information to DataSourceV2Relation
URL: https://github.com/apache/spark/pull/26957#issuecomment-574957783

@rdblue @brkyvz Based on the discussion here, I think I understand this a bit better. The key problem we are trying to solve is that, unlike many other V2Commands (https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala#L152), `DataSourceV2Relation` only includes the table information, which doesn't contain the catalog and namespace info.

Previously, I think I worried too much about whether `CatalogPlugin` or `Table` would implement name correctly. However, it seems that under the current design the resolved path is generated through some customized parser anyway, e.g. `TableProvider`. So we have no choice but to rely on that kind of information.

I think there are two final candidate solutions:

1. Add just a `Seq[String]` to `DataSourceV2Relation`, so that we would not depend too much on the `Identifier` and `CatalogPlugin` implementations. In terms of implementation, we would extract the table name, namespaces, and catalog name from `Identifier` and `CatalogPlugin` and put them into a Seq.
2. Add both `Identifier` and `CatalogPlugin` to `DataSourceV2Relation`, in the same way as many other V2Commands, e.g. `CreateV2Table` and `AlterTable`: https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala#L152

I'm not sure I understand it correctly, but it seems that the Analyzer's job is to resolve the unresolved path strings to an actual table, so I feel it makes sense for the Analyzer to pass the resolved `Identifier` and `CatalogPlugin` on to the following steps, since the Analyzer has to do the resolution anyway. Therefore, I think the second approach is better, and it is also consistent with the other V2 interfaces?
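
To make the two candidates concrete, here is a rough sketch of how the shapes might differ. Everything in it is a simplified, hypothetical stand-in: the case classes `Identifier`, `CatalogPlugin`, and `Table` only mimic the idea of the real Spark interfaces, and `RelationWithNameParts`, `RelationWithCatalog`, and `flatten` are names made up for illustration, not the actual `DataSourceV2Relation` definition.

```scala
// Simplified stand-ins for illustration only; NOT the real Spark interfaces.
final case class Identifier(namespace: Seq[String], name: String)
final case class CatalogPlugin(name: String)
final case class Table(name: String)

// Option 1 (hypothetical shape): carry only a flattened multi-part name.
final case class RelationWithNameParts(table: Table, nameParts: Seq[String])

// Option 2 (hypothetical shape): carry the resolved catalog and identifier,
// similar in spirit to how the V2 commands keep both.
final case class RelationWithCatalog(
    table: Table,
    catalog: CatalogPlugin,
    identifier: Identifier)

object RelationSketch {
  // Option 1 needs a step like this at resolution time:
  // catalog name, then namespace parts, then table name.
  def flatten(catalog: CatalogPlugin, ident: Identifier): Seq[String] =
    catalog.name +: (ident.namespace :+ ident.name)

  // With option 2 the flattened form can still be derived later if needed,
  // while the structured information stays available to later phases.
  def nameParts(rel: RelationWithCatalog): Seq[String] =
    flatten(rel.catalog, rel.identifier)
}
```

Under these assumptions, option 2 can always reproduce what option 1 stores, but not the other way around: once the parts are flattened into a `Seq[String]`, the boundary between catalog, namespace, and table name is only a convention.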
