yuchenhuo commented on issue #26957: [SPARK-30314] Add identifier and catalog 
information to DataSourceV2Relation
URL: https://github.com/apache/spark/pull/26957#issuecomment-574957783
 
 
   @rdblue @brkyvz Based on the discussion we've had here, I think I understand 
this a bit better. 
   
   So the key problem we are trying to solve is that, unlike many other 
V2Commands 
https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala#L152
 `DataSourceV2Relation` only includes the table information, which doesn't 
contain the catalog and namespace info.
   
   Previously, I think I worried too much about whether `CatalogPlugin` or 
`Table` would implement `name` correctly. However, under the current design the 
resolved path is generated by some custom parser anyway, e.g. `TableProvider`, 
so we have no choice but to rely on that kind of information. 
   
   I think there are two final candidate solutions:
   1. Add just a `Seq[String]` to `DataSourceV2Relation`, so that we would not 
need to depend too much on the `Identifier` and `CatalogPlugin` implementations. 
In terms of the implementation, we would extract the table name, namespaces and 
catalog name from `Identifier` and `CatalogPlugin` and put them into a Seq.
   2. Add both `Identifier` and `CatalogPlugin` to `DataSourceV2Relation`, in 
the same way as many other V2Commands, e.g. `CreateV2Table` and `AlterTable`: 
https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala#L152
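   
   To make the difference concrete, here is a rough sketch of the two shapes. 
The types below (`Table`, `CatalogPlugin`, `Identifier`, `RelationWithNameParts`, 
`RelationWithCatalog`, `fromResolved`) are simplified stand-ins made up for 
illustration; they are not the real Spark definitions, and the actual 
`DataSourceV2Relation` has more fields than shown.

```scala
object RelationSketch {
  // Simplified, hypothetical stand-ins; NOT the real Spark interfaces.
  trait Table { def name: String }
  trait CatalogPlugin { def name: String }
  final case class Identifier(namespace: Seq[String], name: String)

  // Option 1: the relation only keeps the flattened name parts.
  final case class RelationWithNameParts(table: Table, nameParts: Seq[String])

  // Flatten catalog name, namespaces and table name into one Seq (assumed ordering).
  def fromResolved(table: Table, catalog: CatalogPlugin, ident: Identifier): RelationWithNameParts =
    RelationWithNameParts(table, (catalog.name +: ident.namespace) :+ ident.name)

  // Option 2: the relation keeps the resolved objects themselves.
  final case class RelationWithCatalog(table: Table, catalog: CatalogPlugin, ident: Identifier)

  def main(args: Array[String]): Unit = {
    val table = new Table { val name = "events" }
    val catalog = new CatalogPlugin { val name = "prod" }
    val ident = Identifier(Seq("db"), "events")

    val opt1 = fromResolved(table, catalog, ident)
    val opt2 = RelationWithCatalog(table, catalog, ident)

    println(opt1.nameParts.mkString("."))  // prints "prod.db.events"
    println(opt2.ident.namespace)          // the namespace is still structured here
  }
}
```

   With option 1 a later rule has to guess which parts of the Seq are the 
catalog, the namespace and the table; with option 2 that structure is kept.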
   
   I'm not sure I understand it correctly, but it seems like the Analyzer's job 
is to resolve the unresolved path strings to the actual table, so I feel that it 
makes sense for the Analyzer to pass the resolved `Identifier` and 
`CatalogPlugin` on to the following steps, since the Analyzer has to do the 
resolution anyway. Therefore, I think the second approach is better, and it's 
also consistent with the other V2 interfaces?
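   
   As a minimal sketch of that idea: a resolution step that already has to find 
the catalog and the identifier can simply hand both to the relation it builds. 
Everything here (`resolve`, `StubCatalog`, `ResolvedRelation`, and the "first 
part is a catalog if it is registered" rule) is an assumption for illustration, 
not the Analyzer's actual logic.

```scala
object ResolutionSketch {
  // Simplified, hypothetical stand-ins; NOT the real Spark interfaces.
  trait Table { def name: String }
  trait CatalogPlugin {
    def name: String
    def loadTable(ident: Identifier): Table
  }
  final case class Identifier(namespace: Seq[String], name: String)
  final case class ResolvedRelation(table: Table, catalog: CatalogPlugin, ident: Identifier)

  // Hypothetical catalog that "loads" any table it is asked for.
  final case class StubCatalog(name: String) extends CatalogPlugin {
    def loadTable(ident: Identifier): Table = new Table { val name = ident.name }
  }

  // Assumed rule: the first name part selects a registered catalog,
  // otherwise the whole name is looked up in the default catalog.
  def resolve(
      parts: Seq[String],
      catalogs: Map[String, CatalogPlugin],
      default: CatalogPlugin): ResolvedRelation = {
    require(parts.nonEmpty, "a table name needs at least one part")
    val (catalog, rest) = parts match {
      case head +: tail if tail.nonEmpty && catalogs.contains(head) => (catalogs(head), tail)
      case _ => (default, parts)
    }
    val ident = Identifier(rest.init, rest.last)
    // The resolved catalog and identifier are passed along, not thrown away.
    ResolvedRelation(catalog.loadTable(ident), catalog, ident)
  }
}
```

   For example, `resolve(Seq("prod", "db", "events"), Map("prod" -> 
StubCatalog("prod")), StubCatalog("default"))` would yield a relation whose 
catalog is `prod` and whose identifier has namespace `db` and name `events`.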
