[GitHub] [spark] yuchenhuo commented on issue #26957: [SPARK-30314] Add identifier and catalog information to DataSourceV2Relation

GitBox Tue, 14 Jan 2020 17:52:09 -0800

yuchenhuo commented on issue #26957: [SPARK-30314] Add identifier and catalog 
information to DataSourceV2Relation
URL: https://github.com/apache/spark/pull/26957#issuecomment-574458182
 
 
   @rdblue There are two main reasons why I chose to split the catalog and 
table identifiers:
   1. If we use a multi-part name (`Seq[String]`), we need an implicit 
convention that the first element of the Seq is the catalog name and the rest 
are the namespaces and the table name. This becomes especially tricky in the 
cases where there is no catalog name. 
   2. Again, for the `load(paths: String*)` case, a multi-part identifier 
`Seq[String]` simply cannot represent that case.
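
   To make the two points above concrete, here is a minimal sketch in plain 
Scala. `Ident`, `SplitRelation`, and `RelationSketch` are hypothetical 
stand-ins for illustration, not Spark's actual `Identifier` or 
`DataSourceV2Relation`, and the "first element is the catalog" rule in `decode` 
is only the assumed convention from point 1.

```scala
// Hypothetical stand-in for a table identifier; NOT Spark's Identifier class.
final case class Ident(namespace: Seq[String], name: String)

// Hypothetical relation shape with catalog and identifier kept separate.
// Both fields are optional, so "no catalog" and "no table name" are explicit.
final case class SplitRelation(catalog: Option[String], ident: Option[Ident])

object RelationSketch {
  // Decodes a multi-part name under the ASSUMED convention that the first
  // element is the catalog and the last element is the table name.
  def decode(parts: Seq[String]): SplitRelation = parts match {
    case catalog +: rest if rest.nonEmpty =>
      SplitRelation(Some(catalog), Some(Ident(rest.init, rest.last)))
    case _ =>
      SplitRelation(None, None)
  }

  // A fully qualified name decodes as intended.
  val qualified: SplitRelation = decode(Seq("prod", "db", "tbl"))

  // Point 1: with no catalog part, the namespace "db" is misread as the catalog.
  val misread: SplitRelation = decode(Seq("db", "tbl"))

  // Point 2: a path-based load has neither a catalog nor a table identifier,
  // which the split form can state directly.
  val pathBased: SplitRelation = SplitRelation(catalog = None, ident = None)
}
```

   With the split form, a missing catalog is a `None` rather than something 
the reader of a `Seq[String]` has to infer from position.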
   
   Is there any particular reason why we don't want to import the `Identifier` 
class in the Analyzer code? I feel that enforcing more explicit typing might be 
good for future extensibility, but I do agree that it increases complexity and 
probably encodes duplicated information. I'm pretty new to this code, so feel 
free to suggest a better way to do this.


