rdblue commented on pull request #1783:
URL: https://github.com/apache/iceberg/pull/1783#issuecomment-741929560


   I think the catalog registration here is good. I don't see a way around 
registering a catalog, so we should do it and also use it to maintain 
compatibility. Here's what I think the behavior should be:
   
   1. Always register a default Iceberg catalog that is a HiveCatalog using the 
HMS URI from hive-site.xml, like what you've done.
   2. Use the `/` check for paths. If the table ref is a path, then load it 
from the default Iceberg catalog (any catalog works).
   3. If the table ref has a catalog, use that catalog (if it doesn't support 
Iceberg, that's fine because we can't guess or replace it).
   4. If the table ref does not have a catalog, use the current catalog.
   5. If the current catalog is the session catalog and is not an 
`IcebergSessionCatalog`, then replace it with the default Iceberg catalog.
   6. If the current catalog is not the session catalog, use it even if it 
doesn't support Iceberg.
   
   That keeps the behavior identical and also follows Spark's rules for 
identifiers. The only time we use the default Iceberg catalog for a Hive 
identifier is when nothing new in Spark 3 overrides the behavior to use a 
different catalog, and the session catalog (equivalent to the default in 2.4) 
can't handle Iceberg.
   
   Optionally, paths could be loaded from the session catalog if it is an 
`IcebergSessionCatalog`.
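   
   For concreteness, here is a minimal sketch of that resolution order. None 
of the types below are the real Spark or Iceberg classes; the `Catalog` and 
`IcebergSessionCatalog` interfaces, field names, and `resolve` method are 
placeholders that only illustrate the decision flow in the list above.
   
   ```java
   // Hedged sketch of the resolution rules; all types are simplified stand-ins.
   class CatalogResolutionSketch {
     interface Catalog {}
     interface IcebergSessionCatalog extends Catalog {}
   
     Catalog defaultIcebergCatalog; // rule 1: HiveCatalog using the HMS URI from hive-site.xml
     Catalog sessionCatalog;        // Spark's built-in session catalog
     Catalog currentCatalog;        // whatever catalog is currently active
   
     Catalog resolve(String tableRef, Catalog explicitCatalog) {
       // Rule 2: a ref containing '/' is a path; load it from the default Iceberg catalog.
       if (tableRef.contains("/")) {
         return defaultIcebergCatalog;
       }
   
       // Rule 3: an explicit catalog in the identifier always wins, Iceberg-capable or not.
       if (explicitCatalog != null) {
         return explicitCatalog;
       }
   
       // Rules 4-5: no catalog in the identifier -> use the current catalog, but swap in
       // the default Iceberg catalog when the current catalog is the session catalog and
       // it is not an IcebergSessionCatalog.
       if (currentCatalog == sessionCatalog
           && !(currentCatalog instanceof IcebergSessionCatalog)) {
         return defaultIcebergCatalog;
       }
   
       // Rule 6: any other current catalog is used as-is, even if it doesn't support Iceberg.
       return currentCatalog;
     }
   }
   ```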
   
   What do you think?

