Dzeri96 commented on issue #14424: URL: https://github.com/apache/iceberg/issues/14424#issuecomment-3675689775
I just found the time to go a bit deeper into this issue, and I have some thoughts.

First, my assumption about why the default namespace is hard-coded in `SparkSessionCatalog` is that the delegating catalog is instantiated by Spark before it, so as far as I know you cannot affect its creation with the `default-namespace` config key the way you can when creating your own `SparkCatalog`. That said, I think the `defaultNamespace()` method can simply call through to the delegating catalog. If you change the default database with `spark_catalog.defaultDatabase`, it will return the name of that database, which is semantically equivalent to a namespace. This is something I would definitely fix, if that's OK with you.

What I'm not sure about is whether we want to use `catalog.defaultDatabase` as a fallback when configuring `SparkCatalog`. I think it makes sense, because it's a standard Spark config key. Going one level further, maybe we can go the other way and configure `SparkSessionCatalog` with `default-namespace`; I guess we would need to hook into Spark startup somehow. This behavior is described by the current Iceberg docs.

Let me know what you think is the best way to proceed. Based on that, I'll change the code and the docs.
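To make the delegation idea concrete, here is a minimal, self-contained sketch of the pattern. This is not the actual Iceberg code: `CatalogPlugin`, `SessionCatalog`, and `SessionCatalogWrapper` below are simplified stand-ins for Spark's interfaces and Iceberg's `SparkSessionCatalog`, so the class and constructor shapes are assumptions made for illustration only.

```java
// Sketch of forwarding defaultNamespace() to a delegate catalog instead of
// hard-coding {"default"}. All names here are hypothetical stand-ins, not the
// real Spark/Iceberg API.
public class DelegatingCatalogSketch {

  // Stand-in for Spark's catalog plugin interface.
  interface CatalogPlugin {
    String[] defaultNamespace();
  }

  // Stand-in for Spark's built-in session catalog, whose default database
  // the user can change (e.g. via spark_catalog.defaultDatabase).
  static class SessionCatalog implements CatalogPlugin {
    private final String defaultDb;

    SessionCatalog(String defaultDb) {
      this.defaultDb = defaultDb;
    }

    @Override
    public String[] defaultNamespace() {
      return new String[] {defaultDb};
    }
  }

  // Stand-in for SparkSessionCatalog: rather than returning a hard-coded
  // namespace, it forwards the call to the catalog it wraps.
  static class SessionCatalogWrapper implements CatalogPlugin {
    private final CatalogPlugin delegate;

    SessionCatalogWrapper(CatalogPlugin delegate) {
      this.delegate = delegate;
    }

    @Override
    public String[] defaultNamespace() {
      // Delegation: the wrapper reflects whatever default the inner catalog has.
      return delegate.defaultNamespace();
    }
  }

  public static void main(String[] args) {
    CatalogPlugin wrapped = new SessionCatalogWrapper(new SessionCatalog("analytics"));
    // Prints the delegate's default namespace, e.g. "analytics".
    System.out.println(String.join(".", wrapped.defaultNamespace()));
  }
}
```

With this shape, changing the delegate's default database is immediately visible through the wrapper, which is the behavior I'd expect from `SparkSessionCatalog.defaultNamespace()`.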
