rdblue commented on a change in pull request #396: Support DS read/write without specifying namespace
URL: https://github.com/apache/incubator-iceberg/pull/396#discussion_r315798731
##########
File path:
spark/src/main/java/org/apache/iceberg/spark/source/IcebergSource.java
##########
@@ -104,7 +104,9 @@ protected Table findTable(DataSourceOptions options, Configuration conf) {
return tables.load(path.get());
} else {
HiveCatalog hiveCatalog = HiveCatalogs.loadCatalog(conf);
- TableIdentifier tableIdentifier = TableIdentifier.parse(path.get());
+      TableIdentifier tableIdentifier = path.get().contains(".") ?
+          TableIdentifier.parse(path.get()) :
+          TableIdentifier.of(lazySparkSession().catalog().currentDatabase(), path.get());
Review comment:
I think the problem with this approach is that there is no guarantee that
Spark's session catalog uses the same namespace as the Iceberg catalog.
Spark just solved this problem for multiple catalogs:
https://github.com/apache/spark/pull/25368
Spark will now keep a current database (namespace) only for the current
catalog, and it will fill in the current namespace before passing the
identifier to a catalog. I think we should go with that approach and let
Spark fill in session configuration like the current database. Of course,
that means we can't do much to fix this through this path: Spark can't fill
in a namespace when it doesn't know about the catalog, and here we are
accessing the Iceberg source directly by name.
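To make the distinction concrete, here is a minimal, self-contained sketch of the two resolution strategies being discussed. The class and method names are illustrative only, not the actual Iceberg or Spark APIs: the point is that when a path has no namespace, someone must supply the current one, and the comment argues the engine (Spark) should do that rather than the source.

```java
import java.util.Arrays;

public class IdentifierResolution {
  // If the path already contains a namespace (e.g. "db.events"), split it;
  // otherwise fall back to a current namespace supplied by the engine.
  // This mirrors the PR's ternary, with the engine filling in the default.
  static String[] resolve(String path, String currentNamespace) {
    if (path.contains(".")) {
      int idx = path.lastIndexOf('.');
      return new String[] {path.substring(0, idx), path.substring(idx + 1)};
    }
    // No namespace given: use the engine's current namespace, not a guess
    // made inside the data source itself.
    return new String[] {currentNamespace, path};
  }

  public static void main(String[] args) {
    // Qualified name: the current namespace is ignored.
    System.out.println(Arrays.toString(resolve("db.events", "default")));
    // Unqualified name: the current namespace is filled in.
    System.out.println(Arrays.toString(resolve("events", "analytics")));
  }
}
```

The hazard the comment points out is the second branch: if the source reads the current database from Spark's session catalog while the table actually lives in a Hive-backed Iceberg catalog, the two notions of "current namespace" can disagree, which is why letting Spark resolve the identifier for a registered catalog is the safer design.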