dpaani commented on code in PR #7228:
URL: https://github.com/apache/iceberg/pull/7228#discussion_r1152381961
##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkSessionCatalog.java:
##########
@@ -134,28 +135,34 @@ public Identifier[] listTables(String[] namespace) throws NoSuchNamespaceExcepti
@Override
public Table loadTable(Identifier ident) throws NoSuchTableException {
- try {
- return icebergCatalog.loadTable(ident);
- } catch (NoSuchTableException e) {
- return getSessionCatalog().loadTable(ident);
- }
+ return loadTableInternal(ident, null, null);
}
@Override
public Table loadTable(Identifier ident, String version) throws NoSuchTableException {
- try {
- return icebergCatalog.loadTable(ident, version);
- } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
- return getSessionCatalog().loadTable(ident, version);
- }
+ return loadTableInternal(ident, version, null);
}
@Override
public Table loadTable(Identifier ident, long timestamp) throws NoSuchTableException {
+ return loadTableInternal(ident, null, timestamp);
+ }
+
+ private Table loadTableInternal(Identifier ident, String version, Long timestamp)
+     throws NoSuchTableException {
try {
- return icebergCatalog.loadTable(ident, timestamp);
+ return loadTable(icebergCatalog, ident, version, timestamp);
} catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
- return getSessionCatalog().loadTable(ident, timestamp);
+ throw e;
+ } catch (org.apache.iceberg.exceptions.NotFoundException e) {
+ if (loadFromSessionCatalogOnLocationNotFoundEnabled()) {
Review Comment:
Sure @RussellSpitzer.
1. Spark Catalog - do we need to fix this there? --> Only SparkSessionCatalog
has a delegate catalog. SparkCatalog works only with Iceberg tables, and in
this case it is not a valid Iceberg table since the metadata folder does not
exist.
2. Drop the table even if metadata cannot be loaded. Maybe a Spark conf? -->
Having a Spark config is a good idea. By default, we can keep this config as
false.
3. We can maybe do this by checking for MetadataLocation? --> This is also the
safe thing to do. I thought of checking this only when the Spark conf is
turned on.

With my proposed change, existing functionality is not broken and the fallback
is not enabled by default. Only if the Spark conf is set to true and the table
contains the metadata_location property (i.e. it was a valid Iceberg table
earlier) do we load it from the metastore and delete it. Once the metadata no
longer exists, the table cannot behave as an Iceberg table anymore, so it is
safer to delete it if the user requested that.
```java
  private Table loadTableInternal(Identifier ident, String version, Long timestamp)
      throws NoSuchTableException {
    try {
      return loadTable(icebergCatalog, ident, version, timestamp);
    } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
      throw e;
    } catch (org.apache.iceberg.exceptions.NotFoundException e) {
      if (loadFromSessionCatalogOnLocationNotFoundEnabled()) { // check config is set
        Table table = loadTable(getSessionCatalog(), ident, version, timestamp);
        if (table.properties() != null
            && table.properties().containsKey(TableProperties.METADATA_LOCATION)) { // check iceberg table
          return table;
        }
      }
      throw e;
    }
  }

  private boolean loadFromSessionCatalogOnLocationNotFoundEnabled() {
    return Boolean.parseBoolean(
        SparkSession.active()
            .conf()
            .get(
                SparkSQLProperties.LOAD_FROM_SESSION_CATALOG_ON_LOCATION_NOT_FOUND_ENABLED,
                SparkSQLProperties
                    .LOAD_FROM_SESSION_CATALOG_ON_LOCATION_NOT_FOUND_ENABLED_DEFAULT));
  }
```
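The config lookup above references two `SparkSQLProperties` constants that are not shown in this hunk. A minimal stand-alone sketch of the intended behavior, assuming a hypothetical property key and a `false` default (as proposed in point 2), with the `SparkSession` conf replaced by a plain `Map` so it runs without Spark:

```java
import java.util.HashMap;
import java.util.Map;

public class LoadFallbackConfigSketch {
  // Hypothetical key and default; the real names would live in SparkSQLProperties.
  static final String LOAD_FROM_SESSION_CATALOG_ON_LOCATION_NOT_FOUND_ENABLED =
      "spark.sql.iceberg.load-from-session-catalog-on-location-not-found.enabled";
  static final String LOAD_FROM_SESSION_CATALOG_ON_LOCATION_NOT_FOUND_ENABLED_DEFAULT = "false";

  // Mirrors loadFromSessionCatalogOnLocationNotFoundEnabled(): parse the conf
  // value as a boolean, falling back to the default when the key is unset.
  static boolean fallbackEnabled(Map<String, String> conf) {
    return Boolean.parseBoolean(
        conf.getOrDefault(
            LOAD_FROM_SESSION_CATALOG_ON_LOCATION_NOT_FOUND_ENABLED,
            LOAD_FROM_SESSION_CATALOG_ON_LOCATION_NOT_FOUND_ENABLED_DEFAULT));
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    System.out.println(fallbackEnabled(conf)); // false: disabled by default
    conf.put(LOAD_FROM_SESSION_CATALOG_ON_LOCATION_NOT_FOUND_ENABLED, "true");
    System.out.println(fallbackEnabled(conf)); // true: user opted in
  }
}
```

Since `Boolean.parseBoolean` returns `false` for any value other than (case-insensitive) `"true"`, an unset or malformed conf value safely leaves the fallback disabled.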
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]