aokolnychyi commented on code in PR #7228:
URL: https://github.com/apache/iceberg/pull/7228#discussion_r1169526420
##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkSessionCatalog.java:
##########
@@ -134,28 +135,34 @@ public Identifier[] listTables(String[] namespace) throws NoSuchNamespaceExcepti
   @Override
   public Table loadTable(Identifier ident) throws NoSuchTableException {
-    try {
-      return icebergCatalog.loadTable(ident);
-    } catch (NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident);
-    }
+    return loadTableInternal(ident, null, null);
   }

   @Override
   public Table loadTable(Identifier ident, String version) throws NoSuchTableException {
-    try {
-      return icebergCatalog.loadTable(ident, version);
-    } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident, version);
-    }
+    return loadTableInternal(ident, version, null);
   }

   @Override
   public Table loadTable(Identifier ident, long timestamp) throws NoSuchTableException {
+    return loadTableInternal(ident, null, timestamp);
+  }
+
+  private Table loadTableInternal(Identifier ident, String version, Long timestamp)
+      throws NoSuchTableException {
     try {
-      return icebergCatalog.loadTable(ident, timestamp);
+      return loadTable(icebergCatalog, ident, version, timestamp);
     } catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
-      return getSessionCatalog().loadTable(ident, timestamp);
+      return loadTable(getSessionCatalog(), ident, version, timestamp);
+    } catch (org.apache.iceberg.exceptions.NotFoundException e) {
+      if (loadCatalogTableWhenMetadataNotFoundEnabled()) {
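The hunk above delegates to a private `loadTable(catalog, ident, version, timestamp)` helper that is not shown in this hunk. A minimal sketch of what such a dispatch helper could look like, assuming the Spark 3.3 `TableCatalog` time-travel overloads (an illustration, not the PR's actual code):

```java
// Sketch only: a plausible shape for the private dispatch helper referenced above.
private Table loadTable(TableCatalog catalog, Identifier ident, String version, Long timestamp)
    throws NoSuchTableException {
  if (version != null) {
    // time travel by version string (e.g. a snapshot ID)
    return catalog.loadTable(ident, version);
  } else if (timestamp != null) {
    // time travel by timestamp (microseconds since epoch in the Spark API)
    return catalog.loadTable(ident, timestamp);
  } else {
    return catalog.loadTable(ident);
  }
}
```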
Review Comment:
Okay, I see the `DropTableExec` implementation now. I agree about
differentiating between a table existing and being able to load its metadata.
We can change `tableExists` in Iceberg catalogs to verify that the metastore
pointer exists rather than trying to load the table, and then override
`tableExists` inside Spark catalogs.
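Roughly like this, just a sketch (`pointerExists` is a made-up stand-in for a plain metastore lookup, not a real API):

```java
// Sketch only: Catalog#tableExists is currently a default method that calls
// loadTable() and catches NoSuchTableException, so it reports false when the
// pointer exists but the metadata file is gone. An implementation could
// instead answer from the catalog entry alone.
@Override
public boolean tableExists(TableIdentifier identifier) {
  // only verify the catalog/metastore entry; never read the metadata file
  return pointerExists(identifier);
}
```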
I feel we can still implement what I suggested
[above](https://github.com/apache/iceberg/pull/7228/files#r1157829683) for the
actual cleanup.
@dpaani, that's unfortunate. One idea to overcome that issue is to throw a
new exception type like `CorruptedMetadataException` from our Iceberg catalogs
whenever the metadata file is missing but the table pointer exists. We can then
catch that exception inside our Spark catalogs and return a special placeholder
Spark table.
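For illustration, roughly (both `CorruptedMetadataException` and `CorruptedSparkTable` are hypothetical names; nothing like them exists yet):

```java
// Sketch only: thrown by Iceberg catalogs when the table pointer exists but
// the metadata file it references is missing.
public class CorruptedMetadataException extends RuntimeException {
  public CorruptedMetadataException(String message, Object... args) {
    super(String.format(message, args));
  }
}
```

The Spark catalogs could then handle it in their load path:

```java
try {
  return loadTable(icebergCatalog, ident, version, timestamp);
} catch (CorruptedMetadataException e) {
  // return a placeholder so commands like DROP TABLE can still resolve the
  // identifier even though the metadata cannot be loaded
  return new CorruptedSparkTable(ident);
}
```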
Let me think more tomorrow. Any better ideas, @RussellSpitzer?