aokolnychyi commented on code in PR #4758:
URL: https://github.com/apache/iceberg/pull/4758#discussion_r871665547
##########
spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/SparkTableUtil.java:
##########
@@ -599,46 +602,15 @@ private static void deleteManifests(FileIO io,
List<ManifestFile> manifests) {
.run(item -> io.deleteFile(item.path()));
}
- // Attempt to use Spark3 Catalog resolution if available on the path
- private static final DynMethods.UnboundMethod LOAD_METADATA_TABLE =
DynMethods.builder("loadMetadataTable")
- .hiddenImpl("org.apache.iceberg.spark.Spark3Util", SparkSession.class,
Table.class, MetadataTableType.class)
- .orNoop()
- .build();
-
- public static Dataset<Row> loadCatalogMetadataTable(SparkSession spark,
Table table, MetadataTableType type) {
- Preconditions.checkArgument(!LOAD_METADATA_TABLE.isNoop(), "Cannot find
Spark3Util class but Spark3 is in use");
- return LOAD_METADATA_TABLE.asStatic().invoke(spark, table, type);
- }
-
public static Dataset<Row> loadMetadataTable(SparkSession spark, Table
table, MetadataTableType type) {
- if (spark.version().startsWith("3")) {
- // construct the metadata table instance directly
- Dataset<Row> catalogMetadataTable = loadCatalogMetadataTable(spark,
table, type);
- if (catalogMetadataTable != null) {
- return catalogMetadataTable;
- }
- }
-
- String tableName = table.name();
- String tableLocation = table.location();
-
- DataFrameReader dataFrameReader = spark.read().format("iceberg");
- if (tableName.contains("/")) {
- // Hadoop Table or Metadata location passed, load without a catalog
- return dataFrameReader.load(tableName + "#" + type);
- }
+ return loadMetadataTable(spark, table, type, ImmutableMap.of());
+ }
- // Catalog based resolution failed, our catalog may be a non-DatasourceV2
Catalog
- if (tableName.startsWith("hadoop.")) {
- // Try loading by location as Hadoop table without Catalog
- return dataFrameReader.load(tableLocation + "#" + type);
- } else if (tableName.startsWith("hive")) {
- // Try loading by name as a Hive table without Catalog
- return dataFrameReader.load(tableName.replaceFirst("hive\\.", "") + "."
+ type);
- } else {
- throw new IllegalArgumentException(String.format(
- "Cannot find the metadata table for %s of type %s", tableName,
type));
- }
+ public static Dataset<Row> loadMetadataTable(SparkSession spark, Table
table, MetadataTableType type,
+ Map<String, String>
extraOptions) {
+ SparkTable metadataTable = new
SparkTable(MetadataTableUtils.createMetadataTableInstance(table, type), false);
Review Comment:
We previously had this code in a private method in `Spark3Util`; I just moved it here.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]