[GitHub] [iceberg] marton-bod commented on a change in pull request #3539: Hive: Allow to create external table to access the iceberg table managed in hive catalog

GitBox Fri, 12 Nov 2021 07:22:34 -0800


marton-bod commented on a change in pull request #3539:
URL: https://github.com/apache/iceberg/pull/3539#discussion_r748349154




##########
File path: 
mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerLocalScan.java
##########
@@ -311,6 +313,26 @@ public void 
testCreateTableWithColumnSpecificationMultilevelPartitioned() throws
     runCreateAndReadTest(identifier, createSql, 
HiveIcebergStorageHandlerTestUtils.CUSTOMER_SCHEMA, spec, data);
   }
 
+  @Test
+  public void testExternalTableToTableManagedInHiveCatalog() throws 
IOException {
+    Assume.assumeTrue("Only relevant for HiveCatalog", testTableType == 
TestTables.TestTableType.HIVE_CATALOG);
+
+    TableIdentifier identifier = TableIdentifier.of("default", "customers");
+    testTables.createIcebergTable(shell.getHiveConf(), identifier.name(),

Review comment:
       This will create a table with input/output format set to 
`HiveIcebergInputFormat`/`HiveIcebergOutputFormat`. Just checking if that's 
what you intended to do since the problem statement talks about finding a way 
for Hive to read tables with `FileInputFormat`/`FileOutputFormat`

##########
File path: 
mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerLocalScan.java
##########
@@ -311,6 +313,26 @@ public void 
testCreateTableWithColumnSpecificationMultilevelPartitioned() throws
     runCreateAndReadTest(identifier, createSql, 
HiveIcebergStorageHandlerTestUtils.CUSTOMER_SCHEMA, spec, data);
   }
 
+  @Test
+  public void testExternalTableToTableManagedInHiveCatalog() throws 
IOException {
+    Assume.assumeTrue("Only relevant for HiveCatalog", testTableType == 
TestTables.TestTableType.HIVE_CATALOG);
+
+    TableIdentifier identifier = TableIdentifier.of("default", "customers");
+    testTables.createIcebergTable(shell.getHiveConf(), identifier.name(),
+        HiveIcebergStorageHandlerTestUtils.CUSTOMER_SCHEMA, FileFormat.PARQUET,
+        Collections.emptyList());
+
+    Map<StructLike, List<Record>> data = Maps.newHashMap();
+    data.put(null, HiveIcebergStorageHandlerTestUtils.CUSTOMER_RECORDS);
+
+    String createSql = String.format("CREATE EXTERNAL TABLE 
external_table_to_iceberg " +
+        "STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler' " +
+        "TBLPROPERTIES('name'='%s')", identifier);

Review comment:
       I think while using `name` is the most convenient, using something like 
`alias_to_table=xxx` or `referenced_table=xxx` would be more descriptive to 
users and would avoid leaking the internal `name` property. That would require 
making code changes in StorageHandler, SerDe, etc. though. What do you think?

##########
File path: 
mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerLocalScan.java
##########
@@ -311,6 +313,26 @@ public void 
testCreateTableWithColumnSpecificationMultilevelPartitioned() throws
     runCreateAndReadTest(identifier, createSql, 
HiveIcebergStorageHandlerTestUtils.CUSTOMER_SCHEMA, spec, data);
   }
 
+  @Test
+  public void testExternalTableToTableManagedInHiveCatalog() throws 
IOException {

Review comment:
       If we make these alias tables supported, I think it would make sense to 
add a few more tests, and some scenarios to think about, e.g.:
   - writing into the aliased table (unless it's intended to be read-only, but 
I think with the current construct it's not)
   - dropping the referenced table - the alias table becomes unreadable 
   - the supplied property at create table time points to a non-existing table




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[GitHub] [iceberg] marton-bod commented on a change in pull request #3539: Hive: Allow to create external table to access the iceberg table managed in hive catalog

Reply via email to