marton-bod commented on a change in pull request #3539:
URL: https://github.com/apache/iceberg/pull/3539#discussion_r748349154
##########
File path:
mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerLocalScan.java
##########
@@ -311,6 +313,26 @@ public void
testCreateTableWithColumnSpecificationMultilevelPartitioned() throws
runCreateAndReadTest(identifier, createSql,
HiveIcebergStorageHandlerTestUtils.CUSTOMER_SCHEMA, spec, data);
}
+ @Test
+ public void testExternalTableToTableManagedInHiveCatalog() throws
IOException {
+ Assume.assumeTrue("Only relevant for HiveCatalog", testTableType ==
TestTables.TestTableType.HIVE_CATALOG);
+
+ TableIdentifier identifier = TableIdentifier.of("default", "customers");
+ testTables.createIcebergTable(shell.getHiveConf(), identifier.name(),
Review comment:
This will create a table with input/output format set to
`HiveIcebergInputFormat`/`HiveIcebergOutputFormat`. Just checking if that's
what you intended to do since the problem statement talks about finding a way
for Hive to read tables with `FileInputFormat`/`FileOutputFormat`
##########
File path:
mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerLocalScan.java
##########
@@ -311,6 +313,26 @@ public void
testCreateTableWithColumnSpecificationMultilevelPartitioned() throws
runCreateAndReadTest(identifier, createSql,
HiveIcebergStorageHandlerTestUtils.CUSTOMER_SCHEMA, spec, data);
}
+ @Test
+ public void testExternalTableToTableManagedInHiveCatalog() throws
IOException {
+ Assume.assumeTrue("Only relevant for HiveCatalog", testTableType ==
TestTables.TestTableType.HIVE_CATALOG);
+
+ TableIdentifier identifier = TableIdentifier.of("default", "customers");
+ testTables.createIcebergTable(shell.getHiveConf(), identifier.name(),
+ HiveIcebergStorageHandlerTestUtils.CUSTOMER_SCHEMA, FileFormat.PARQUET,
+ Collections.emptyList());
+
+ Map<StructLike, List<Record>> data = Maps.newHashMap();
+ data.put(null, HiveIcebergStorageHandlerTestUtils.CUSTOMER_RECORDS);
+
+ String createSql = String.format("CREATE EXTERNAL TABLE
external_table_to_iceberg " +
+ "STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler' " +
+ "TBLPROPERTIES('name'='%s')", identifier);
Review comment:
I think while using `name` is the most convenient, using something like
`alias_to_table=xxx` or `referenced_table=xxx` would be more descriptive to
users and would avoid leaking the internal `name` property. That would require
making code changes in StorageHandler, SerDe, etc. though. What do you think?
##########
File path:
mr/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerLocalScan.java
##########
@@ -311,6 +313,26 @@ public void
testCreateTableWithColumnSpecificationMultilevelPartitioned() throws
runCreateAndReadTest(identifier, createSql,
HiveIcebergStorageHandlerTestUtils.CUSTOMER_SCHEMA, spec, data);
}
+ @Test
+ public void testExternalTableToTableManagedInHiveCatalog() throws
IOException {
Review comment:
If we make these alias tables supported, I think it would make sense to
add a few more tests, and some scenarios to think about, e.g.:
- writing into the aliased table (unless it's intended to be read-only, but
I think with the current construct it's not)
- dropping the referenced table - the alias table becomes unreadable
- the supplied property at create table time points to a non-existing table
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]