rdblue commented on a change in pull request #1495:
URL: https://github.com/apache/iceberg/pull/1495#discussion_r497839612
##########
File path:
mr/src/test/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandlerBaseTest.java
##########
@@ -157,6 +177,282 @@ public void testJoinTables() throws IOException {
     Assert.assertArrayEquals(new Object[] {1L, "Bob", 102L, 33.33d}, rows.get(2));
   }
+  @Test
+  public void testCreateDropTable() throws TException, IOException {
+    // We need the location for HadoopTable based tests only
+    String location = locationForCreateTable(temp.getRoot().getPath(), "customers");
+    shell.executeStatement("CREATE EXTERNAL TABLE customers " +
+        "STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler' " +
+        (location != null ? "LOCATION '" + location + "' " : "") +
+        "TBLPROPERTIES ('" + InputFormatConfig.TABLE_SCHEMA + "'='" + SchemaParser.toJson(CUSTOMER_SCHEMA) + "', " +
+        "'" + InputFormatConfig.PARTITION_SPEC + "'='" + PartitionSpecParser.toJson(IDENTITY_SPEC) + "', " +
+        "'dummy'='test')");
+
+    Properties properties = new Properties();
+    properties.put(Catalogs.NAME, TableIdentifier.of("default", "customers").toString());
+    if (location != null) {
+      properties.put(Catalogs.LOCATION, location);
+    }
+
+    // Check the Iceberg table data
+    org.apache.iceberg.Table icebergTable = Catalogs.loadTable(shell.getHiveConf(), properties);
+    Assert.assertEquals(SchemaParser.toJson(CUSTOMER_SCHEMA), SchemaParser.toJson(icebergTable.schema()));
+    Assert.assertEquals(PartitionSpecParser.toJson(IDENTITY_SPEC), PartitionSpecParser.toJson(icebergTable.spec()));
+    Assert.assertEquals(Collections.singletonMap("dummy", "test"), icebergTable.properties());
+
+    // Check the HMS table parameters
+    IMetaStoreClient client = null;
+    org.apache.hadoop.hive.metastore.api.Table hmsTable;
Review comment:
We might want to use a shared `HiveClientPool` for this. We've had
problems in the past where too many clients led to the HMS becoming
unresponsive in tests. It's really annoying and makes tests flaky. Sharing a
pool across all tests fixes the problem and makes us more confident that if we
hit a connection issue, it is probably in production code and not in the tests.
I think it would also make the test cases smaller because you wouldn't need
try/finally blocks.
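
To illustrate the idea, here is a minimal, self-contained sketch of a pool shared across tests, where callers pass a callback instead of opening and closing a client themselves. It is not the real `HiveClientPool`: `FakeClient`, `SharedClientPool`, and `SharedPoolExample` are hypothetical names made up for this example, and the fake client only concatenates strings instead of talking to the HMS.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

// Hypothetical stand-in for a metastore client; NOT the real IMetaStoreClient.
interface FakeClient {
  String getTable(String db, String name);
}

// Hypothetical, simplified pool; NOT the real HiveClientPool.
final class SharedClientPool {
  interface Action<R> {
    R run(FakeClient client);
  }

  private final Deque<FakeClient> idle = new ArrayDeque<>();
  private final Supplier<FakeClient> factory;
  private final int maxSize;
  private int created = 0;

  SharedClientPool(int maxSize, Supplier<FakeClient> factory) {
    this.maxSize = maxSize;
    this.factory = factory;
  }

  // Number of clients created so far; never exceeds maxSize.
  synchronized int created() {
    return created;
  }

  // Borrows a client, runs the action, and always returns the client to the pool,
  // so callers do not need their own try/finally blocks.
  <R> R run(Action<R> action) {
    FakeClient client = acquire();
    try {
      return action.run(client);
    } finally {
      release(client);
    }
  }

  private synchronized FakeClient acquire() {
    // Block while every client is in use and the pool is already at its limit.
    while (idle.isEmpty() && created >= maxSize) {
      try {
        wait();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("Interrupted while waiting for a client", e);
      }
    }
    if (!idle.isEmpty()) {
      return idle.pop();
    }
    created++;
    return factory.get();
  }

  private synchronized void release(FakeClient client) {
    idle.push(client);
    notify();
  }
}

final class SharedPoolExample {
  // One pool for the whole test JVM, capped at two clients.
  static final SharedClientPool POOL =
      new SharedClientPool(2, () -> (db, name) -> db + "." + name);

  // A test body shrinks to a single call, with no client lifecycle code.
  static String lookup(String db, String name) {
    return POOL.run(client -> client.getTable(db, name));
  }
}
```

Because the pool caps how many clients exist and reuses idle ones, the number of connections no longer grows with the number of test methods.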