Neer393 commented on code in PR #6012:
URL: https://github.com/apache/hive/pull/6012#discussion_r2260522596
##########
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/metastore/task/IcebergHouseKeeperService.java:
##########
@@ -50,7 +47,7 @@ public class IcebergHouseKeeperService implements
MetastoreTaskThread {
private ExecutorService deleteExecutorService = null;
// table cache to avoid making repeated requests for the same Iceberg tables
more than once per day
- private final Cache<TableName, Table> tableCache = Caffeine.newBuilder()
+ private final Cache<org.apache.hadoop.hive.metastore.api.Table, Table>
tableCache = Caffeine.newBuilder()
Review Comment:
This change is required as the main purpose of this JIRA was to reduce the
number of msc calls. If you look at the earlier implementation, you will see
the following definition for ```getIcebergTable()```
```
private Table getIcebergTable(TableName tableName, IMetaStoreClient msc) {
return tableCache.get(tableName, key -> {
LOG.debug("Getting iceberg table from metastore as it's not present in
table cache: {}", tableName);
GetTableRequest request = new GetTableRequest(tableName.getDb(),
tableName.getTable());
try {
return IcebergTableUtil.getTable(conf, msc.getTable(request));
} catch (TException e) {
throw new RuntimeException(e);
}
});
}
```
Here we looked in the cache for the table name and if not found, we created
the table object using ```msc.getTable(request)``` .
Hence keeping a cache mapping of tablename and IcebergTableObject was not
feasible as this would not reduce the msc calls.
The newer implementation makes the cache mapping to HMSTableAPIObject and
IcebergTableObject and hence even when the IcebergTableObject is not in cache,
we can directly create it using the HMSTableAPIObject and there is no need for
msc calls again.
```
private Table getIcebergTable(org.apache.hadoop.hive.metastore.api.Table
table) {
return tableCache.get(table, key -> {
LOG.debug("Getting iceberg table from metastore as it's not present in
table cache: {}", table.getTableName());
return IcebergTableUtil.getTable(conf, table);
});
}
```
**Hence this cache mapping change is necessary. Plus this is only consumed
by this file only and there are no other consumers to this so this won't break
anything**
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]