nastra commented on code in PR #14398:
URL: https://github.com/apache/iceberg/pull/14398#discussion_r2694238435
##########
core/src/main/java/org/apache/iceberg/rest/RESTCatalogProperties.java:
##########
@@ -43,6 +45,14 @@ private RESTCatalogProperties() {}
public static final String REST_SCAN_PLAN_ID = "rest-scan-plan-id";
+ // Properties that control the behaviour of the table cache used for freshness-aware table
+ // loading.
+ public static final String TABLE_CACHE_EXPIRE_AFTER_WRITE_MS = "rest-table-cache-expire-after-ms";
Review Comment:
nit: maybe `rest-table-cache-expiration-interval-ms` to align with the
naming & semantics of properties in `CatalogProperties`
##########
core/src/main/java/org/apache/iceberg/rest/RESTCatalog.java:
##########
@@ -96,6 +96,10 @@ public void initialize(String name, Map<String, String> props) {
sessionCatalog.initialize(name, props);
}
+ protected RESTSessionCatalog sessionCatalog() {
Review Comment:
this should probably have a `@VisibleForTesting` annotation, since it's only
being exposed for tests
##########
core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java:
##########
@@ -277,9 +316,44 @@ public void initialize(String name, Map<String, String> unresolved) {
mergedProps,
RESTCatalogProperties.REST_SCAN_PLANNING_ENABLED,
RESTCatalogProperties.REST_SCAN_PLANNING_ENABLED_DEFAULT);
+
+ this.tableCache = tableCacheBuilder(mergedProps).build();
+
super.initialize(name, mergedProps);
}
+ protected Caffeine<Object, Object> tableCacheBuilder(Map<String, String> props) {
+ long expireAfterWriteMS =
+ PropertyUtil.propertyAsLong(
+ props,
+ RESTCatalogProperties.TABLE_CACHE_EXPIRE_AFTER_WRITE_MS,
+ RESTCatalogProperties.TABLE_CACHE_EXPIRE_AFTER_WRITE_MS_DEFAULT);
+ Preconditions.checkArgument(
+ expireAfterWriteMS > 0, "Invalid expire after write: zero or negative");
+
+ long numEntries =
+ PropertyUtil.propertyAsLong(
+ props,
+ RESTCatalogProperties.TABLE_CACHE_MAX_ENTRIES,
+ RESTCatalogProperties.TABLE_CACHE_MAX_ENTRIES_DEFAULT);
+ Preconditions.checkArgument(numEntries >= 0, "Invalid max entries: negative");
+
+ Caffeine<Object, Object> builder =
Review Comment:
nit: variable is redundant
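The nit can be sketched with a plain stdlib builder (`StringBuilder` standing in for the Caffeine builder here; names and the spec-string format are illustrative, not from the PR):

```java
// Illustrative only: StringBuilder stands in for Caffeine's builder type.
class BuilderReturnSketch {

  // Before: the local variable adds nothing, it is returned on the next line.
  static StringBuilder cacheSpecVerbose(long expireMs, long maxEntries) {
    StringBuilder builder =
        new StringBuilder()
            .append("expireAfterWrite=").append(expireMs)
            .append(",maximumSize=").append(maxEntries);
    return builder;
  }

  // After: return the chained expression directly; behaviour is identical.
  static StringBuilder cacheSpec(long expireMs, long maxEntries) {
    return new StringBuilder()
        .append("expireAfterWrite=").append(expireMs)
        .append(",maximumSize=").append(maxEntries);
  }
}
```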
##########
core/src/test/java/org/apache/iceberg/rest/TestRESTCatalog.java:
##########
@@ -3475,6 +3535,496 @@ public void testLoadTableWithMissingMetadataFile(@TempDir Path tempDir) {
.hasMessageContaining("No in-memory file found for location: " + metadataFileLocation);
}
+ @Test
+ public void testInvalidTableCacheParameters() {
Review Comment:
I would prefer if we just create a new test class at this point, similar to what we did with `TestRESTScanPlanning`. The complexity in `TestRESTCatalog` is already super high; moving everything related to freshness-aware loading into its own class reduces that complexity and makes reviewing easier.
##########
core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java:
##########
@@ -165,6 +172,38 @@ public class RESTSessionCatalog extends BaseViewSessionCatalog
private Supplier<Map<String, String>> mutationHeaders = Map::of;
private String namespaceSeparator = null;
+ @VisibleForTesting
+ @Value.Immutable
+ abstract static class SessionIdTableId {
+ public abstract String sessionId();
+
+ public abstract TableIdentifier tableIdentifier();
+
+ public static SessionIdTableId of(String sessionId, TableIdentifier ident) {
+ return ImmutableSessionIdTableId.builder()
+ .sessionId(sessionId)
+ .tableIdentifier(ident)
+ .build();
+ }
+ }
+
+ @VisibleForTesting
+ @Value.Immutable
+ abstract static class TableSupplierWithETag {
Review Comment:
this one and `SessionIdTableId` can both be interfaces instead of abstract classes
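For reference, the interface form the reviewer suggests could look roughly like this (a standalone sketch: the `@Value.Immutable` annotation is omitted, a local record stands in for the generated `ImmutableSessionIdTableId`, and a plain `String` replaces `TableIdentifier` so it compiles without the annotation processor or Iceberg on the classpath):

```java
// Sketch only: with Immutables, the interface would carry @Value.Immutable and
// of() would delegate to the generated ImmutableSessionIdTableId builder.
interface SessionIdTableId {
  String sessionId();

  String tableIdentifier();

  static SessionIdTableId of(String sessionId, String ident) {
    // Stand-in for the Immutables-generated value class.
    record Impl(String sessionId, String tableIdentifier) implements SessionIdTableId {}
    return new Impl(sessionId, ident);
  }
}
```

Immutables supports interfaces the same way it supports abstract classes, so the change is purely a matter of dropping the `abstract class` boilerplate.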
##########
core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java:
##########
@@ -470,38 +599,53 @@ public Table loadTable(SessionContext context, TableIdentifier identifier) {
tableMetadata = response.tableMetadata();
}
+ List<Credential> credentials = response.credentials();
RESTClient tableClient = client.withAuthSession(tableSession);
+ Supplier<BaseTable> tableSupplier =
+ () ->
+ createTableSupplier(
+ finalIdentifier, tableMetadata, context, tableClient, tableConf, credentials);
+
+ String eTag = responseHeaders.getOrDefault(HttpHeaders.ETAG, null);
+ if (eTag != null) {
+ tableCache.put(
+ SessionIdTableId.of(context.sessionId(), finalIdentifier),
+ TableSupplierWithETag.of(tableSupplier, eTag));
+ }
+
+ if (metadataType != null) {
+ return MetadataTableUtils.createMetadataTableInstance(tableSupplier.get(), metadataType);
+ }
+
+ return tableSupplier.get();
+ }
+
+ private BaseTable createTableSupplier(
Review Comment:
this isn't really creating the supplier but only the table
##########
core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java:
##########
@@ -165,6 +172,38 @@ public class RESTSessionCatalog extends BaseViewSessionCatalog
private Supplier<Map<String, String>> mutationHeaders = Map::of;
private String namespaceSeparator = null;
+ @VisibleForTesting
+ @Value.Immutable
+ abstract static class SessionIdTableId {
Review Comment:
have you considered creating a separate `RESTTableCache` class which then holds `SessionIdTableId` and `TableSupplierWithETag`? I'm proposing this because this class is already quite large, and it would help to manage the complexity
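To make the proposal concrete, a rough shape for such a class might be the following (hypothetical sketch: a `ConcurrentHashMap` stands in for the Caffeine cache, and `String` replaces `TableIdentifier`/`BaseTable` so the sketch compiles without Iceberg or Caffeine on the classpath):

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical sketch of the proposed RESTTableCache: it owns the cache plus the
// two value types, keeping them out of RESTSessionCatalog. A ConcurrentHashMap
// stands in for the Caffeine cache; String stands in for TableIdentifier/BaseTable.
class RESTTableCache {
  record SessionIdTableId(String sessionId, String tableIdentifier) {}

  record TableSupplierWithETag(Supplier<String> tableSupplier, String eTag) {}

  private final Map<SessionIdTableId, TableSupplierWithETag> cache = new ConcurrentHashMap<>();

  void put(SessionIdTableId key, TableSupplierWithETag value) {
    cache.put(key, value);
  }

  Optional<TableSupplierWithETag> get(SessionIdTableId key) {
    return Optional.ofNullable(cache.get(key));
  }
}
```

The point is structural: `RESTSessionCatalog.loadTable` would only call `tableCache.put(...)`/`tableCache.get(...)`, while key/value types, expiry, and sizing policy live in one focused class.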
##########
core/src/main/java/org/apache/iceberg/rest/RESTCatalogProperties.java:
##########
@@ -43,6 +45,14 @@ private RESTCatalogProperties() {}
public static final String REST_SCAN_PLAN_ID = "rest-scan-plan-id";
+ // Properties that control the behaviour of the table cache used for freshness-aware table
+ // loading.
+ public static final String TABLE_CACHE_EXPIRE_AFTER_WRITE_MS = "rest-table-cache-expire-after-ms";
Review Comment:
alternatively you might want to make sure that the property says
`expire-after-write` in order to clearly indicate what this controls
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]