gaborkaszab commented on code in PR #14166: URL: https://github.com/apache/iceberg/pull/14166#discussion_r2374807470
########## core/src/test/java/org/apache/iceberg/rest/TestRESTCatalog.java: ########## @@ -887,6 +887,103 @@ public void testTableCredential(String oauth2ServerUri) { oauth2ServerUri); } + @Test + public void testTableRefreshLoadsRefsOnly() { + RESTCatalogAdapter adapter = Mockito.spy(new RESTCatalogAdapter(backendCatalog)); + + RESTCatalog catalog = + new RESTCatalog(SessionCatalog.SessionContext.createEmpty(), (config) -> adapter); + catalog.initialize( + "test", + ImmutableMap.of( + CatalogProperties.URI, + "ignored", + CatalogProperties.FILE_IO_IMPL, + "org.apache.iceberg.inmemory.InMemoryFileIO", + // default loading to refs only + RESTCatalogProperties.SNAPSHOT_LOADING_MODE, + SnapshotMode.REFS.name())); + + if (requiresNamespaceCreate()) { + catalog.createNamespace(TABLE.namespace()); + } + + // Create a table with multiple snapshots + Table table = catalog.createTable(TABLE, SCHEMA); + table + .newFastAppend() + .appendFile( + DataFiles.builder(PartitionSpec.unpartitioned()) + .withPath("/path/to/data-a.parquet") + .withFileSizeInBytes(10) + .withRecordCount(2) + .build()) + .commit(); + + table + .newFastAppend() + .appendFile( + DataFiles.builder(PartitionSpec.unpartitioned()) + .withPath("/path/to/data-b.parquet") + .withFileSizeInBytes(10) + .withRecordCount(2) + .build()) + .commit(); + + ResourcePaths paths = ResourcePaths.forCatalogProperties(Maps.newHashMap()); + + // Respond with only referenced snapshots + Answer<?> refsAnswer = + invocation -> { + LoadTableResponse originalResponse = (LoadTableResponse) invocation.callRealMethod(); + TableMetadata refsMetadata = + TableMetadata.buildFrom(originalResponse.tableMetadata()) + .suppressHistoricalSnapshots() + .build(); + + // don't call snapshots() directly as that would cause to load all snapshots. Instead, + // make sure the snapshots field holds exactly 1 snapshot + assertThat(refsMetadata) + .extracting("snapshots") + .asInstanceOf(InstanceOfAssertFactories.list(Snapshot.class)) + .hasSize(1); + + return LoadTableResponse.builder() + .withTableMetadata(refsMetadata) + .addAllConfig(originalResponse.config()) + .build(); + }; + + Mockito.doAnswer(refsAnswer) + .when(adapter) + .execute( + reqMatcher(HTTPMethod.GET, paths.table(TABLE), Map.of(), Map.of("snapshots", "refs")), + eq(LoadTableResponse.class), + any(), + any()); + + Table refsTable = catalog.loadTable(TABLE); + refsTable.refresh(); Review Comment: Partially scratch what I wrote: there is not much sense to do partial snapshot loading for createTable() since there won't be a list of snapshots anyway. But for subsequent refreshes() or commits() we could still do the partial loading. For this the adjustment for the RESTTableOperations is needed. In that thread I linked above, @nastra said that partial snapshot loading could make sense for RESTTableOperations.refresh() and commit() but only in case there were no snapshots() call since the creation of the ops, or in other works in case current.snapshotsLoaded is false. This flag is not exposed now, but in case we exposed it, then there might be no need to pass SnapshotMode to the RESTTableOps, because snapshptsLoaded can only be false in case REFS were used for loading the table so would be somewhat redundant. WDYT @nastra ? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org