PSaba opened a new issue, #4656:
URL: https://github.com/apache/polaris/issues/4656

   ### Describe the bug
   
   `GET /namespaces/{ns}/tables/{table}?snapshots=refs` always returns a 
LoadTableResult with metadata-location absent, even for tables that have 
committed snapshots and a known metadata file on storage. The same request with 
`?snapshots=all` correctly returns metadata-location for the same tables.
   
   ### To Reproduce
   
   1. Load any committed Iceberg table — metadata-location is present
   ```
     curl -H "Authorization: Bearer $TOKEN" \
       "https://<polaris-host>/api/catalog/v1/<catalog>/namespaces/<ns>/tables/<table>" \
       | jq '."metadata-location"'
   ```
   returns `"s3://bucket/path/to/v3.metadata.json"`
   2. Same table, add ?snapshots=refs — metadata-location is absent  
   ```
     curl -H "Authorization: Bearer $TOKEN" \
       "https://<polaris-host>/api/catalog/v1/<catalog>/namespaces/<ns>/tables/<table>?snapshots=refs" \
       | jq '."metadata-location"'
   ```
   returns `null`
   
   ### Actual Behavior
   
   `metadata-location` is null / absent in the LoadTableResult whenever 
`?snapshots=refs` is used, regardless of the table's commit history.
   
   ### Expected Behavior
   
   `metadata-location` should be populated with the current metadata file path 
regardless of which `?snapshots` mode is requested. The `?snapshots` parameter 
controls which snapshots appear in `metadata.snapshots`; it should have no 
effect on the top-level `metadata-location` pointer, which reflects catalog 
state, not snapshot history.
   
   ### Additional context
   
   Root cause
   
   IcebergCatalogHandler.filterResponseToSnapshots() builds the filtered 
response like this:
   ```
   TableMetadata filteredMetadata =
       metadata.removeSnapshotsIf(s -> !referencedSnapshotIds.contains(s.snapshotId()));

   return LoadTableResponse.builder()
       .withTableMetadata(filteredMetadata)
       .addAllConfig(loadTableResponse.config())
       .addAllCredentials(loadTableResponse.credentials())
       .build();
   ```
   `TableMetadata.removeSnapshotsIf()` calls
   `new Builder(this).removeSnapshots(toRemove).build()`. Iceberg's
   `TableMetadata.Builder` enforces the invariant that `metadataFileLocation`
   must be null when the metadata object has pending changes (snapshot removals
   count as changes, since the resulting object hasn't been written to a file
   yet). So `filteredMetadata.metadataFileLocation()` is always null.
   
   `LoadTableResponse.Builder.withTableMetadata()` derives `metadataLocation`
   solely from `tableMetadata.metadataFileLocation()`; there is no independent
   setter for it on the builder. The original `metadata-location` from the
   unfiltered response is silently discarded.
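
   To make the chain above easier to follow, here is a minimal, self-contained
   sketch of the same behaviour. The classes `SketchMetadata`, `SketchResponse`
   and `MetadataLocationLossSketch` are hypothetical stand-ins written for this
   issue, not the Iceberg or Polaris classes. They only model the two
   properties described above: a changed metadata copy carries no file
   location, and the response takes its location only from the metadata
   object. The sketch does not try to reproduce how Iceberg treats a removal
   that matches no snapshots.

   ```java
   import java.util.ArrayList;
   import java.util.List;
   import java.util.Set;
   import java.util.function.Predicate;

   // Hypothetical stand-in for table metadata; NOT org.apache.iceberg.TableMetadata.
   final class SketchMetadata {
     final String metadataFileLocation; // null when the object has unwritten changes
     final List<Long> snapshotIds;

     SketchMetadata(String metadataFileLocation, List<Long> snapshotIds) {
       this.metadataFileLocation = metadataFileLocation;
       this.snapshotIds = List.copyOf(snapshotIds);
     }

     // Models the behaviour described in this issue: the filtered copy has not
     // been written to a file, so it is returned without a file location.
     SketchMetadata removeSnapshotsIf(Predicate<Long> shouldRemove) {
       List<Long> kept = new ArrayList<>();
       for (Long id : snapshotIds) {
         if (!shouldRemove.test(id)) {
           kept.add(id);
         }
       }
       return new SketchMetadata(null, kept);
     }
   }

   // Hypothetical stand-in for the load-table response.
   final class SketchResponse {
     final String metadataLocation;
     final SketchMetadata metadata;

     private SketchResponse(String metadataLocation, SketchMetadata metadata) {
       this.metadataLocation = metadataLocation;
       this.metadata = metadata;
     }

     // The location is derived only from the metadata object, with no separate setter.
     static SketchResponse fromMetadata(SketchMetadata metadata) {
       return new SketchResponse(metadata.metadataFileLocation, metadata);
     }
   }

   final class MetadataLocationLossSketch {
     // Same shape as the filtering step quoted above: filter, then rebuild the response.
     static SketchResponse filterToRefs(SketchResponse original, Set<Long> referenced) {
       SketchMetadata filtered =
           original.metadata.removeSnapshotsIf(id -> !referenced.contains(id));
       return SketchResponse.fromMetadata(filtered);
     }

     public static void main(String[] args) {
       SketchMetadata committed =
           new SketchMetadata("s3://bucket/path/to/v3.metadata.json", List.of(1L, 2L, 3L));
       SketchResponse original = SketchResponse.fromMetadata(committed);
       SketchResponse filtered = filterToRefs(original, Set.of(3L));
       System.out.println("original: " + original.metadataLocation);
       System.out.println("filtered: " + filtered.metadataLocation);
     }
   }
   ```

   In this model the filtered response keeps only the referenced snapshot, but
   its location is null because nothing carries the original value across the
   rebuild.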
   
   Proposed fix
   
     Add a withMetadataLocation(String) method to LoadTableResponse.Builder in 
Iceberg, then update filterResponseToSnapshots to explicitly preserve the 
original value:
   
   ```
   return LoadTableResponse.builder()
       .withTableMetadata(filteredMetadata)
       .withMetadataLocation(loadTableResponse.metadataLocation()) // preserve original
       .addAllConfig(loadTableResponse.config())
       .addAllCredentials(loadTableResponse.credentials())
       .build();
   ```
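
   One detail that may be worth settling when adding the setter is call order:
   if `withTableMetadata()` keeps overwriting the location, calling it after
   `withMetadataLocation()` would drop the value again. The sketch below shows
   one possible way to avoid that by resolving the location in `build()`. The
   classes `FixSketchMetadata`, `FixSketchResponse` and
   `PreserveLocationSketch` are hypothetical stand-ins, not the Iceberg
   classes, and the precedence rule (explicit value wins) is an assumption
   rather than a decided design.

   ```java
   // Hypothetical stand-in for table metadata; NOT org.apache.iceberg.TableMetadata.
   final class FixSketchMetadata {
     final String metadataFileLocation; // null for a filtered, not-yet-written copy

     FixSketchMetadata(String metadataFileLocation) {
       this.metadataFileLocation = metadataFileLocation;
     }
   }

   // Hypothetical stand-in for the load-table response and its builder.
   final class FixSketchResponse {
     final String metadataLocation;
     final FixSketchMetadata metadata;

     private FixSketchResponse(String metadataLocation, FixSketchMetadata metadata) {
       this.metadataLocation = metadataLocation;
       this.metadata = metadata;
     }

     static Builder builder() {
       return new Builder();
     }

     static final class Builder {
       private FixSketchMetadata metadata;
       private String explicitLocation; // set only through withMetadataLocation

       Builder withTableMetadata(FixSketchMetadata tableMetadata) {
         this.metadata = tableMetadata;
         return this;
       }

       // Models the setter proposed in this issue on the stand-in builder.
       Builder withMetadataLocation(String location) {
         this.explicitLocation = location;
         return this;
       }

       FixSketchResponse build() {
         // Resolve at build time so the result does not depend on call order:
         // an explicit location wins, otherwise fall back to the metadata's own.
         String fromMetadata = metadata != null ? metadata.metadataFileLocation : null;
         String location = explicitLocation != null ? explicitLocation : fromMetadata;
         return new FixSketchResponse(location, metadata);
       }
     }
   }

   final class PreserveLocationSketch {
     public static void main(String[] args) {
       String original = "s3://bucket/path/to/v3.metadata.json";
       FixSketchMetadata filtered = new FixSketchMetadata(null);
       FixSketchResponse response = FixSketchResponse.builder()
           .withMetadataLocation(original)
           .withTableMetadata(filtered) // called last on purpose
           .build();
       System.out.println(response.metadataLocation);
     }
   }
   ```

   Without an explicit location the stand-in builder still falls back to the
   metadata's own file location, so existing callers would see no change.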
   
   
   
   
   ### System information
   
   Affected endpoint: GET /v1/{prefix}/namespaces/{namespace}/tables/{table}?snapshots=refs
   Object storage: S3
   Iceberg version: 1.10.x


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
