FANNG1 commented on PR #10498:
URL: https://github.com/apache/gravitino/pull/10498#issuecomment-4118301045

   I see two issues with the current approach:
   
   1. The ETag returned by `CreateTableResponse` is not reusable for the 
default `loadTable` request.
   The Iceberg REST spec says the client may send the ETag received from 
`CreateTableResponse` or `LoadTableResponse` in `If-None-Match`. In this PR, 
`createTable` returns an ETag derived from `metadataLocation`, while 
`loadTable` compares against an ETag derived from `metadataLocation + 
snapshots`. For the default `snapshots=all` path, those values differ, so a 
client that follows the spec and reuses the ETag from create will not get `304 
Not Modified` on a subsequent unchanged `loadTable`.
   
   2. We may be leaving performance on the table by always calling `loadTable` 
before checking the ETag.
   If the backend catalog can expose the current `metadataLocation` cheaply, 
for example via something like `SupportsMetadataLocation`, we could 
compute/compare the ETag first and return `304` without loading full table 
metadata. That would make the optimization more effective for freshness-aware 
reads. This should stay as an optional fast path though: if the catalog does 
not support it, the server can fall back to the current `loadTable` flow.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to