FANNG1 commented on PR #10675:
URL: https://github.com/apache/gravitino/pull/10675#issuecomment-4205268980

   @jerryshao @royi My biggest concern is still the `atomic` semantics.
   
   The current two-phase `validate-then-commit` approach is definitely better 
than committing table by table directly, because requirement failures are 
caught before any commit starts. But once it enters the commit phase, it is 
still only `best-effort`, not a truly atomic multi-table commit. If one table 
commit succeeds and a later one fails, we can still end up with a 
partial-success state.
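
   To make the failure mode concrete, here is a minimal sketch of a 
validate-then-commit loop. It is not the PR's code: `Table`, `CommitFailed`, 
and `commit_best_effort` are hypothetical names, and a version counter stands 
in for the real table requirements.

```python
class CommitFailed(Exception):
    """Raised when a requirement check or a single-table commit fails."""


class Table:
    """Hypothetical stand-in for one table's metadata state."""

    def __init__(self, name, version, fail_on_commit=False):
        self.name = name
        self.version = version
        self.fail_on_commit = fail_on_commit  # simulates a late failure

    def commit(self):
        if self.fail_on_commit:
            raise CommitFailed(f"commit failed for {self.name}")
        self.version += 1


def commit_best_effort(changes):
    """changes: list of (table, expected_version) pairs.

    Returns (names_committed, name_that_failed_or_None).
    """
    # Phase 1: validate every requirement before touching any table.
    for table, expected in changes:
        if table.version != expected:
            raise CommitFailed(f"requirement failed for {table.name}")

    # Phase 2: commit one table at a time. Nothing rolls back earlier
    # commits, so a failure here leaves a partial-success state.
    committed = []
    for table, _ in changes:
        try:
            table.commit()
        except CommitFailed:
            return committed, table.name
        committed.append(table.name)
    return committed, None
```

   With two tables where the second fails during phase 2, the first stays 
committed. Phase 1 only protects against failures that are detectable up front.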
   
   So from my perspective, real atomicity has to come from the underlying 
catalog/metastore capability, not just from the REST layer. For Gravitino-owned 
catalogs, a stronger solution would likely require something like staging 
multi-table metadata changes first, then switching metadata pointers together 
with a final atomic/CAS-style step.
   
   Polaris seems to go a bit further in this direction, but it is still 
important to be precise about the semantics. Its `commitTransaction` is not a 
full distributed 2PC either. As I understand it, Polaris:
   
   - checks authorization for each table in the request
   - groups and validates updates per table
   - stages metastore entity changes in a transaction workspace
   - only at the end performs a batch CAS to switch multiple metadata pointers 
together
   
   That gives stronger semantics for Polaris-managed metadata pointers, but it 
is still not a fully global atomic commit: metadata files may already have 
been written before the final CAS, so a failure can still leave unreferenced 
metadata files behind.
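
   A rough sketch of that shape, under my own assumptions rather than 
Polaris's actual code: `PointerStore`, `batch_cas`, and `commit_transaction` 
are hypothetical names, and an in-memory dict guarded by a lock stands in for 
the metastore.

```python
import threading


class PointerStore:
    """Hypothetical metastore: table name -> current metadata location."""

    def __init__(self, pointers):
        self._pointers = dict(pointers)
        self._lock = threading.Lock()

    def get(self, table):
        return self._pointers[table]

    def batch_cas(self, swaps):
        """swaps: {table: (expected_location, new_location)}.

        Switches every pointer, or none if any expectation is stale.
        """
        with self._lock:
            for table, (expected, _) in swaps.items():
                if self._pointers.get(table) != expected:
                    return False
            for table, (_, new) in swaps.items():
                self._pointers[table] = new
            return True


def commit_transaction(store, written_files, swaps):
    """Returns the set of metadata files left unreferenced (empty on success)."""
    # New metadata files are written before the pointers move.
    # written_files is a set standing in for object storage.
    new_files = {new for _, new in swaps.values()}
    written_files.update(new_files)

    # Final step: one all-or-nothing switch of every pointer.
    if store.batch_cas(swaps):
        return set()

    # The pointers are untouched, but the files written above remain
    # in storage and need separate cleanup.
    return new_files
```

   The pointer switch is all-or-nothing, but the storage writes are not part 
of it, which is where the leftover files come from.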
   
   I'd like to get your thoughts: should we first merge a `best-effort` 
implementation like this, with the semantics clearly documented, or should we 
wait until we have stronger catalog-level support before exposing this endpoint?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

