dimas-b commented on code in PR #3360:
URL: https://github.com/apache/polaris/pull/3360#discussion_r2665991708
##########
runtime/service/src/main/java/org/apache/polaris/service/catalog/iceberg/IcebergCatalogHandler.java:
##########
@@ -1042,57 +1043,71 @@ public void commitTransaction(CommitTransactionRequest
commitTransactionRequest)
new TransactionWorkspaceMetaStoreManager(diagnostics,
metaStoreManager);
((IcebergCatalog)
baseCatalog).setMetaStoreManager(transactionMetaStoreManager);
- commitTransactionRequest.tableChanges().stream()
- .forEach(
- change -> {
- Table table = baseCatalog.loadTable(change.identifier());
- if (!(table instanceof BaseTable baseTable)) {
- throw new IllegalStateException(
- "Cannot wrap catalog that does not produce BaseTable");
- }
- if (isCreate(change)) {
- throw new BadRequestException(
- "Unsupported operation: commitTranaction with
updateForStagedCreate: %s",
- change);
- }
+ // Group all changes by table identifier to handle them atomically
+ // This prevents conflicts when multiple changes target the same table
entity
+ // LinkedHashMap preserves insertion order for deterministic processing
+ Map<TableIdentifier, List<UpdateTableRequest>> changesByTable = new
LinkedHashMap<>();
+ for (UpdateTableRequest change : commitTransactionRequest.tableChanges()) {
+ if (isCreate(change)) {
+ throw new BadRequestException(
+ "Unsupported operation: commitTranaction with
updateForStagedCreate: %s", change);
+ }
+ changesByTable.computeIfAbsent(change.identifier(), k -> new
ArrayList<>()).add(change);
+ }
- TableOperations tableOps = baseTable.operations();
- TableMetadata currentMetadata = tableOps.current();
-
- // Validate requirements; any CommitFailedExceptions will fail
the overall request
- change.requirements().forEach(requirement ->
requirement.validate(currentMetadata));
-
- // Apply changes
- TableMetadata.Builder metadataBuilder =
TableMetadata.buildFrom(currentMetadata);
- change.updates().stream()
- .forEach(
- singleUpdate -> {
- // Note: If location-overlap checking is refactored to
be atomic, we could
- // support validation within a single multi-table
transaction as well, but
- // will need to update the
TransactionWorkspaceMetaStoreManager to better
- // expose the concept of being able to read
uncommitted updates.
- if (singleUpdate instanceof MetadataUpdate.SetLocation
setLocation) {
- if
(!currentMetadata.location().equals(setLocation.location())
- && !realmConfig.getConfig(
-
FeatureConfiguration.ALLOW_NAMESPACE_LOCATION_OVERLAP)) {
- throw new BadRequestException(
- "Unsupported operation: commitTransaction
containing SetLocation"
- + " for table '%s' and new location '%s'",
- change.identifier(),
- ((MetadataUpdate.SetLocation)
singleUpdate).location());
- }
- }
-
- // Apply updates to builder
- singleUpdate.applyTo(metadataBuilder);
- });
-
- // Commit into transaction workspace we swapped the baseCatalog
to use
- TableMetadata updatedMetadata = metadataBuilder.build();
- if (!updatedMetadata.changes().isEmpty()) {
- tableOps.commit(currentMetadata, updatedMetadata);
+ // Process each table's changes in order
+ changesByTable.forEach(
Review Comment:
The new code logic looks correct to me. I think it's a worthy change to
merge in its own right.
However, as for the issue discussed in
https://github.com/apache/polaris/pull/3352#discussion_r2663070012 , I think
this fix is effective, but it's not apparent that it will work correctly.
The basic problem is that Persistence is called with multiple entity objects
for the same ID. This means that the `updateEntitiesPropertiesIfNotChanged`
call on line must contain at most one entity per ID. This may or may not be
true depending on what the catalog forwards to `transactionMetaStoreManager`
for each commit on line 1108.
I do believe that one `tableOps.commit()` will result in one entity update,
so it may be sufficient to just add a comment about that around line 1113. WDYT?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]