kevinjqliu commented on code in PR #2871:
URL: https://github.com/apache/iceberg-python/pull/2871#discussion_r2663963235
##########
pyiceberg/table/update/snapshot.py:
##########
@@ -843,6 +843,13 @@ def _commit(self) -> UpdatesAndRequirements:
         """Apply the pending changes and commit."""
         return self._updates, self._requirements
+    def _commit_if_ref_updates_exist(self) -> None:
+        """Commit any pending ref updates to the transaction."""
+        if self._updates:
+            self._transaction._apply(*self._commit(), commit_transaction_if_autocommit=False)
+            self._updates = ()
+            self._requirements = ()
Review Comment:
I looked up how the Java side implements this functionality:
https://github.com/apache/iceberg/blob/bc7bfa5de4743853d9647ad095322ba71e304221/core/src/main/java/org/apache/iceberg/SnapshotManager.java#L159-L172
It seems that the updates are applied to the transaction state but are not yet
committed to the catalog's table state. Subsequent operations can then
reference the new transaction state.
I think passing `commit_transaction_if_autocommit` here can be confusing,
since we toggle commit behavior through `_autocommit` but then disable it
with `commit_transaction_if_autocommit`.
A better approach might be to add a new function to the Transaction class
called `_stage`, which we can call to update the transaction state without
updating the catalog. We could address this refactor and the chaining in a
follow-up PR.
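To make the `_stage` idea concrete, here is a rough sketch using toy stand-in classes. `ToyCatalog`, `ToyTransaction`, and the method signatures are all hypothetical and only illustrate the separation between staging and committing; they are not the actual pyiceberg `Transaction` API.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCatalog:
    """Hypothetical stand-in for a catalog: holds the committed ref state."""

    committed: dict[str, int] = field(default_factory=dict)


@dataclass
class ToyTransaction:
    """Hypothetical stand-in for Transaction; not the real pyiceberg class."""

    catalog: ToyCatalog
    autocommit: bool = False
    staged: dict[str, int] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # The transaction starts from a copy of the catalog's committed state.
        self.staged = dict(self.catalog.committed)

    def _stage(self, ref_name: str, snapshot_id: int) -> None:
        """Update the transaction state only; never touches the catalog."""
        self.staged[ref_name] = snapshot_id

    def _apply(self, ref_name: str, snapshot_id: int) -> None:
        """Stage the update, then commit if autocommit is enabled."""
        self._stage(ref_name, snapshot_id)
        if self.autocommit:
            self.commit()

    def commit(self) -> None:
        """Publish the staged state to the catalog."""
        self.catalog.committed = dict(self.staged)


catalog = ToyCatalog({"main": 1})
txn = ToyTransaction(catalog, autocommit=True)

txn._stage("audit", 2)  # visible to later operations in this transaction
assert txn.staged["audit"] == 2
assert "audit" not in catalog.committed  # the catalog is untouched so far

txn._apply("main", txn.staged["audit"])  # reads the staged ref, then autocommits
assert catalog.committed == {"main": 2, "audit": 2}
```

With this split, a chained operation can call `_stage` unconditionally and leave the autocommit decision to `_apply`, so no flag is needed to override `_autocommit`.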
##########
pyiceberg/table/update/snapshot.py:
##########
@@ -941,6 +948,44 @@ def remove_branch(self, branch_name: str) -> ManageSnapshots:
         """
         return self._remove_ref_snapshot(ref_name=branch_name)
+    def set_current_snapshot(self, snapshot_id: int | None = None, ref_name: str | None = None) -> ManageSnapshots:
+        """Set the current snapshot to a specific snapshot ID or ref.
+
+        Args:
+            snapshot_id: The ID of the snapshot to set as current.
+            ref_name: The snapshot reference (branch or tag) to set as current.
+
+        Returns:
+            This for method chaining.
+
+        Raises:
+            ValueError: If neither or both arguments are provided, or if the snapshot/ref does not exist.
+        """
+        self._commit_if_ref_updates_exist()
+
+        if (snapshot_id is None) == (ref_name is None):
+            raise ValueError("Either snapshot_id or ref_name must be provided, not both")
+
+        target_snapshot_id: int
+        if snapshot_id is not None:
+            target_snapshot_id = snapshot_id
+        else:
+            if ref_name not in self._transaction.table_metadata.refs:
+                raise ValueError(f"Cannot find matching snapshot ID for ref: {ref_name}")
+            target_snapshot_id = self._transaction.table_metadata.refs[ref_name].snapshot_id
+
+        if self._transaction.table_metadata.snapshot_by_id(target_snapshot_id) is None:
+            raise ValueError(f"Cannot set current snapshot to unknown snapshot id: {target_snapshot_id}")
+
+        update, requirement = self._transaction._set_ref_snapshot(
+            snapshot_id=target_snapshot_id,
+            ref_name=MAIN_BRANCH,
+            type="branch",
Review Comment:
```suggestion
type=SnapshotRefType.BRANCH,
```
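For context on why the enum member is preferable to the string literal, here is a small self-contained illustration. The `SnapshotRefType` class below is a local stand-in written on the assumption that the real one is a `str`-based `Enum` with a `BRANCH = "branch"` member; check the actual definition in pyiceberg before relying on this.

```python
from enum import Enum


class SnapshotRefType(str, Enum):
    """Local stand-in; assumed to mirror the shape of pyiceberg's enum."""

    BRANCH = "branch"
    TAG = "tag"


# A str-based enum member still compares equal to its string value,
# so switching from the literal should not change behavior.
assert SnapshotRefType.BRANCH == "branch"

# A misspelled value fails loudly when converted to the enum,
# whereas a misspelled string literal would pass through unnoticed.
try:
    SnapshotRefType("brnach")
    typo_rejected = False
except ValueError:
    typo_rejected = True
assert typo_rejected
```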
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]