HonahX commented on code in PR #758:
URL: https://github.com/apache/iceberg-python/pull/758#discussion_r1643598048
##########
pyiceberg/table/__init__.py:
##########
@@ -2010,6 +2016,84 @@ def create_branch(
         self._requirements += requirement
         return self
+    def rollback_to_snapshot(self, snapshot_id: int) -> ManageSnapshots:
+        """Rollback the table to the given snapshot id.
+
+        The snapshot needs to be an ancestor of the current table state.
+
+        Args:
+            snapshot_id (int): rollback to this snapshot_id that used to be current.
+        Returns:
+            This for method chaining
+        """
+        self._commit_if_ref_updates_exist()
+        if self._transaction._table.snapshot_by_id(snapshot_id) is None:
+            raise ValidationError(f"Cannot roll back to unknown snapshot id: {snapshot_id}")
+        if snapshot_id not in {
+            ancestor.snapshot_id
+            for ancestor in ancestors_of(self._transaction._table.current_snapshot(), self._transaction.table_metadata)
+        }:
+            raise ValidationError(f"Cannot roll back to snapshot, not an ancestor of the current state: {snapshot_id}")
+
+        update, requirement = self._transaction._set_ref_snapshot(snapshot_id=snapshot_id, ref_name="main", type="branch")
+        self._updates += update
+        self._requirements += requirement
+        return self
+
+    def rollback_to_timestamp(self, timestamp: int) -> ManageSnapshots:
+        """Rollback the table to the snapshot right before the given timestamp.
+
+        The snapshot needs to be an ancestor of the current table state.
+
+        Args:
+            timestamp (int): rollback to the snapshot that used to be current right before this timestamp.
+        Returns:
+            This for method chaining
+        """
+        self._commit_if_ref_updates_exist()
+        if (
+            snapshot := ancestor_right_before_timestamp(
+                self._transaction._table.current_snapshot(), self._transaction.table_metadata, timestamp
Review Comment:
Thanks for the explanation! Yes, you are right. The `rollback` operation
requires rolling back to an ancestor of the current table state. While
reviewing the previous PR, I was focusing on the general use case of getting
the most recent snapshot for a timestamp and overlooked this additional
requirement of `rollback`.
It is great to have another helper method like
`ancestor_right_before_timestamp()` for `rollback`. `snapshot_as_of_timestamp`
will still be useful in cases that do not require "ancestors of current table
state".
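To make the difference concrete, here is a minimal sketch on a toy model. It does not use pyiceberg: `Snap`, `ancestors`, `latest_ancestor_before`, and `latest_any_before` are hypothetical names invented for illustration, and the "general" lookup is modelled as a search over all known snapshots, which is an assumption rather than a description of how `snapshot_as_of_timestamp` is implemented. The history has snapshot 3 abandoned by an earlier rollback to 2, after which snapshot 4 was committed.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, Optional


@dataclass(frozen=True)
class Snap:
    """Toy stand-in for a snapshot (hypothetical, not the pyiceberg class)."""

    snapshot_id: int
    parent_id: Optional[int]
    timestamp_ms: int


def ancestors(current: Optional[Snap], by_id: Dict[int, Snap]) -> Iterator[Snap]:
    """Yield `current`, then its parent, grandparent, and so on."""
    snap = current
    while snap is not None:
        yield snap
        snap = by_id.get(snap.parent_id) if snap.parent_id is not None else None


def latest_ancestor_before(
    current: Optional[Snap], by_id: Dict[int, Snap], timestamp_ms: int
) -> Optional[Snap]:
    """Newest snapshot in the current lineage committed strictly before `timestamp_ms`."""
    candidates = [s for s in ancestors(current, by_id) if s.timestamp_ms < timestamp_ms]
    return max(candidates, key=lambda s: s.timestamp_ms, default=None)


def latest_any_before(by_id: Dict[int, Snap], timestamp_ms: int) -> Optional[Snap]:
    """Newest known snapshot before `timestamp_ms`, regardless of lineage."""
    candidates = [s for s in by_id.values() if s.timestamp_ms < timestamp_ms]
    return max(candidates, key=lambda s: s.timestamp_ms, default=None)


# History: 1 -> 2 -> 3, then a rollback to 2, then 4 committed on top of 2.
snaps = [
    Snap(snapshot_id=1, parent_id=None, timestamp_ms=100),
    Snap(snapshot_id=2, parent_id=1, timestamp_ms=200),
    Snap(snapshot_id=3, parent_id=2, timestamp_ms=300),  # no longer an ancestor of current
    Snap(snapshot_id=4, parent_id=2, timestamp_ms=400),
]
by_id = {s.snapshot_id: s for s in snaps}
current = by_id[4]

general = latest_any_before(by_id, 350)
for_rollback = latest_ancestor_before(current, by_id, 350)
print(general.snapshot_id, for_rollback.snapshot_id)
```

For timestamp 350 the lineage-agnostic lookup picks snapshot 3, while the ancestor-restricted lookup picks snapshot 2, which is the only one of the two that `rollback` could accept.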
Thanks for implementing `rollback` and `set_current_snapshot` so quickly. I
will do a full review ASAP.