Re: [PR] Support `replace_table` and `replace_table_transaction` [iceberg-python]

via GitHub Mon, 18 May 2026 11:44:58 -0700


smaheshwar-pltr commented on code in PR #3220:
URL: https://github.com/apache/iceberg-python/pull/3220#discussion_r3261232343



##########
mkdocs/docs/api.md:
##########
@@ -185,6 +185,45 @@ with catalog.create_table_transaction(identifier="docs_example.bids", schema=sch
     txn.set_properties(test_a="test_aa", test_b="test_b", test_c="test_c")
 ```
 
+## Replace a table
+
+Atomically replace an existing table's schema, partition spec, sort order, 
location, and properties. The table UUID and history (snapshots, schemas, 
specs, sort orders, metadata log) are preserved; the current snapshot is 
cleared (the `main` branch ref is removed). Use this when you want to redefine 
the table's metadata; pair it with `replace_table_transaction` to atomically 
write new data alongside the metadata change (RTAS-style).
+
+```python
+from pyiceberg.schema import Schema
+from pyiceberg.types import NestedField, LongType, StringType, BooleanType
+
+new_schema = Schema(
+    NestedField(field_id=1, name="datetime", field_type=LongType(), required=False),
+    NestedField(field_id=2, name="symbol", field_type=StringType(), required=False),
+    NestedField(field_id=3, name="active", field_type=BooleanType(), required=False),
+)
+catalog.replace_table(
+    identifier="docs_example.bids",
+    schema=new_schema,
+)
+```
+
+Field IDs from columns whose names appear in the previous schema are reused, 
so existing data files remain readable when the new schema is a compatible 
superset. New columns get fresh IDs above `last-column-id`.
+
+Properties passed to `replace_table` are **merged** with the existing table 
properties (your values override; existing keys you don't pass are preserved). 
To remove a property as part of the replace, use `replace_table_transaction` 
and remove it explicitly within the transaction.
+
+Use `replace_table_transaction` to stage additional changes (writes, property 
updates, schema evolution) before committing — for example, swap the schema and 
write new data atomically:
+
+```python
+with catalog.replace_table_transaction(identifier="docs_example.bids", schema=new_schema) as txn:
+    with txn.update_snapshot().fast_append() as snap:
+        for data_file in _dataframe_to_data_files(table_metadata=txn.table_metadata, df=df, io=txn._table.io):
+            snap.append_data_file(data_file)
+    txn.set_properties(write_replaced_at="2026-04-19T00:00:00Z")

Review Comment:
   Shouldn't we show and test an example where you have a PyArrow table that
you want to replace your table with? You would call
`replace_table_transaction` with that Arrow table's schema and then append the
table on the transaction. This feels like the most common use case by far, no?
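
A minimal sketch of the flow suggested above, for illustration only. It assumes that `replace_table_transaction` (the API proposed in this PR) accepts a PyArrow schema the same way `create_table` does, and that the returned transaction exposes the existing `Transaction.append` method. The helper name `replace_with_arrow_table` is hypothetical and not part of the PR.

```python
def replace_with_arrow_table(catalog, identifier, arrow_table):
    """Replace a table's definition and contents with a PyArrow table in one commit.

    Hypothetical helper: `catalog` is expected to provide the
    `replace_table_transaction` method proposed in this PR.
    """
    # Assumption: the schema argument may be a pyarrow.Schema, as with create_table.
    with catalog.replace_table_transaction(
        identifier=identifier,
        schema=arrow_table.schema,
    ) as txn:
        # Stage the Arrow data on the same transaction, so the metadata
        # replacement and the new snapshot are committed together when the
        # `with` block exits without an exception.
        txn.append(arrow_table)
```

Compared with the `fast_append` loop in the diff, this avoids the private `_dataframe_to_data_files` helper and `txn._table`, which is closer to what a user would write, for example `replace_with_arrow_table(catalog, "docs_example.bids", df)`.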



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


