kevinjqliu commented on code in PR #2866:
URL: https://github.com/apache/iceberg-python/pull/2866#discussion_r2649878625


##########
pyiceberg/table/update/__init__.py:
##########
@@ -179,13 +179,9 @@ class SetStatisticsUpdate(IcebergBaseModel):
         description="snapshot-id is **DEPRECATED for REMOVAL** since it 
contains redundant information. Use `statistics.snapshot-id` field instead.",
     )
 
-    @model_validator(mode="before")
-    def validate_snapshot_id(cls, data: dict[str, Any]) -> dict[str, Any]:
-        stats = cast(StatisticsFile, data["statistics"])
-
-        data["snapshot_id"] = stats.snapshot_id
-
-        return data
+    @model_validator(mode="after")
+    def validate_snapshot_id(self) -> Self:
+        return self.model_copy(update={"snapshot_id": 
self.statistics.snapshot_id})

Review Comment:
   ```suggestion
       @model_validator(mode="after")
       def validate_snapshot_id(self) -> Self:
           self.snapshot_id = self.statistics.snapshot_id
           return self
   ```
   nit: use direct assignment



##########
pyiceberg/table/update/__init__.py:
##########


Review Comment:
   add a test, something like:
   
   ```
   def test_set_statistics_update_without_snapshot_id(table_v2_with_statistics: 
Table) -> None:
       """Test that SetStatisticsUpdate auto-populates snapshot_id from 
statistics."""
       snapshot_id = table_v2_with_statistics.metadata.current_snapshot_id
   
       blob_metadata = BlobMetadata(
           type="apache-datasketches-theta-v1",
           snapshot_id=snapshot_id,
           sequence_number=2,
           fields=[1],
           properties={"prop-key": "prop-value"},
       )
   
       statistics_file = StatisticsFile(
           snapshot_id=snapshot_id,
           statistics_path="s3://bucket/warehouse/stats.puffin",
           file_size_in_bytes=124,
           file_footer_size_in_bytes=27,
           blob_metadata=[blob_metadata],
       )
   
       # Create update without explicitly providing snapshot_id
       update = SetStatisticsUpdate(statistics=statistics_file)
   
       # Validator should auto-populate snapshot_id from statistics
       assert update.snapshot_id == snapshot_id
   
       new_metadata = update_table_metadata(
           table_v2_with_statistics.metadata,
           (update,),
       )
   
       assert len(new_metadata.statistics) == 2
       updated_statistics = [stat for stat in new_metadata.statistics if 
stat.snapshot_id == snapshot_id]
       assert len(updated_statistics) == 1
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to