Re: [PR] Core, Spark: Fix equality deletes non-deterministic schema ordering (#13873) [iceberg]

via GitHub Fri, 06 Mar 2026 11:08:34 -0800


rdblue commented on code in PR #15514:
URL: https://github.com/apache/iceberg/pull/15514#discussion_r2897390811



##########
data/src/main/java/org/apache/iceberg/data/DeleteFilter.java:
##########
@@ -210,7 +211,12 @@ private List<Predicate<T>> applyEqDeletes() {
       Set<Integer> ids = entry.getKey();
       Iterable<DeleteFile> deletes = entry.getValue();
 
-      Schema deleteSchema = TypeUtil.select(requiredSchema, ids);
+      // Canonicalize the delete schema by sorting fields by ID so that the same set of equality
+      // field IDs always produces the same schema, regardless of requiredSchema column ordering.
+      Schema selectedSchema = TypeUtil.select(requiredSchema, ids);
+      List<Types.NestedField> sortedFields = Lists.newArrayList(selectedSchema.columns());
+      sortedFields.sort(Comparator.comparingInt(Types.NestedField::fieldId));
+      Schema deleteSchema = new Schema(sortedFields);

Review Comment:
   Minor: It doesn't seem like this logic needs to be here. It's basically 
saying that we want `select` to produce a schema that matches the order of 
`ids` rather than the order of the schema we are projecting. I think that 
sounds more like a `TypeUtil` helper than just embedding code here.
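
   For illustration only, here is a minimal, library-free sketch of what such a helper could look like. The names `selectInIdOrder` and `Field` are hypothetical; `Field` merely stands in for `Types.NestedField`, and nothing here is an existing `TypeUtil` method. It follows the diff above by ordering the selected fields by field ID, so the result does not depend on the column order of the schema being projected.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SelectInIdOrderSketch {

  // Hypothetical stand-in for Types.NestedField: just a field ID and a name.
  static final class Field {
    private final int id;
    private final String name;

    Field(int id, String name) {
      this.id = id;
      this.name = name;
    }

    int id() {
      return id;
    }

    String name() {
      return name;
    }
  }

  // Hypothetical helper: keep only the columns whose ID is in ids, then sort
  // them by field ID so the output order is independent of the input order.
  static List<Field> selectInIdOrder(List<Field> columns, Set<Integer> ids) {
    List<Field> selected = new ArrayList<>();
    for (Field field : columns) {
      if (ids.contains(field.id())) {
        selected.add(field);
      }
    }
    selected.sort(Comparator.comparingInt(Field::id));
    return selected;
  }

  public static void main(String[] args) {
    Set<Integer> ids = new HashSet<>(Arrays.asList(1, 2));

    // Two projections holding the same columns in different orders.
    List<Field> first =
        Arrays.asList(new Field(2, "name"), new Field(1, "id"), new Field(3, "ts"));
    List<Field> second =
        Arrays.asList(new Field(1, "id"), new Field(3, "ts"), new Field(2, "name"));

    // Both calls yield the fields in the order id(1), name(2).
    for (Field field : selectInIdOrder(first, ids)) {
      System.out.println(field.id() + ": " + field.name());
    }
    for (Field field : selectInIdOrder(second, ids)) {
      System.out.println(field.id() + ": " + field.name());
    }
  }
}
```

   With a helper of this shape, the call site in `applyEqDeletes` would reduce to a single call, and the ordering rule would live in one place instead of being embedded in `DeleteFilter`.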



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

