rdblue commented on a change in pull request #2953:
URL: https://github.com/apache/iceberg/pull/2953#discussion_r707615179
##########
File path: api/src/main/java/org/apache/iceberg/types/TypeUtil.java
##########
@@ -130,13 +130,13 @@ public static Schema select(Schema schema, Set<Integer> fieldIds) {
   public static Types.StructType selectNot(Types.StructType struct, Set<Integer> fieldIds) {
     Set<Integer> projectedIds = getIdsInternal(struct);
     projectedIds.removeAll(fieldIds);
-    return select(struct, projectedIds);
+    return project(struct, projectedIds);
Review comment:
I think I agree with the decision to not change the behavior of this
method, even though the opposite of "select" behavior would be to fully remove
a struct when its ID is passed in `fieldIds`.
But I don't think that `project` is quite correct either. Consider the
example schema `1: id bigint, 2: location struct<3: lat double, 4: long
double>`. Previously, `selectNot(t, set(3, 4))` would produce `1: id bigint`
and omit the location entirely. Using `project` with the updated
`GetProjectedIds`, the projected ID set will be `{1, 2, 3, 4}` and not `{1, 3,
4}`, so after removing 3 and 4 it is `{1, 2}` rather than `{1}`. That would
result in the same call producing `1: id bigint, 2: location struct<>`, which
introduces a new bug because now there is an unexpected extra field.
To clean this up, I think we need a version of `GetProjectedIds` that
doesn't select structs and uses the old behavior.
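To make the difference concrete, here is a minimal sketch of the two ID-collection behaviors on the example schema. It is a toy model, not Iceberg code: `Field`, `collectIds`, `prune`, and the `includeStructIds` flag are hypothetical stand-ins, and `prune` assumes project-style semantics (a struct is kept when its own ID is in the set, even if none of its children survive). Because the leaf-only ID set never contains a struct ID, the same pruning also reproduces the old result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: none of these types or methods are Iceberg classes.
final class SelectNotSketch {

  static final class Field {
    final int id;
    final String name;
    final String type;          // primitive type name, or "struct"
    final List<Field> children; // null for primitives

    Field(int id, String name, String type, List<Field> children) {
      this.id = id;
      this.name = name;
      this.type = type;
      this.children = children;
    }

    boolean isStruct() {
      return children != null;
    }
  }

  static Field primitive(int id, String name, String type) {
    return new Field(id, name, type, null);
  }

  static Field struct(int id, String name, Field... children) {
    return new Field(id, name, "struct", Arrays.asList(children));
  }

  // includeStructIds = false models the old, leaf-only behavior;
  // includeStructIds = true models the updated behavior that also returns struct IDs.
  static Set<Integer> collectIds(List<Field> fields, boolean includeStructIds) {
    Set<Integer> ids = new TreeSet<>();
    for (Field f : fields) {
      if (f.isStruct()) {
        if (includeStructIds) {
          ids.add(f.id);
        }
        ids.addAll(collectIds(f.children, includeStructIds));
      } else {
        ids.add(f.id);
      }
    }
    return ids;
  }

  // Assumed project-style pruning: a struct survives if its own ID is kept
  // or if at least one of its children survives.
  static List<Field> prune(List<Field> fields, Set<Integer> keep) {
    List<Field> result = new ArrayList<>();
    for (Field f : fields) {
      if (f.isStruct()) {
        List<Field> kids = prune(f.children, keep);
        if (keep.contains(f.id) || !kids.isEmpty()) {
          result.add(new Field(f.id, f.name, f.type, kids));
        }
      } else if (keep.contains(f.id)) {
        result.add(f);
      }
    }
    return result;
  }

  static List<Field> selectNot(List<Field> schema, Set<Integer> fieldIds, boolean includeStructIds) {
    Set<Integer> projectedIds = collectIds(schema, includeStructIds);
    projectedIds.removeAll(fieldIds);
    return prune(schema, projectedIds);
  }

  static String render(List<Field> fields) {
    List<String> parts = new ArrayList<>();
    for (Field f : fields) {
      parts.add(f.isStruct()
          ? f.id + ": " + f.name + " struct<" + render(f.children) + ">"
          : f.id + ": " + f.name + " " + f.type);
    }
    return String.join(", ", parts);
  }

  static List<Field> exampleSchema() {
    return Arrays.asList(
        primitive(1, "id", "bigint"),
        struct(2, "location", primitive(3, "lat", "double"), primitive(4, "long", "double")));
  }

  public static void main(String[] args) {
    Set<Integer> drop = new TreeSet<>(Arrays.asList(3, 4));
    // Leaf-only IDs: {1, 3, 4} minus {3, 4} = {1}
    System.out.println(render(selectNot(exampleSchema(), drop, false))); // 1: id bigint
    // With struct IDs: {1, 2, 3, 4} minus {3, 4} = {1, 2}
    System.out.println(render(selectNot(exampleSchema(), drop, true)));  // 1: id bigint, 2: location struct<>
  }
}
```

The only difference between the two calls is whether the struct's own ID ends up in the set, which is why a leaf-only variant of the ID collection would keep the old `selectNot` output.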
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]