qinghui-xu commented on issue #15826: URL: https://github.com/apache/iceberg/issues/15826#issuecomment-4171400840
Hello folks,

Thanks for your replies! After digging a little, I think the root cause is that `StructType#field(int id)` only looks up the ids of first-level fields, as @bharos suggested. This is somewhat "mitigated" for Spark 4.1 by [this PR](https://github.com/apache/iceberg/pull/15268), because it introduces [`FieldLookup`](https://github.com/apache/iceberg/blob/main/spark/v4.1/spark/src/main/java/org/apache/iceberg/spark/source/BaseReader.java#L259), which uses the recursive field id lookup from `Schema#findField(int id)`. But the problem still exists for older versions such as Spark 3.5.

So the question now is: should we backport this change to the other Spark versions, or should we make `StructType#field(int id)` recursive? Personally I'd prefer the latter, as it should make the code more robust and seems to address [the TODO comment on DeleteFilter.java](https://github.com/apache/iceberg/blob/da0ad6a7cc2632594f04cb7872bc6f5fb158fd6b/data/src/main/java/org/apache/iceberg/data/DeleteFilter.java#L305). I can send a PR with this approach. What do you think?

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
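To make the difference between the two lookups in the comment above concrete, here is a minimal, self-contained sketch. It does not use Iceberg's classes: `FieldLookupSketch`, `Field`, `topLevelField`, `findField`, and `sampleSchema` are hypothetical stand-ins that only model the behaviour described for `StructType#field(int id)` (direct children only) and `Schema#findField(int id)` (recursive). The real implementations may differ in detail.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in types for illustration; NOT Iceberg's StructType or Schema.
public class FieldLookupSketch {

  static final class Field {
    final int id;
    final String name;
    final List<Field> children; // empty for primitive fields

    Field(int id, String name, Field... children) {
      this.id = id;
      this.name = name;
      this.children = Collections.unmodifiableList(Arrays.asList(children));
    }
  }

  // Models the behaviour described for StructType#field(int id):
  // only the direct children of the struct are checked.
  static Field topLevelField(List<Field> fields, int id) {
    for (Field f : fields) {
      if (f.id == id) {
        return f;
      }
    }
    return null;
  }

  // Models a recursive lookup, as described for Schema#findField(int id):
  // nested structs are searched depth-first.
  static Field findField(List<Field> fields, int id) {
    for (Field f : fields) {
      if (f.id == id) {
        return f;
      }
      Field nested = findField(f.children, id);
      if (nested != null) {
        return nested;
      }
    }
    return null;
  }

  // Example schema: id(1), location(2) { lat(3), lon(4) }
  static List<Field> sampleSchema() {
    return Arrays.asList(
        new Field(1, "id"),
        new Field(2, "location", new Field(3, "lat"), new Field(4, "lon")));
  }

  public static void main(String[] args) {
    List<Field> schema = sampleSchema();
    // The nested id 3 is invisible to the top-level lookup.
    System.out.println(topLevelField(schema, 3) == null); // prints "true"
    // The recursive lookup finds it.
    System.out.println(findField(schema, 3).name); // prints "lat"
  }
}
```

In this model, a caller that resolves a nested field id through the top-level lookup gets `null`, which is the kind of miss the comment attributes to the older Spark readers. Switching the caller to the recursive lookup, or making the struct lookup itself recursive, both remove the miss.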
