openinx opened a new issue #2738:
URL: https://github.com/apache/iceberg/issues/2738
Let's say we have an Iceberg schema:
```java
Schema schema = new Schema(
    Types.NestedField.required(0, "id", Types.LongType.get()),
    Types.NestedField.optional(3, "location", Types.StructType.of(
        Types.NestedField.required(1, "lat", Types.FloatType.get()),
        Types.NestedField.required(2, "long", Types.FloatType.get())
    ))
);
```
Now suppose someone wants to do a nested projection using this projection schema:
```java
Schema latOnly = new Schema(
    Types.NestedField.optional(3, "location", Types.StructType.of(
        Types.NestedField.required(1, "lat", Types.FloatType.get())
    ))
);
```
If the data row is:
```
{
  "id": 10001,
  "location": null
}
```
Then what is the expected projected value under the projection schema `latOnly`?
Should we set `location.lat` to null even though the field is declared
`required` in `Types.NestedField.required(1, "lat", Types.FloatType.get())`?
I think the current
[StructProjection](https://github.com/apache/iceberg/blob/90225d6c9413016d611e2ce5eff37db1bc1b4fc5/api/src/main/java/org/apache/iceberg/util/StructProjection.java#L115)
does not handle this case correctly: it simply throws a
NullPointerException when projecting the nested required field while the
parent struct is null.
This is related to the broken unit tests from [this
PR](https://github.com/apache/iceberg/pull/2731/files#diff-8b18817c3263d1283b5c4f0f98f2201b51bec5a94a7bc0b4885a447cdcd7ccdbR104-R106).
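
To make the question concrete, here is a minimal, library-free sketch of one possible behavior: when the optional parent struct is null, the projected parent is null as well, and the `required` constraint on `lat` only applies when `location` is present. This is not the Iceberg implementation. `NullSafeProjectionSketch`, `Projection`, and the `Object[]` row layout are hypothetical names and assumptions made for illustration only.

```java
import java.util.Arrays;

/** Hypothetical, library-free model of a nested struct projection (not Iceberg code). */
public class NullSafeProjectionSketch {

    /**
     * Projects a row stored as Object[]; nested structs are Object[] as well.
     * positions[i] is the source position of projected field i;
     * nested[i] is the projection to apply if that field is a struct, else null.
     */
    static final class Projection {
        private final int[] positions;
        private final Projection[] nested;

        Projection(int[] positions, Projection[] nested) {
            this.positions = positions;
            this.nested = nested;
        }

        Object[] project(Object[] row) {
            Object[] out = new Object[positions.length];
            for (int i = 0; i < positions.length; i++) {
                Object value = row[positions[i]];
                if (nested[i] != null) {
                    // Assumed behavior: a null optional parent projects to null
                    // instead of being dereferenced (which would throw an NPE).
                    out[i] = value == null ? null : nested[i].project((Object[]) value);
                } else {
                    out[i] = value;
                }
            }
            return out;
        }
    }

    // Mirrors latOnly: keep only "location" (position 1 in the row),
    // and inside it keep only "lat" (position 0 in the struct).
    static Projection latOnly() {
        Projection latOfLocation = new Projection(new int[] {0}, new Projection[] {null});
        return new Projection(new int[] {1}, new Projection[] {latOfLocation});
    }

    public static void main(String[] args) {
        Object[] withLocation = {10000L, new Object[] {52.5f, 13.4f}};
        Object[] nullLocation = {10001L, null};
        System.out.println(Arrays.deepToString(latOnly().project(withLocation)));
        System.out.println(Arrays.deepToString(latOnly().project(nullLocation)));
    }
}
```

Under this reading, the row with `"location": null` projects to a row whose `location` is null, so the question of a null `location.lat` never arises. Whether that is the desired semantics is exactly what this issue asks.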