dawidwys edited a comment on pull request #12649:
URL: https://github.com/apache/flink/pull/12649#issuecomment-644565539
Let me explain what the changes are about. They are not trying to fix the
nullability of the record type in all operators, but in my opinion it actually
improves the behaviour described in the thread you posted (thank you for it, it
is actually a good read).
First of all we do very much want to respect the original nullability,
that's the main purpose of those changes and also why we have a special
handling in `FlinkTypeFactory:348`. We have that because Calcite actually
modifies nested columns of a `ROW` type, if the type is nullable.
In calcite a type `ROW<i INT NOT NULL>` becomes `ROW<i INT>` See
`org.apache.calcite.rel.type.RelDataTypeFactoryImpl#createTypeWithNullability:326-338`.
This behaviour is especially problematic for STRUCTURED types, which
correspond to java's pojos. Let's take a look at an example:
```
class Address {
int id;
String city;
}
```
such a pojo we would like to map to a `Address(id INT NOT NULL, city
STRING)`, where the `Address` type itself (outer row) is nullable, but the
nested `id` field is `NOT NULL`. This is not possible in Calcite by default.
Thats why we have the changes in
`org.apache.calcite.rel.type.RelDataTypeFactoryImpl#createTypeWithNullability:326-338`.
On the other hand if we access a nested field of a nullable composite type
we do want to adjust the nullability, cause accessing a field of a null row
should obviously produce null values. If we do
```
CREATE TABLE test (
address Address
);
// the type should be nullable INT, if the outer row is null we can not
produce a NOT NULL id
SELECT t.address.id FROM test t;
```
BTW this behaviour is already present in `SqlDotOperator#inferReturnType`,
but as far as I can tell that method is not used.
To summarize it again. I do very much want to respect the nullability of the
record attributes. Thats the main goal of the changes. Calcite as of now does
not respect the nullability of nested fields, which we had to fix in
FLINK-16344. The changes here affect the return type of field accessors, not
the record itself.
I hope it gives a better overview what are the changes. If you have any
questions let me know, I will try to elaborate more.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]