dawidwys edited a comment on pull request #12649:
URL: https://github.com/apache/flink/pull/12649#issuecomment-644565539


   Let me explain what the changes are about. They are not trying to fix the 
nullability of the record type in all operators, but in my opinion it actually 
improves the behaviour described in the thread you posted (thank you for it, it 
is actually a good read).
   
   First of all we do very much want to respect the original nullability, 
that's the main purpose of those changes and also why we have a special 
handling in `FlinkTypeFactory:348`. We have that because Calcite actually 
modifies nested columns of a `ROW` type, if the type is nullable.
   
   In calcite a type `ROW<i INT NOT NULL>` becomes `ROW<i INT>` See 
`org.apache.calcite.rel.type.RelDataTypeFactoryImpl#createTypeWithNullability:326-338`.
 This behaviour is especially problematic for STRUCTURED types, which 
correspond to java's pojos. Let's take a look at an example:
   ```
   class Address {
     int id;
     String city;
   }
   ```
   such a pojo we would like to map to a `Address(id INT NOT NULL, city 
STRING)`, where the `Address` type itself (outer row) is nullable, but the 
nested `id` field is `NOT NULL`. This is not possible in Calcite by default. 
Thats why we have the changes in 
`org.apache.calcite.rel.type.RelDataTypeFactoryImpl#createTypeWithNullability:326-338`.
   
   On the other hand if we access a nested field of a nullable composite type 
we do want to adjust the nullability, cause accessing a field of a null row 
should obviously produce null values. If we do
   ```
   CREATE TABLE test (
     address Address
   );
   
   // the type should be nullable INT, if the outer row is null we can not 
produce a NOT NULL id
   SELECT t.address.id FROM test t;
   ```
   
   BTW this behaviour is already present in `SqlDotOperator#inferReturnType`, 
but as far as I can tell that method is not used.
   
   To summarize it again. I do very much want to respect the nullability of the 
record attributes. Thats the main goal of the changes. Calcite as of now does 
not respect the nullability of nested fields, which we had to fix in 
FLINK-16344. The changes here affect the return type of field accessors, not 
the record itself.
   
   I hope it gives a better overview what are the changes. If you have any 
questions let me know, I will try to elaborate more.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to