[
https://issues.apache.org/jira/browse/CALCITE-6579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17881204#comment-17881204
]
david radley commented on CALCITE-6579:
---------------------------------------
[~mbudiu] Yes, I agree that issue looks related. From what I see in that issue,
Calcite is modelling its behaviour on PostgreSQL (and probably other
traditional RDBMSs). For the Flink use case we are looking to map the SQL table
definition to an Avro schema, which does allow children to have a different
nullable status from a nullable parent (collection / struct / array). This Avro
pattern is commonly used in the event processing world.
Regarding https://issues.apache.org/jira/browse/CALCITE-6275, I wonder: if
Calcite could set the types' nullabilities in collections as per the PR,
* how would Postgres fail, given that it does not support this SQL?
* if the children's nullable status is set "properly", could we simply not
call fixUpNullability in the Flink case? We could make this method
overridable, keeping the current implementation as the default, and allow
Flink to override it, probably with a factory.
If there is appetite for this sort of solution, I could prototype it and
check that our Avro cases are fixed. WDYT?
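
To make the second bullet concrete, here is a minimal sketch of the idea. It
is a toy model only: Type, NullabilityFixer and PreservingFixer are
hypothetical names invented for illustration and are not Calcite or Flink
classes. The default fixer mimics the behaviour reported in this issue (a
nullable parent makes every child nullable); the subclass shows roughly what
an engine-specific override could do instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Toy model only: none of these types are Calcite or Flink classes. */
class NullabilitySketch {

    /** Minimal stand-in for a SQL type: a name, a nullable flag, child fields. */
    static final class Type {
        final String name;
        final boolean nullable;
        final List<Type> fields; // empty for scalar types

        Type(String name, boolean nullable, List<Type> fields) {
            this.name = name;
            this.nullable = nullable;
            this.fields = fields;
        }

        static Type scalar(String name, boolean nullable) {
            return new Type(name, nullable, Collections.<Type>emptyList());
        }

        Type field(String fieldName) {
            for (Type f : fields) {
                if (f.name.equals(fieldName)) {
                    return f;
                }
            }
            throw new IllegalArgumentException("no field " + fieldName);
        }
    }

    /** Hypothetical extension point standing in for the fix-up step. */
    static class NullabilityFixer {
        /**
         * Default: models the behaviour reported in the issue, where a
         * nullable parent forces every child to nullable. Leaving children
         * untouched for a NOT NULL parent is an assumption of this sketch.
         */
        Type fixUp(Type type, boolean nullable) {
            List<Type> children = new ArrayList<>();
            for (Type f : type.fields) {
                children.add(nullable ? fixUp(f, true) : f);
            }
            return new Type(type.name, nullable, children);
        }
    }

    /** Possible engine override: set the parent flag, keep the children's own flags. */
    static class PreservingFixer extends NullabilityFixer {
        @Override
        Type fixUp(Type type, boolean nullable) {
            return new Type(type.name, nullable, type.fields);
        }
    }

    /** The buyer ROW from the issue: nullable parent with two NOT NULL children. */
    static Type buyer() {
        return new Type("buyer", true, Arrays.asList(
            Type.scalar("first_name", true),
            Type.scalar("last_name", false),
            Type.scalar("title", false)));
    }

    public static void main(String[] args) {
        Type pushed = new NullabilityFixer().fixUp(buyer(), true);
        Type kept = new PreservingFixer().fixUp(buyer(), true);
        System.out.println("default    last_name nullable: " + pushed.field("last_name").nullable);
        System.out.println("overridden last_name nullable: " + kept.field("last_name").nullable);
    }
}
```

In this sketch the choice of fixer is made directly in main; in a real
prototype it would presumably be supplied by whatever factory the engine
configures, so that the default behaviour stays unchanged for everyone else.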
> Unexpected nullable checks for nullable types with non-nullable children
> -------------------------------------------------------------------------
>
> Key: CALCITE-6579
> URL: https://issues.apache.org/jira/browse/CALCITE-6579
> Project: Calcite
> Issue Type: Bug
> Components: core
> Affects Versions: 1.37.0
> Reporter: david radley
> Priority: Major
>
> This Jira is raised for the issue discussed in
> [https://lists.apache.org/thread/x857njnd42gcvrldbcd7y85to6fb58dn]
>
> I am using Flink and have a table definition like this:
> CREATE TABLE source_1 (
>   `order_id` STRING NOT NULL,
>   `order_time` STRING NOT NULL,
>   `buyer` ROW<`first_name` STRING, `last_name` STRING NOT NULL, `title` STRING NOT NULL>
> ) WITH (
>   'connector' = 'kafka', 'topic' = 'mytopic',
>   'properties.bootstrap.servers' = 'localhost:9092', 'value.format' = 'avro',
>   'value.fields-include' = 'ALL', 'scan.startup.mode' = 'earliest-offset'
> );
>
> This fails because the code in Calcite does not create the correct schema:
> it makes both `last_name` STRING NOT NULL and `title` STRING NOT NULL
> nullable.
> The cause is in the table planner, when we convert the SqlDataTypeSpec to
> the RelDataType. We push the nullability of buyer onto all of its fields,
> losing the children’s nullable status. In the debugger I see that it is
> taking the nullable true (from buyer) and putting this on all the children in
> fixUpNullability:
> https://github.com/davidradl/calcite/blob/ad2e843c5d9b3bec001d22e680ebe6b5de4e2078/core/src/main/java/org/apache/calcite/sql/SqlDataTypeSpec.java#L238
> I can see that the SqlDataTypeSpec has this information but it is not used.
>
> Flink maps this directly to an Avro schema that incorrectly polices the Avro
> payload.
>
> In terms of fixing this, I wonder whether the type could be created with the
> correct nullability so that it doesn’t need to be fixed up, or whether we
> should pass down the SqlDataTypeSpec so it has the right information to set
> the nullable status.
>
> This subject has previously been raised on the dev list
> [https://lists.apache.org/thread/s4nd9rk0fzckoctokl7kjdbtfbvxncy7] by
> [~dwysakowicz]
>