[
https://issues.apache.org/jira/browse/FLINK-33217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17775748#comment-17775748
]
Sergey Nuyanzin commented on FLINK-33217:
-----------------------------------------
After looking a bit deeper, it looks like this is not a Calcite issue.
For example, this query
{code:sql}
with Book as (
SELECT *
FROM
(
VALUES
ROW (array[1, 2])
, ROW (array[11])
, ROW (array[22])
) Book (authorId)
) select * from Book b left join unnest(b.authorId) on true
{code}
works fine in Calcite while failing in Flink with a similar issue.
I think the reason is that Calcite has a dedicated {{UNNEST}} operator [1]
which for some reason is not used in Flink. Instead there is
{{LogicalUnnestRule}} [2], which tries to translate the result of unnest into a
table scan, and this is where the error happens.
Based on the code of this rule
{code:scala}
relNode match {
  case rs: HepRelVertex =>
    convert(getRel(rs))
  case f: LogicalProject =>
    ...
  case f: LogicalFilter =>
    ...
  case uc: Uncollect =>
    ...
}
{code}
there could be four different cases failing with the same or a similar error
when joining with unnest.
Current thoughts on how to fix it:
1. Move to Calcite's {{UNNEST}} operator (however, it is still not clear why it
was not used in the first place...)
2. While parsing and building the AST, and also during conversion, Calcite
rewrites {{LEFT JOIN}} into something with a nullable type on the right-hand
side, which is also part of the cause; we could adjust the conversion so this
is not done for {{LEFT JOIN UNNEST}}
3. We could try handling this in {{LogicalUnnestRule}} by widening types, e.g.
forcing them to be nullable... however, this could lead to wrong final types
(e.g. nullable instead of not nullable)
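As context for option 2, this is standard {{LEFT JOIN}} semantics: left-input rows without a match are padded with {{NULL}} on the right, so the joined row type must mark right-side columns nullable regardless of how they were declared. An illustrative sketch (table and column names are made up):
{code:sql}
-- t2.v may be declared NOT NULL, but after a LEFT JOIN the joined
-- row type must allow NULL for v, because unmatched t1 rows are
-- padded with NULLs on the right side
SELECT t1.id, t2.v
FROM t1 LEFT JOIN t2 ON t1.id = t2.id
{code}
This is exactly the nullability mismatch visible in the error below: the rel before registration has a nullable {{bd_name}}, while the re-registered rel derives {{NOT NULL}} from the array element type.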
[1]
https://github.com/apache/calcite/blob/bf56743554ea27d250d41db2eb83806f9e626b55/core/src/main/java/org/apache/calcite/sql/SqlUnnestOperator.java#L34
[2]
https://github.com/apache/flink/blob/91d81c427aa6312841ca868d54e8ce6ea721cd60/flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/planner/plan/rules/logical/LogicalUnnestRule.scala#L52
> Flink SQL: UNNEST fails on LEFT JOIN with NOT NULL type in array
> ---------------------------------------------------------------------
>
> Key: FLINK-33217
> URL: https://issues.apache.org/jira/browse/FLINK-33217
> Project: Flink
> Issue Type: Bug
> Components: Table SQL / Planner
> Affects Versions: 1.15.3, 1.18.0, 1.19.0
> Reporter: Robert Metzger
> Priority: Major
> Attachments: UnnestNullErrorTest.scala
>
>
> Steps to reproduce:
> Take a column of type
> {code:java}
> business_data ARRAY<STRING NOT NULL>
> {code}
> Take this query
> {code:java}
> select bd_name from reproduce_unnest LEFT JOIN
> UNNEST(reproduce_unnest.business_data) AS exploded_bd(bd_name) ON true
> {code}
> And get this error
> {code:java}
> Caused by: java.lang.AssertionError: Type mismatch:
> rowtype of rel before registration: RecordType(VARCHAR(2147483647) CHARACTER
> SET "UTF-16LE" NOT NULL ARRAY business_data, VARCHAR(2147483647) CHARACTER
> SET "UTF-16LE" bd_name) NOT NULL
> rowtype of rel after registration: RecordType(VARCHAR(2147483647) CHARACTER
> SET "UTF-16LE" NOT NULL ARRAY business_data, VARCHAR(2147483647) CHARACTER
> SET "UTF-16LE" NOT NULL f0) NOT NULL
> Difference:
> bd_name: VARCHAR(2147483647) CHARACTER SET "UTF-16LE" -> VARCHAR(2147483647)
> CHARACTER SET "UTF-16LE" NOT NULL
> at org.apache.calcite.util.Litmus$1.fail(Litmus.java:32)
> at org.apache.calcite.plan.RelOptUtil.equal(RelOptUtil.java:2206)
> at
> org.apache.calcite.rel.AbstractRelNode.onRegister(AbstractRelNode.java:275)
> at
> org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1270)
> at
> org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:598)
> at
> org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:613)
> at
> org.apache.calcite.plan.volcano.VolcanoPlanner.changeTraits(VolcanoPlanner.java:498)
> at
> org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:315)
> at
> org.apache.flink.table.planner.plan.optimize.program.FlinkVolcanoProgram.optimize(FlinkVolcanoProgram.scala:62)
> {code}
> I have implemented a small test case, which fails against Flink 1.15, 1.18
> and the latest master branch.
> Workarounds:
> 1. Drop "NOT NULL" in array type
> 2. Drop "LEFT" from "LEFT JOIN".
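> The two workarounds above, applied to this reproduction (illustrative sketch,
> using the column and table names from the test case):
> {code:sql}
> -- Workaround 1: declare the element type without NOT NULL:
> --   business_data ARRAY<STRING>
> -- Workaround 2: use a plain join instead of LEFT JOIN:
> select bd_name from reproduce_unnest JOIN
> UNNEST(reproduce_unnest.business_data) AS exploded_bd(bd_name) ON true
> {code}
> Both avoid the nullability mismatch, at the cost of either losing the NOT NULL
> constraint or dropping rows with empty arrays.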
--
This message was sent by Atlassian Jira
(v8.20.10#820010)