Hi all,
I'd like to present two Calcite-related problems recently reported in the
Flink community.
1. Using a NULL literal causes an NPE.
It seems that the constant NULL in Calcite is represented as a RexLiteral
whose value (a Comparable) is null. In RexUtil.gatherConstraint(), equals()
is invoked on the value returned by that NULL RexLiteral; since the value
is null, the call throws the NPE.
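To make the failure mode concrete, here is a minimal sketch in plain Java.
It does not use Calcite: the Literal class is a hypothetical stand-in for a
literal whose value may be null, and the method names are made up for
illustration. It only shows why calling equals() on a null value fails and
what a null-safe comparison could look like; it is not a proposed patch.

```java
import java.util.Objects;

public class NullLiteralSketch {
    // Hypothetical stand-in for a literal with a nullable value
    // (an assumption for illustration, not Calcite's RexLiteral).
    static final class Literal {
        final Comparable<?> value;

        Literal(Comparable<?> value) {
            this.value = value;
        }

        Comparable<?> getValue() {
            return value;
        }
    }

    // Unsafe: throws NullPointerException when a.getValue() is null,
    // because equals() is invoked on a null reference.
    static boolean unsafeSameValue(Literal a, Literal b) {
        return a.getValue().equals(b.getValue());
    }

    // Null-safe: Objects.equals handles a null on either side.
    static boolean safeSameValue(Literal a, Literal b) {
        return Objects.equals(a.getValue(), b.getValue());
    }

    public static void main(String[] args) {
        Literal nullLiteral = new Literal(null);
        Literal five = new Literal(5);

        System.out.println(safeSameValue(nullLiteral, five));

        try {
            unsafeSameValue(nullLiteral, five);
        } catch (NullPointerException e) {
            System.out.println("NPE from equals() on a null value");
        }
    }
}
```

Note that this compares Java values only. Whether two NULL constants should
be treated as equal at that point in the planner is a separate question.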
2. A left outer join with a TableFunction works incorrectly.
For instance, given a simple table {WordCount(word:String, frequency:Int)},
a table function {split: word:String => (letter:String, length:Int)},
and the query "SELECT word, letter, length FROM WordCount LEFT JOIN LATERAL
TABLE(split(word)) AS T (letter, length) ON frequency = length OR length <
5", the query is translated to the logical plan below.
LogicalProject(word=[$0], name=[$2], length=[$3])
  LogicalFilter(condition=[OR(=($1, CAST($3):BIGINT), <($3, 5))])
    LogicalCorrelate(correlation=[$cor0], joinType=[left], requiredColumns=[{0}])
      LogicalTableScan(table=[[WordCount]])
      LogicalTableFunctionScan(invocation=[split($cor0.word)], rowType=[RecordType(VARCHAR(65536) _1, INTEGER _2)], elementType=[class [Ljava.lang.Object;])
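This logical plan may lead to an improper physical plan, which first
correlates each row with its table function results (just like performing a
cartesian product) and only then filters the rows. Because the join
condition ends up in a filter above the correlate, a WordCount row whose
function results all fail the condition is dropped instead of being kept
with NULLs for letter and length. IMO, this only works for an inner join,
not for a left outer join.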
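The difference can be sketched in plain Java without Calcite or Flink. The
split() below is a hypothetical table function (one row per character,
with length set to the word length); the real split may behave differently.
The two methods mimic the plan shape above (correlate, then filter) and the
result I would expect from a left outer join, where an unmatched left row
is padded with NULLs.

```java
import java.util.ArrayList;
import java.util.List;

public class LeftCorrelateSketch {
    // Hypothetical table function (an assumption for illustration):
    // emits one (letter, length) row per character of the word.
    static List<Object[]> split(String word) {
        List<Object[]> rows = new ArrayList<>();
        for (char c : word.toCharArray()) {
            rows.add(new Object[] {String.valueOf(c), word.length()});
        }
        return rows;
    }

    // The condition from the query: frequency = length OR length < 5.
    // A null length (a padded row) is treated as "not matching".
    static boolean condition(int frequency, Integer length) {
        return length != null && (frequency == length || length < 5);
    }

    static String row(String word, Object letter, Object length) {
        return word + "," + letter + "," + length;
    }

    // Plan shape from the mail: left correlate first, Filter on top.
    static List<String> correlateThenFilter(String word, int frequency) {
        List<Object[]> right = split(word);
        if (right.isEmpty()) {
            // The left correlate pads a row when the function emits nothing.
            right.add(new Object[] {null, null});
        }
        List<String> out = new ArrayList<>();
        for (Object[] r : right) {
            if (condition(frequency, (Integer) r[1])) {
                out.add(row(word, r[0], r[1]));
            }
        }
        return out;
    }

    // Expected left outer join: the condition is part of the join, and a
    // left row without any match is kept and padded with nulls.
    static List<String> leftOuterJoin(String word, int frequency) {
        List<String> out = new ArrayList<>();
        for (Object[] r : split(word)) {
            if (condition(frequency, (Integer) r[1])) {
                out.add(row(word, r[0], r[1]));
            }
        }
        if (out.isEmpty()) {
            out.add(row(word, null, null));
        }
        return out;
    }

    public static void main(String[] args) {
        // "calcite" has length 7 and frequency 3, so no function row matches.
        System.out.println(correlateThenFilter("calcite", 3));
        System.out.println(leftOuterJoin("calcite", 3));
    }
}
```

For a word where at least one function row satisfies the condition, both
methods return the same rows, which is why the plan looks fine for inner
joins and for left rows that happen to match.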
We are not sure whether these are real bugs or the intended behavior.
Could someone give a more authoritative explanation of these two problems?
Thanks,
Xingcan