Hi, folks
I have a question about org.apache.calcite.rel.RelNode#getVariablesSet.
The Javadoc says it returns the variables that are set by the current node:
/**
 * Returns the variables that are set in this relational
 * expression but also used and therefore not available to parents of this
 * relational expression.
 *
 * <p>Note: only {@link org.apache.calcite.rel.core.Correlate} should set
 * variables.
 *
 * @return Names of variables which are set in this relational
 * expression
 */
Set<CorrelationId> getVariablesSet();
But I've got a plan where a node returns all variables used by its child
nodes, regardless of whether each variable is set by the current node or by
a parent node.
The original query is:
SELECT *
FROM t1 as "outer"
WHERE a > (
  SELECT COUNT(*)
  FROM t1 as "inner"
  WHERE "inner".a IN (
    SELECT *
    FROM table(system_range("inner".a, "inner".b + "outer".b))
  )
)
After SQL-to-Rel translation I got the following plan:
LogicalProject(A=[$2], B=[$3], C=[$4], D=[$5], E=[$6])
  LogicalFilter(condition=[>($2, $SCALAR_QUERY({
LogicalAggregate(group=[{}], COUNT(*)=[COUNT()])
  LogicalFilter(condition=[IN($2, {
LogicalProject(X=[$0])
  LogicalTableFunctionScan(invocation=[SYSTEM_RANGE($cor0.A, +($cor0.B, $cor2.B))], rowType=[RecordType(BIGINT X)])
})], variablesSet=[[$cor0]])
    LogicalTableScan(table=[[PUBLIC, T1]])
}))], variablesSet=[[$cor2]])
    LogicalTableScan(table=[[PUBLIC, T1]])
Each LogicalFilter introduces its own correlation variable, and everything is
OK so far.
But then I apply SubQueryRemoveRule, and the new plan looks like this:
LogicalProject(A=[$2], B=[$3], C=[$4], D=[$5], E=[$6])
  LogicalProject(_KEY=[$0], _VAL=[$1], A=[$2], B=[$3], C=[$4], D=[$5], E=[$6])
    LogicalFilter(condition=[>($2, $7)])
      LogicalCorrelate(correlation=[$cor2], joinType=[left], requiredColumns=[{3}])
        LogicalTableScan(table=[[PUBLIC, T1]])
        LogicalAggregate(group=[{}], COUNT(*)=[COUNT()])
          LogicalProject(_KEY=[$0], _VAL=[$1], A=[$2], B=[$3], C=[$4], D=[$5], E=[$6])
            LogicalJoin(condition=[=($2, $7)], joinType=[inner])
              LogicalTableScan(table=[[PUBLIC, T1]])
              LogicalAggregate(group=[{0}])
                LogicalProject(X=[$0])
                  LogicalTableFunctionScan(invocation=[SYSTEM_RANGE($cor0.A, +($cor0.B, $cor2.B))], rowType=[RecordType(BIGINT X)])
At this point LogicalJoin.getVariablesSet() returns both the "cor0" and "cor2"
variables, which doesn't seem right.
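To make the distinction concrete, here is a toy model in plain Java. It is
not Calcite code: Node, setsHere, usesHere, usedBelow and buildJoin are names
made up for this illustration, and it assumes that the join is the node that
defines $cor0 while $cor2 belongs to the enclosing LogicalCorrelate. Projects
and aggregates are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class VariablesSetSketch {

    /** Hypothetical stand-in for a plan node; this is not Calcite's RelNode. */
    static final class Node {
        final String name;
        final Set<String> setsHere = new TreeSet<>();  // variables this node defines
        final Set<String> usesHere = new TreeSet<>();  // variables its own expressions reference
        final List<Node> inputs = new ArrayList<>();

        Node(String name, Node... inputs) {
            this.name = name;
            this.inputs.addAll(List.of(inputs));
        }

        Node sets(String variable) {
            setsHere.add(variable);
            return this;
        }

        Node uses(String... variables) {
            usesHere.addAll(List.of(variables));
            return this;
        }
    }

    /** Every variable referenced by the node itself or by anything below it. */
    static Set<String> usedBelow(Node node) {
        Set<String> used = new TreeSet<>(node.usesHere);
        for (Node input : node.inputs) {
            used.addAll(usedBelow(input));
        }
        return used;
    }

    /** Builds the LogicalJoin subtree of the second plan in simplified form. */
    static Node buildJoin() {
        Node functionScan = new Node("LogicalTableFunctionScan").uses("$cor0", "$cor2");
        Node innerScan = new Node("LogicalTableScan");
        // Assumption: the join is the node that defines $cor0; $cor2 is defined higher up.
        return new Node("LogicalJoin", innerScan, functionScan).sets("$cor0");
    }

    public static void main(String[] args) {
        Node join = buildJoin();
        System.out.println("set by the join itself: " + join.setsHere);
        System.out.println("used in its subtree:    " + usedBelow(join));
    }
}
```

Under these assumptions the first line prints [$cor0] and the second prints
[$cor0, $cor2]. From my reading of the Javadoc I would expect
getVariablesSet() on the join to correspond to the first set, but what I
observe corresponds to the second.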
Is this behaviour expected, or is it a bug?
--
Regards,
Konstantin Orlov