The scoping rule is very simple - a relational expression can only see its input relational expression(s). It can’t see the input of its input, or its sibling. So $n is the n’th column of the input.
For instance, the DEPTNO field might be $3 coming out of a Scan of the EMP table, would still be $3 after passing through a Filter, and might be $1 after passing through a Project. A Join has two inputs, so you imagine the input columns laid side by side. If the left has 3 and the right has 4, then the valid references are $0, $1, $2 reading from the left input and $3, $4, $5, $6 reading from the right. You need to know how each operator maps the fields in the input row(s) to an output row. * Most operators (Filter, Sort, Exchange, SemiJoin), output a row that has the same fields as the input. * The set operators (Union, Intersect, Minus) output a row that matches their first input (in fact they require that all inputs have the same row type). * Join outputs the fields from the left input followed by the fields from the right input. Some fields become nullable if it is a left, right or full outer join. Correlate is similar to Join. * The leaf operators, Scan and Values, don’t have an input but they just output their data. * Project outputs one column for each expression. * Aggregate outputs the group keys, followed optionally by the indicators, followed by the aggregate functions. Expressions are also allowed to read correlating variables, called $cor0, $cor1 etc. which are assigned by a relational expression elsewhere in the graph. But they are an advanced topic that don’t crop up in many queries. Julian > On Jan 21, 2016, at 7:30 PM, Homer <[email protected]> wrote: > > Hi, > > When using calcite to generate a plan I am trying to understand scope of the > $n variables > > I have a simple table > > tab 3 table is a super simple table > > i1 integer > i2 integer > t3 text > t4 text > > if I do > > select i1 as fred from tab3; > > I get > > LogicalProject(fred=[$0]) > LogicalTableScan(table=[[CATALOG, mapd, tab3]]) > > if I look at my catalog layout I see that i1 is the first column in the > catalog for this table thus the $0. make sense. > > If I give calcite a kind of silly query for testing and trying to understand > the scoping of the result it expects me to execute it gets a little more > confusing > > select count(*) from tab3 as alias1 where (select max(alias2.i1) from tab3 > alias2, tab3 as alias3 where alias2.i1 = alias3.i1) > alias1.i1; > > the execution plan I get back looks like this: > > LogicalAggregate(group=[{}], fred=[COUNT()]) > LogicalProject($f0=[0]) > LogicalFilter(condition=[>($5, $0)]) > LogicalJoin(condition=[true], joinType=[left]) > LogicalTableScan(table=[[CATALOG, mapd, tab3]]) > LogicalAggregate(group=[{}], EXPR$0=[MAX($0)]) > LogicalProject(i1=[$0]) > LogicalFilter(condition=[=($0, $5)]) > LogicalJoin(condition=[true], joinType=[inner]) > LogicalTableScan(table=[[CATALOG, mapd, tab3]]) > LogicalTableScan(table=[[CATALOG, mapd, tab3]]) > > My question is about understanding the mapping and scoping of the $n > variables in the plan > > The plan uses $0 and $5 in various different scopes here, initially in this > part > > LogicalProject(i1=[$0]) > LogicalFilter(condition=[=($0, $5)]) > LogicalJoin(condition=[true], joinType=[inner]) > LogicalTableScan(table=[[CATALOG, mapd, tab3]]) > LogicalTableScan(table=[[CATALOG, mapd, tab3]]) > > I understand it as the two scans are of the same table, so $0 for alias2.i1 > $5 for the second table alias3.i1. > > we then move into my shakey ground > > LogicalAggregate(group=[{}], fred=[COUNT()]) > LogicalProject($f0=[0]) > LogicalFilter(condition=[>($5, $0)]) > LogicalJoin(condition=[true], joinType=[left]) > LogicalTableScan(table=[[CATALOG, mapd, tab3]]) > LogicalAggregate(group=[{}], EXPR$0=[MAX($0)]) > > > We see another join here referring to $5 and $0 but these are different $5 > and $0. In this scope I think $5 is referring to the MAX() projection from > below and $0 is referring to alias1.i1 > > I am wondering what is the best way to consider the scoping of these $n > placeholders. What is the best way to think of this model? At what point > does $0 and $5 move to the new meaning? > > thanks > Homer > > > > > > >
