Ok this makes sense, it's not really a bug, then we fall into the option 2.
When you mention column pruning is there a specific rule TransformationRule you have in mind? https://github.com/apache/calcite/tree/main/core/src/main/java/org/apache/calcite/rel/rules On Tue, Oct 3, 2023 at 3:59 PM Julian Hyde <[email protected]> wrote: > Well, it’s not really “there”. It is removed by the final project. If you > want to remove it earlier, do column pruning. > > (We aren’t aggressive with column pruning because it will introduce extra > projects into the plan, therefore make the plan more complex, and increase > the planner’s search space.) > > When we generate SQL, most people would prefer to see ’SELECT *’ in the > intermediate subqueries rather than a more focused list of columns, because > the ’SELECT *’ form is shorter. We assume that the target SQL system will > do column pruning and therefore the queries are equivalent. > > > On Oct 3, 2023, at 3:53 PM, Guillaume Masse < > [email protected]> wrote: > > > > It's because column2 is a column I created artificially to allow the > > validation of a query. I normally perform a transformation (RelNode => > > RelNode) that removes any reference to that column. The problem is that > > when I convert this Logical Plan back to SQL, it contains references to > > that column and the execution engine should not see that column and > throws > > an error. > > > > LogicalProject(column1=[$1], column10=[$4]) > > LogicalJoin(condition=[=($2, $4)], joinType=[inner]) > > LogicalProject(column2=[$0], column1=[$1], $f2=[LOWER($1)]) > > LogicalTableScan(table=[[t1]]) > > LogicalTableScan(table=[[t2]]) > > > > My problem is the added field column2=[$0] in the LogicalProject. Since > > it's not contributing to the query, it should not be there. > > > > On Tue, Oct 3, 2023 at 3:24 PM Julian Hyde <[email protected]> > wrote: > > > >> Are you worried that $3 in the first plan has become $4 in the second > plan? > >> > >> That’s easily explained because the intermediate table has an extra > >> column, LOWER($1). That pushes up the offset of all columns coming from > the > >> right-hand table. > >> > >>> On Oct 3, 2023, at 3:14 PM, Guillaume Masse < > >> [email protected]> wrote: > >>> > >>> Hi All, > >>> > >>> I have two tables: t1 and t2 both have a column1 of type t and a > column2 > >>> that I'm not using > >>> I have a function f: t -> t defined in the catalog > >>> > >>> After validation if I print the logical plan I get the following > result: > >>> > >>> select > >>> t1.column1, > >>> t2.column1 > >>> from company_data."1" t1 > >>> join company_data."2" t2 > >>> on t1.column1 = t2.column1 > >>> > >>> LogicalProject(column1=[$1], column10=[$3]) > >>> LogicalJoin(condition=[=($1, $3)], joinType=[inner]) > >>> LogicalTableScan(table=[[t1]]) > >>> LogicalTableScan(table=[[t2]]) > >>> > >>> > >>> select > >>> t1.column1, > >>> t2.column1 > >>> from company_data."1" t1 > >>> join company_data."2" t2 > >>> on lower(t1.column1) = t2.column1 > >>> > >>> LogicalProject(column1=[$1], column10=[$4]) > >>> LogicalJoin(condition=[=($2, $4)], joinType=[inner]) > >>> LogicalProject(column2=[$0], column1=[$1], $f2=[LOWER($1)]) > >>> LogicalTableScan(table=[[t1]]) > >>> LogicalTableScan(table=[[t2]]) > >>> > >>> > >>> I'm a bit surprise by the second logical plan because it projects > >> column2. > >>> > >>> 1) I'm wondering if this is a bug. > >>> > >>> 2) If it's not a bug, let's say it was injected by the validator to > >>> simplify the validation process, is there a way to remove it? It feels > a > >>> bit weird, because I would have to rewrite all the offsets when I'm > >>> reconstructing the tree: > >>> > >>> LogicalProject(column1=[$0], column10=[$3]) > >>> LogicalJoin(condition=[=($1, $3)], joinType=[inner]) > >>> LogicalProject(column1=[$0], $f2=[LOWER($1)]) > >>> LogicalTableScan(table=[[t1]]) > >>> LogicalTableScan(table=[[t2]]) > >>> > >>> > >>> Thank You! > >>> > >>> -- > >>> Guillaume Massé > >>> [Gee-OHM] > >>> (马赛卫) > >> > >> > >
