Maybe you could use RelFieldTrimmer to do column pruning.

Best,
LakeShen
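
For readers unfamiliar with field trimming, here is a minimal, self-contained sketch of the idea in plain Java. It is not Calcite code and not how RelFieldTrimmer is implemented. The class and method names (ColumnPruningSketch, requiredByJoin, trimProject) are hypothetical and exist only for illustration. The sketch assumes the required columns are computed top-down: COUNT() with no arguments reads nothing, the join adds the columns of its own condition, and each projection then keeps only what is required. In Calcite itself, the class to look at is org.apache.calcite.sql2rel.RelFieldTrimmer and its trim(RelNode) method.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy model of top-down column pruning (hypothetical names, not the Calcite API).
class ColumnPruningSketch {

    // Columns a join needs from its inputs: whatever its parent reads,
    // plus the columns referenced by its own join condition.
    public static Set<String> requiredByJoin(Set<String> requiredByParent,
                                             List<String> conditionColumns) {
        Set<String> required = new LinkedHashSet<>(requiredByParent);
        required.addAll(conditionColumns);
        return required;
    }

    // Keeps only the projected columns that the parent actually reads,
    // preserving the original projection order.
    public static List<String> trimProject(List<String> projected,
                                           Set<String> requiredByParent) {
        List<String> kept = new ArrayList<>();
        for (String column : projected) {
            if (requiredByParent.contains(column)) {
                kept.add(column);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // COUNT() takes no arguments, so the aggregate reads no columns from the join.
        Set<String> neededByAggregate = Set.of();
        // The join condition A.c_custkey = B.lo_suppkey needs one column from each side.
        Set<String> neededByJoin =
            requiredByJoin(neededByAggregate, List.of("C_CUSTKEY", "LO_SUPPKEY"));

        // Projection on CUSTOMER from the plan in this thread.
        System.out.println(trimProject(List.of("C_CUSTKEY", "C_REGION"), neededByJoin));
        // Projection on LINEORDER from the plan in this thread.
        System.out.println(trimProject(
            List.of("LO_SUPPKEY", "LO_SHIPMODE", "LO_COMMITDATE"), neededByJoin));
    }
}
```

Under these assumptions, C_REGION, LO_SHIPMODE, and LO_COMMITDATE are dropped because nothing above the projections reads them, which is the result the original question expects.
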

On Mon, Sep 11, 2023 at 19:13, GLeonSun <gleon...@163.com> wrote:

> Hi everyone,
>
> Column pruning is not applied to SQL like the following. Is this
> behavior a known defect?
>
> —
> select
>   count(1)
> from
>   (
>     select
>       c_custkey,
>       C_REGION
>     from
>       SSB_STEP.CUSTOMER
>   ) as A
>   left join (
>     select
>       lo_suppkey + 1 as lo_suppkey,
>       LO_SHIPMODE,
>       LO_COMMITDATE
>     from
>       SSB_STEP.LINEORDER
>   ) as B on A.c_custkey = B.lo_suppkey
> LIMIT
>   500
> —
>
> The logical plan is shown below. At some point during optimization,
> RelBuilder#aggregate(GroupKey, Iterable<AggCall>) removes the ProjectRel
> that contains no fields. As a result, ProjectTransposeRule cannot fire
> between LogicalProject and LogicalJoin to prepare for the subsequent
> ProjectMergeRule, so a few extra columns are queried: C_REGION,
> LO_SHIPMODE, and LO_COMMITDATE. Semantically these columns do not need
> to be queried, so this is not the right behavior, is it?
>
> I have found that a version of Calcite that does not include the removal
> of the ProjectRel (empty fields) does prune these columns. Please let me
> know whether this is a known defect and whether it will be fixed in the
> future.
>
> —
> LogicalSort(fetch=[500])
>   LogicalAggregate(group=[{}], EXPR$0=[COUNT()])
>     LogicalJoin(condition=[=($0, $2)], joinType=[left])
>       LogicalProject(C_CUSTKEY=[$0], C_REGION=[$5])
>         OlapTableScan(table=[[SSB_STEP, CUSTOMER]], ctx=[], fields=[[0, 1, 2, 3, 4, 5, 6, 7]])
>       LogicalProject(LO_SUPPKEY=[+($4, 1)], LO_SHIPMODE=[$16], LO_COMMITDATE=[$15])
>         OlapTableScan(table=[[SSB_STEP, LINEORDER]], ctx=[], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]])
> —
>
> Best Regards