I think this is a current limitation of FieldTrimmer. The Join and Filter nodes can't drop columns (since they don't carry column-selection information), and the trimmer currently doesn't add Project nodes. I have worked around this limitation by using HepPlanner with various ProjectTranspose rules.
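To make the bookkeeping concrete, here is a toy sketch in plain Java. It is not Calcite's API: the class `TrimSketch` and both method names are made up for illustration, and it only models sets of field indices, using the left side of the join in the plan quoted below. The input of the filter must supply every field used above it plus the field its condition reads, but the filter has no way to drop the condition-only field afterwards. Only a Project sitting on top of the filter could do that.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy model, not Calcite code: it tracks field indices only.
public class TrimSketch {

    /** Fields the filter's input must supply: those used above plus those the condition reads. */
    static Set<Integer> neededBelowFilter(Set<Integer> usedAbove, Set<Integer> usedByCondition) {
        Set<Integer> needed = new TreeSet<>(usedAbove);
        needed.addAll(usedByCondition);
        return needed;
    }

    /** Positions in the filter's (already trimmed) output that are still used above it. */
    static List<Integer> keepAfterFilter(Set<Integer> neededBelow, Set<Integer> usedAbove) {
        List<Integer> ordered = new ArrayList<>(new TreeSet<>(neededBelow));
        List<Integer> keep = new ArrayList<>();
        for (int i = 0; i < ordered.size(); i++) {
            if (usedAbove.contains(ordered.get(i))) {
                keep.add(i);
            }
        }
        return keep;
    }

    public static void main(String[] args) {
        // Left side of the join in the original plan: the join key is $1, the top project reads $4.
        Set<Integer> usedAbove = new TreeSet<>(List.of(1, 4));
        // The filter condition <($2, 10) reads $2.
        Set<Integer> usedByCondition = Set.of(2);

        Set<Integer> below = neededBelowFilter(usedAbove, usedByCondition);
        System.out.println(below);                             // [1, 2, 4]
        System.out.println(keepAfterFilter(below, usedAbove)); // [0, 2]
    }
}
```

The first set corresponds to the Project the trimmer places under the filter, and the second to the `$0` and `$2` that a Project above the filter would keep if one existed for the trimmer to narrow.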
I think you could work around this by always inserting trivial projects over every node in the tree before trimming, and then clean up with ProjectRemoveRule.

On Tue, Mar 4, 2025 at 1:33 PM Ian Bertolacci <ian.bertola...@workday.com.invalid> wrote:

> I’m looking at using RelFieldTrimmer, and I’m noticing that if a side of a
> join has unnecessary fields after a filter, there is no trim-fields project
> on that side to reduce the width of the row.
> Is this expected, or is there a configuration or pre-processing step that
> I am missing?
>
> For example, starting with this tree (these all look better in monospace,
> hopefully the formatting comes through)
> 4:Project(C5633_14509=[$4], C5633_486=[$8])
> └── 3:Join(condition=[=($1, $6)], joinType=[inner])
> ....├── 1:Filter(condition=[<($2, 10)])
> ....│...└── 0:TableScan(table=[T902], Schema=[...6 fields...])
> ....└── 2:TableScan(table=[T895], Schema=[...64 fields...])
>
> The result of RelFieldTrimmer is this:
> 9:Project(C5633_14509=[$2], C5633_486=[$4])
> └── 8:Join(condition=[=($0, $3)], joinType=[inner])
> ....├── 6:Filter(condition=[<($1, 10)])
> ....│...└── 5:Project(C5633_14505=[$1], C5633_14506=[$2], C5633_14509=[$4])
> ....│.......└── 0:TableScan(table=[T902], Schema=[...6 fields...])
> ....└── 7:Project(ID=[$0], C5633_486=[$2])
> ........└── 2:TableScan(table=[T895], Schema=[...64 fields...])
>
> Notice: $1 on the LHS of the node is not used *after* the filter, so a
> projection of only the $0 and $2 fields would reduce the width of the
> row before the join.
>
> However, I can force the insertion of a projection which is simply the
> identity (i.e., projecting all fields of the input row with no additions or
> subtractions):
> 5:Project(C5633_14509=[$4], C5633_486=[$8])
> └── 4:Join(condition=[=($1, $6)], joinType=[inner])
> ....├── 2:Project(...Identity mapping, 6 fields...)
> ....│...└── 1:Filter(condition=[<($2, 10)])
> ....│.......└── 0:TableScan(table=[T902], Schema=[...6 fields...])
> ....└── 3:TableScan(table=[T895], Schema=[...64 fields...])
>
> And the result is a projection which only has the 2 fields necessary after
> the filter.
> 11:Project(C5633_14509=[$1], C5633_486=[$3])
> └── 10:Join(condition=[=($0, $2)], joinType=[inner])
> ....├── 8:Project(C5633_14505=[$0], C5633_14509=[$2]) <- trimmed
> ....│...└── 7:Filter(condition=[<($1, 10)])
> ....│.......└── 6:Project(C5633_14505=[$1], C5633_14506=[$2], C5633_14509=[$4])
> ....│...........└── 0:TableScan(table=[T902], Schema=[...6 fields...])
> ....└── 9:Project(ID=[$0], C5633_486=[$2])
> ........└── 3:TableScan(table=[T895], Schema=[...64 fields...])
>
> Thanks!
> -Ian