I think the goal is not to preserve column names in all RelNodes when a plan is rewritten; this is not even well-defined, since there is no 1-1 correspondence between Rel nodes in a new plan and in an old plan.
The goal is for every rewrite rule to produce a plan with the exact same output ROW type, including column names. I think this goal is certainly achievable; there should be an assert after rule application that this holds. At the very least this can be done by inserting a Project which renames columns to their original names.

This is similar to https://issues.apache.org/jira/browse/CALCITE-7058 "Decorrelator may produce different column names". I think that this goal can even be achieved for the decorrelator itself.

Mihai

________________________________
From: Julian Hyde <jhyde.apa...@gmail.com>
Sent: Wednesday, June 25, 2025 7:59 AM
To: dev@calcite.apache.org <dev@calcite.apache.org>
Subject: Re: [DISCUSS] Preserving Output Alias Names After RelNode Optimization

Preserving column names through the optimization process, with many rewrite rules being applied, is very hard if not impossible.

Instead I would approach this as the RelToSqlConverter does, and try to produce the best/most concise/most human-readable SQL possible given a RelRoot. Flattening the subquery that is generated to project/rename the output columns of a Sort seems a reasonable thing for RelToSqlConverter to do.

My approach would be to write a unit test and torture the code until it passes. :)

> On Jun 25, 2025, at 3:07 AM, Yanjing Wang <zhuangzixiao...@gmail.com> wrote:
>
> Hi all,
>
> I'd like to discuss a challenge regarding the preservation of column
> aliases after RelNode optimization in Apache Calcite. Let me outline the
> specific problem and potential approaches.
>
> Problem Statement:
>
> Currently, when applying RelNode optimization rules, Calcite doesn't
> preserve the original output column aliases. While using RelRoot seems
> like a potential solution, it introduces complications when the optimal
> RelNode is a Sort or similar node. Consider this scenario with a best rel:
>
> `SELECT column1, ... FROM ... ORDER BY column1 DESC`
>
> If we try to preserve aliases using RelRoot, it might generate SQL like:
>
> `SELECT column1 AS alias1, ... FROM ( SELECT ... ORDER BY column1 DESC ) t1`
>
> This transformation can break the ORDER BY clause functionality in many
> compute engines.
>
> Questions for Discussion:
>
> 1. Is there an existing mature solution in Calcite for maintaining output
>    alias consistency after optimization?
> 2. If not, what would be the recommended approach when dealing with Sort
>    nodes (or similar RelNodes) that could be affected by RelRoot-based
>    alias preservation?
> 3. If we implement a new solution, should we introduce additional
>    RelOptRules to optimize the resulting RelNode structure? This would
>    ensure we maintain both alias consistency and query performance.
>
> I'd appreciate your thoughts and suggestions on this matter, especially
> from those who have encountered similar challenges.
>
> Best regards,
> Yanjing Wang
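
The two mechanisms proposed at the top of this thread (an assert that a rewrite keeps the output column names, and a fallback Project that renames columns back) can be illustrated with a toy model. This is only a sketch under strong assumptions: a plan is reduced to the ordered list of its output field names, field types are ignored, and `OutputNames`, `sameOutputNames` and `renamingProject` are hypothetical names that are not part of Calcite. In Calcite itself the same role would presumably be played by comparing row types and by a renaming projection built with something like `RelBuilder`.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (not Calcite API): a plan is reduced to the ordered field names of its output row. */
final class OutputNames {
  private OutputNames() {}

  /** The condition a post-rule assert could check: same arity, same names, same order. */
  static boolean sameOutputNames(List<String> before, List<String> after) {
    return before.equals(after);
  }

  /**
   * Builds the select list of a renaming projection placed on top of the rewritten plan.
   * Columns are matched by position; an alias is emitted only where the name changed.
   */
  static List<String> renamingProject(List<String> original, List<String> rewritten) {
    if (original.size() != rewritten.size()) {
      throw new IllegalArgumentException(
          "a rewrite must not change the number of output columns");
    }
    List<String> items = new ArrayList<>();
    for (int i = 0; i < original.size(); i++) {
      String from = rewritten.get(i);
      String to = original.get(i);
      // Keep the column untouched when the rewrite already preserved its name.
      items.add(from.equals(to) ? from : from + " AS " + to);
    }
    return items;
  }
}
```

For the example in the original message, `OutputNames.renamingProject(List.of("alias1"), List.of("column1"))` yields the single item `column1 AS alias1`. Whether that projection ends up as a wrapping subquery or is merged into the `SELECT` list above the `ORDER BY` is the SQL-generation question raised earlier in the thread; this sketch does not address it.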