I think the goal is not to preserve column names in all RelNodes when a plan is 
rewritten; this is not even well-defined, since there is no 1-1 correspondence 
between the RelNodes of the new plan and those of the old plan.

The goal is for every rewrite rule to produce a plan with the exact same output 
ROW type, including column names. I think this goal is certainly achievable; 
there should be an assert after rule application that this holds. At the very 
least this can be done by inserting a Project which renames columns to their 
original names.
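
A minimal sketch of that post-rule check and the renaming fallback, written as a toy model rather than against Calcite's API. `RowTypeGuard`, `Plan`, and `applyPreservingNames` are hypothetical names invented for illustration, and the model tracks only column names, whereas a real row type also carries column types.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

/** Toy model only: hypothetical stand-ins, not Calcite's RelNode API. */
final class RowTypeGuard {

  /** A plan node reduced to a label, its output column names, and an optional input. */
  static final class Plan {
    final String op;
    final List<String> fieldNames;
    final Plan input; // null for a leaf

    Plan(String op, List<String> fieldNames, Plan input) {
      this.op = op;
      this.fieldNames = new ArrayList<>(fieldNames);
      this.input = input;
    }
  }

  /** Applies a rewrite and guarantees the result exposes the original column names. */
  static Plan applyPreservingNames(Plan before, UnaryOperator<Plan> rule) {
    Plan after = rule.apply(before);
    // A rule may lose column names, but it must not change the column count.
    if (after.fieldNames.size() != before.fieldNames.size()) {
      throw new IllegalStateException("rule changed the number of output columns");
    }
    if (!after.fieldNames.equals(before.fieldNames)) {
      // Fallback described above: a Project that only renames columns back.
      after = new Plan("Project(rename)", before.fieldNames, after);
    }
    // The check that should hold after every rule application.
    if (!after.fieldNames.equals(before.fieldNames)) {
      throw new AssertionError("output column names differ after rule application");
    }
    return after;
  }
}
```

In this model a rule that already keeps the names is returned untouched, so the extra Project appears only when a rule actually lost them.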

This is similar to https://issues.apache.org/jira/browse/CALCITE-7058 
"Decorrelator may produce different column names". I think that this goal can 
even be achieved for the decorrelator itself.

Mihai


________________________________
From: Julian Hyde <jhyde.apa...@gmail.com>
Sent: Wednesday, June 25, 2025 7:59 AM
To: dev@calcite.apache.org <dev@calcite.apache.org>
Subject: Re: [DISCUSS] Preserving Output Alias Names After RelNode Optimization

Preserving column names through the optimization process, and many rewrite 
rules being applied, is very hard if not impossible.

Instead I would approach this as the RelToSqlConverter does, and try to produce 
the best/most concise/most human-readable SQL possible given a RelRoot. 
Flattening the subquery that is generated to project/rename the output columns 
of a Sort seems a reasonable thing for RelToSqlConverter to do. My 
approach would be to write a unit test and torture the code until it passes. :)
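
To make the flattening idea concrete, here is a toy string-building sketch, not the RelToSqlConverter implementation. `RenameOverSortSketch` and its methods are hypothetical names, and the flattened form assumes the target dialect lets ORDER BY refer to a select-list alias; a dialect without that would need a different sort key.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy SQL generator; hypothetical, not Calcite's RelToSqlConverter. */
final class RenameOverSortSketch {

  /** Builds "col AS alias" items, omitting AS when the alias equals the column. */
  static String selectList(List<String> columns, List<String> aliases) {
    List<String> items = new ArrayList<>();
    for (int i = 0; i < columns.size(); i++) {
      String col = columns.get(i);
      String alias = aliases.get(i);
      items.add(col.equals(alias) ? col : col + " AS " + alias);
    }
    return String.join(", ", items);
  }

  /** Naive form: the rename wraps the sorted query in a subquery. */
  static String nested(String table, List<String> columns, List<String> aliases,
      int sortIndex, boolean descending) {
    String inner = "SELECT " + String.join(", ", columns) + " FROM " + table
        + " ORDER BY " + columns.get(sortIndex) + (descending ? " DESC" : "");
    return "SELECT " + selectList(columns, aliases) + " FROM (" + inner + ") t1";
  }

  /** Flattened form: one SELECT, with ORDER BY kept outermost (assumes alias sort keys work). */
  static String flattened(String table, List<String> columns, List<String> aliases,
      int sortIndex, boolean descending) {
    return "SELECT " + selectList(columns, aliases) + " FROM " + table
        + " ORDER BY " + aliases.get(sortIndex) + (descending ? " DESC" : "");
  }
}
```

A unit test in the spirit described above would assert that a pure rename over a Sort yields the flattened string, with the ORDER BY in the outermost query.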

> On Jun 25, 2025, at 3:07 AM, Yanjing Wang <zhuangzixiao...@gmail.com> wrote:
>
> Hi all,
>
> I'd like to discuss a challenge regarding the preservation of column
> aliases after RelNode optimization in Apache Calcite. Let me outline the
> specific problem and potential approaches.
>
> Problem Statement:
>
> Currently, when applying RelNode optimization rules, Calcite doesn't
> preserve the original output column aliases. While using RelRoot seems like
> a potential solution, it introduces complications when the optimal RelNode
> is a Sort or similar node.
>
> Consider this scenario with a best rel:
>
> ` SELECT column1, ... FROM ... ORDER BY column1 DESC `
>
> If we try to preserve aliases using RelRoot, it might generate SQL like:
>
> ` SELECT column1 AS alias1, ... FROM ( SELECT ... ORDER BY column1 DESC ) t1 `
>
> This transformation can break the ORDER BY clause functionality in many
> compute engines.
>
> Questions for Discussion:
>
> 1. Is there an existing mature solution in Calcite for maintaining output
>    alias consistency after optimization?
> 2. If not, what would be the recommended approach when dealing with Sort
>    nodes (or similar RelNodes) that could be affected by RelRoot-based alias
>    preservation?
> 3. If we implement a new solution, should we introduce additional
>    RelOptRules to optimize the resulting RelNode structure? This would
>    ensure we maintain both alias consistency and query performance.
>
> I'd appreciate your thoughts and suggestions on this matter, especially
> from those who have encountered similar challenges.
>
> Best regards,
> Yanjing Wang
