[
https://issues.apache.org/jira/browse/CALCITE-7279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18046881#comment-18046881
]
krooswu commented on CALCITE-7279:
----------------------------------
*Analysis & Implementation Update:*
I have further investigated the compatibility issue between Calcite's generated
SQL and ClickHouse's identifier resolution logic.
*Root Cause:* ClickHouse has a restrictive scoping model for subqueries. When a
JOIN is wrapped in a subquery with an alias (e.g., {{{}(SELECT d1.deptno,
d2.loc FROM ...) AS j{}}}), the internal table qualifiers {{d1}} and {{d2}} are
stripped from the perspective of the outer query. However, unless the columns
are explicitly renamed using {{{}AS{}}}, ClickHouse often fails to map the
internal {{d2.loc}} to the external {{{}j.loc{}}}, leading to an
{{UNKNOWN_IDENTIFIER}} error.
*The Fix:* I am updating the implementation to use an *Explicit Aliasing*
strategy within the {{wrapNestedJoin}} logic. Instead of a simple wrapper, the
converter now ensures that every projected column in the nested join subquery
is explicitly aliased:
* *From:* {{(SELECT d1.deptno, d2.loc FROM ...) AS j}}
* *To:* {{(SELECT d1.deptno AS deptno, d2.loc AS loc FROM ...) AS j}}
*Key Benefits:*
# {*}Scoping Safety{*}: Guarantees that the outer query can always resolve
columns via the subquery alias {{{}j{}}}.
# {*}Performance{*}: Avoids redundant double-nesting (e.g., {{{}SELECT * FROM
(SELECT...){}}}) by flattening the projection into a single-layer subquery.
# {*}Dialect Integrity{*}: This behavior is isolated to the ClickHouse
dialect, ensuring no regressions or SQL bloat for standard dialects like MySQL
or PostgreSQL.
I will be updating the PR with comprehensive test cases covering aggregations,
filters, and multi-way joins to verify this behavior.
> [ClickHouse] Dialect does not generate required subquery wrappers for nested
> JOINs for ClickHouse
> -------------------------------------------------------------------------------------------------
>
> Key: CALCITE-7279
> URL: https://issues.apache.org/jira/browse/CALCITE-7279
> Project: Calcite
> Issue Type: Bug
> Affects Versions: 1.40.0
> Environment: *ClickHouse version:*
> 25.10.1.3832
> Reporter: Nikita Ilin
> Assignee: krooswu
> Priority: Major
> Labels: pull-request-available
>
> *Problem Description*
> When generating SQL for ClickHouse using {{{}ClickHouseSqlDialect{}}}, the
> {{RelToSqlConverter}} produces syntactically invalid SQL if the underlying
> {{RelNode}} plan contains nested JOINs.
> ClickHouse mandates that if an operand of a JOIN (especially the RHS) is
> itself a JOIN operation (including CROSS JOINs), it must be wrapped in an
> aliased subquery (e.g., {{{}SELECT * FROM (...) AS alias{}}}). Calcite
> currently does not generate this wrapper for the ClickHouse dialect.
> *Conceptual RelNode Structure:*
> {code:java}
> LogicalJoin(condition=[...], joinType=[left])
> LogicalTableScan(table=[FactTable])
> LogicalJoin(condition=[true], joinType=[inner]) <-- Nested JOIN on RHS
> LogicalAggregate(...)
> LogicalAggregate(...){code}
> *Actual Result (Invalid ClickHouse SQL):*
> {code:sql}
> SELECT ...
> FROM `FactTable`
> LEFT JOIN (
> (SELECT DISTINCT `x` FROM `Dim1`) AS `t0`
> CROSS JOIN
> (SELECT DISTINCT `y` FROM `Dim2`) AS `t1`
> ) ON ...{code}
>
> *Expected Result (Valid ClickHouse SQL):*
> {code:sql}
> SELECT ...
> FROM `FactTable`
> LEFT JOIN (
> -- Wrapper needed for ClickHouse compatibility
> SELECT * FROM (
> (SELECT DISTINCT `x` FROM `Dim1`) AS `t0`
> CROSS JOIN
> (SELECT DISTINCT `y` FROM `Dim2`) AS `t1`
> ) AS nested_alias
> ) AS `t_wrapped` ON ...{code}
>
> *Impact and Workarounds*
> This issue forces users to implement complex workarounds, such as:
> # Fragile regex post-processing of the SQL string.
> # Complex {{RelNode}} rewriting (e.g., injecting opaque UDFs via
> {{RelShuttle}} or identity aggregates) to force the generation of a subquery
> boundary.
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)