[
https://issues.apache.org/jira/browse/CALCITE-3593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16994006#comment-16994006
]
Julian Hyde commented on CALCITE-3593:
--------------------------------------
Calcite does know about this: see
[SqlConformance.isHavingAlias()|https://calcite.apache.org/apidocs/org/apache/calcite/sql/validate/SqlConformance.html#isHavingAlias()].
But I don't think we should try to be too clever when we generate SQL, because
we might be mistaken about the semantics of the target dialect. If there is
ambiguity (a column has the same name as an alias) then we should generate
something that produces the same result whatever the value of {{isHavingAlias}}.
> RelToSqlConverter changes target of ambiguous HAVING clause with a Project on
> Filter on Aggregate
> -------------------------------------------------------------------------------------------------
>
> Key: CALCITE-3593
> URL: https://issues.apache.org/jira/browse/CALCITE-3593
> Project: Calcite
> Issue Type: Bug
> Reporter: Steven Talbot
> Priority: Major
>
> Best shown with the shell of a test in RelToSqlConverter test.
> The following SQL on BigQuery
> {code:java}
> select product_id - 1000 as product_id
> from (
> select product_id, avg(gross_weight) as agw
> from (SELECT 1 as product_id, 70 as net_weight, 170 as gross_weight) as
> product
> where net_weight < 100
> group by product_id having product_id > 0){code}
> produces one result, because the having filter applies to the product id
> before subtraction, of course.
> Running it through the machinery in that test
> (`sql(query).withBigQuery().ok(expected)`) translates it to:
> {noformat}
> SELECT product_id - 1000 AS product_id
> from (SELECT 1 as product_id, 70 as net_weight, 170 as gross_weight) as
> product
> WHERE net_weight < 100
> GROUP BY product_id
> HAVING product_id > 0{noformat}
> This changes the meaning of the query: now the HAVING is on the
> after-subtraction product_id and you get no results, rather than the one
> result.
> Note that this is _not_ true in HyperSQL, as it has different semantics
> around the HAVING namespace.
> {noformat}
> select "product_id" - 1000 as "product_id"
> from (
> select "product_id", avg("gross_weight") as agw
> from (SELECT 1 as "product_id", 70 as "net_weight", 170 as "gross_weight"
> FROM (VALUES(0))) as product
> where "net_weight" < 100
> group by "product_id" having "product_id" > 0){noformat}
> becomes
> {noformat}
> SELECT "product_id" - 1000 AS "product_id"
> from (SELECT 1 as "product_id", 70 as "net_weight", 170 as "gross_weight"
> FROM (VALUES(0))) as product
> WHERE "net_weight" < 100
> GROUP BY "product_id"
> HAVING "product_id" > 0{noformat}
> But the meaning is preserved: both return a row.
> I'm not enough of a SQL standards expert to know which one is being more
> compliant, but presumably both would have to be supported via some sort of
> flag?
> I think the fix would be to force the subselect on dialects such as BigQuery
> that have this behavior. Probably something that looks a lot like
> [https://github.com/apache/calcite/blob/3530daaa8cad43aad6845b6c79e4bc1ca0e72f5f/core/src/main/java/org/apache/calcite/rel/rel2sql/SqlImplementor.java#L1043-L1047]
> The test, of course, looks like pretty silly SQL no one would ever write, but
> the point is this is what's generated when you have
> {noformat}
> Project f(x) as x
> Filter g(x)
> Aggregate {<x>}, ...{noformat}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)