[ 
https://issues.apache.org/jira/browse/CALCITE-5775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated CALCITE-5775:
------------------------------------
    Labels: pull-request-available  (was: )

> Null direction emulation broken for complex expressions on some dialects
> ------------------------------------------------------------------------
>
>                 Key: CALCITE-5775
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5775
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>            Reporter: Will Noble
>            Priority: Minor
>              Labels: pull-request-available
>
> This is a problem in BigQuery, and may be a problem in other dialects as 
> well. Consider the following piece of valid BQ SQL:
> {code:sql}
> SELECT REPEAT(first_name, 2),
>        COUNT(id)
> FROM looker_test.users
> GROUP BY REPEAT(first_name, 2)
> ORDER BY 1
> {code}
> Now consider a version where the {{ORDER BY}} clause is changed to this:
> {code:sql}
> ORDER BY REPEAT(first_name, 2)
> {code}
> This is logically the same query, because the expression in the {{ORDER}} 
> clause is the same as the one in the {{SELECT}} / {{GROUP}} clauses. BigQuery 
> is sophisticated enough to match the select to the group expression in both 
> queries, but cannot match either with the order expression. It gives this 
> error: _ORDER BY clause expression references column first_name which is 
> neither grouped nor aggregated_.
> So, when sorting by complex expressions in BQ, Calcite relies on either:
> * No null direction emulation required. 
> * sorting by alias or ordinal, which is a problem with current null direction 
> emulation because it adds an extra complex sort expression, or
> * having a query that just happens to also have the underlying field in the 
> {{GROUP BY}} clause by itself, which seems to actually happen pretty often in 
> my testing, but obviously shouldn't be a constraint.
> As I wrote that, I realized this may be easily fixable for BQ since it added 
> support for {{NULLS FIRST}} / {{LAST}} in 2020 and it seems Calcite has not 
> caught up yet. Consider this rel node:
> {code}
> LogicalSort(sort0=[$0], dir0=[ASC])
>   LogicalAggregate(group=[{0}], cent=[COUNT($1)])
>     LogicalProject($f0=[CASE(IS NULL($4), 0, 1)], MGR=[$3])
>       JdbcTableScan(table=[[JDBC_SCOTT, EMP]])
> {code}
> Calcite would convert it to this in BigQuery due to null direction emulation:
> {code:sql}
> SELECT CASE WHEN HIREDATE IS NULL THEN 0 ELSE 1 END AS `$f0`, COUNT(MGR) AS 
> cent
> FROM SCOTT.EMP
> GROUP BY CASE WHEN HIREDATE IS NULL THEN 0 ELSE 1 END
> ORDER BY CASE WHEN HIREDATE IS NULL THEN 0 ELSE 1 END IS NULL, 1
> {code}
> Which of course triggers the problem described above. This may be a problem 
> for MSSQL as well since it doesn't support {{NULLS LAST}}. The fix for BQ, at 
> least, may be to just support {{NULLS LAST}} and sort by ordinal.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to