[ 
https://issues.apache.org/jira/browse/CALCITE-5767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde updated CALCITE-5767:
---------------------------------
    Description: 
JDBC adapter for MSSQL adds a call to the {{GROUPING}} function to the {{ORDER 
BY}} clause twice when emulating {{NULLS LAST}}. This is a problem because 
MSSQL disallows duplicate sort keys; it is caused because the MSSQL dialect 
wrongly gives the {{GROUPING}} function special treatment when emulating NULL 
direction.

{{MssqlSqlDialect.emulateNullDirection}} has [special logic for {{GROUPING}} 
calls|https://github.com/apache/calcite/blob/814ae6ec09e72544ba010f2591e06020c55b162b/core/src/main/java/org/apache/calcite/sql/dialect/MssqlSqlDialect.java#L95].
 This seems to be an optimization attempt since {{GROUPING}} is known to never 
return {{{}NULL{}}}, and therefore needs no null direction emulation. However, 
this causes problems because:

1. MSSQL does not have the same default null direction as Calcite's 
{{RelBuilder}} (and most other dialects).
2. MSSQL does not support the common {{NULLS FIRST}} or {{NULLS LAST}} syntax.
3. MSSQL does not allow sorting on the same field twice, even though there is 
no theoretical issue with this.

Each of these properties must be present for the problem to occur, so it's a 
bit niche and specific to MSSQL. Seems like the best solution is to simply 
eliminate the special-case treatment for {{GROUPING}} in 
{{{}emulateNullDirection{}}}, which is currently creating problems due to 
property #3.

{*}More in-depth explanation{*}:

In {{{}RelBuilder.collation{}}}, we use the "default null direction" to insert 
rex nodes as sorting expressions, but this is only the default null direction 
for NULLS-high dialects, i.e. *not* MSSQL. This is a problem because MSSQL has 
special-case logic for emulating null direction of GROUPING calls, whereby it 
effectively duplicates the expression. Really, 
{{MssqlSqlDialect.emulateNullDirection}} probably should've been returning 
{{null}} instead, signalling to callers that no null-direction emulation is 
necessary because {{GROUPING}} never returns {{{}NULL{}}}, but this causes 
another problem due to property #2 above when the null direction is non-default 
as is caused simply by using {{RelBuilder.collation}} as described above (it 
should be noted that this method takes rex nodes instead of 
{{RelFieldCollation}} object, so there is no way to specify null direction) 
because the non-default null direction is not expanded into a {{CASE}} 
expression (MSSQL does not support {{NULLS FIRST}} or {{LAST}} syntax).

Here's a test illustrating the problem:

Input SQL (default dialect)
{code:xml}
select "product_class_id", "brand_name", GROUPING("brand_name")
from "product"
group by GROUPING SETS (("product_class_id", "brand_name"), 
("product_class_id"))
order by 3, 2, 1
{code}
Current behavior for unparsing as MSSQL (incorrect because it orders by the 
same column twice; {{GROUPING([brand_name])}} and {{{}3{}}}, which will fail if 
you try to actually run this against a real MSSQL database, even though it 
seems like it shouldn't):
{code:xml}
SELECT [product_class_id], [brand_name], GROUPING([brand_name])
FROM [foodmart].[product]
GROUP BY GROUPING SETS(([product_class_id], [brand_name]), [product_class_id])
ORDER BY
         GROUPING([brand_name]),
         3,
         CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END,
         [brand_name],
         CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END,
         [product_class_id]
{code}
Behavior where {{MssqlSqlDialect.emulateNullDirection}} simply returns {{null}} 
for {{GROUPING}} expressions (incorrect because it uses {{NULLS LAST}} syntax):
{code:xml}
...
ORDER BY
         3 NULLS LAST,
         CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END,
         [brand_name],
         CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END,
         [product_class_id]
{code}
Acceptable behavior (although the first {{{}ORDER BY{}}}-clause is effectively 
ordering by a constant, this will at least run and produce the correct results):
{code:xml}
...
ORDER BY
         CASE WHEN GROUPING([brand_name]) IS NULL THEN 0 ELSE 1 END,
         3,
         CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END,
         [brand_name],
         CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END,
         [product_class_id]
{code}

  was:
Calcite's MSSQL dialect should not give GROUPING special treatment when 
emulating NULL direction.

{{MssqlSqlDialect.emulateNullDirection}} has [special logic for {{GROUPING}} 
calls|https://github.com/apache/calcite/blob/814ae6ec09e72544ba010f2591e06020c55b162b/core/src/main/java/org/apache/calcite/sql/dialect/MssqlSqlDialect.java#L95].
 This seems to be an optimization attempt since {{GROUPING}} is known to never 
return {{{}NULL{}}}, and therefore needs no null direction emulation. However, 
this causes problems because:

1. MSSQL does not have the same default null direction as Calcite's 
{{RelBuilder}} (and most other dialects).
2. MSSQL does not support the common {{NULLS FIRST}} or {{NULLS LAST}} syntax.
3. MSSQL does not allow sorting on the same field twice, even though there is 
no theoretical issue with this.

Each of these properties must be present for the problem to occur, so it's a 
bit niche and specific to MSSQL. Seems like the best solution is to simply 
eliminate the special-case treatment for {{GROUPING}} in 
{{{}emulateNullDirection{}}}, which is currently creating problems due to 
property #3.

{*}More in-depth explanation{*}:

In {{{}RelBuilder.collation{}}}, we use the "default null direction" to insert 
rex nodes as sorting expressions, but this is only the default null direction 
for NULLS-high dialects, i.e. *not* MSSQL. This is a problem because MSSQL has 
special-case logic for emulating null direction of GROUPING calls, whereby it 
effectively duplicates the expression. Really, 
{{MssqlSqlDialect.emulateNullDirection}} probably should've been returning 
{{null}} instead, signalling to callers that no null-direction emulation is 
necessary because {{GROUPING}} never returns {{{}NULL{}}}, but this causes 
another problem due to property #2 above when the null direction is non-default 
as is caused simply by using {{RelBuilder.collation}} as described above (it 
should be noted that this method takes rex nodes instead of 
{{RelFieldCollation}} object, so there is no way to specify null direction) 
because the non-default null direction is not expanded into a {{CASE}} 
expression (MSSQL does not support {{NULLS FIRST}} or {{LAST}} syntax).

Here's a test illustrating the problem:

Input SQL (default dialect)
{code:xml}
select "product_class_id", "brand_name", GROUPING("brand_name")
from "product"
group by GROUPING SETS (("product_class_id", "brand_name"), 
("product_class_id"))
order by 3, 2, 1
{code}
Current behavior for unparsing as MSSQL (incorrect because it orders by the 
same column twice; {{GROUPING([brand_name])}} and {{{}3{}}}, which will fail if 
you try to actually run this against a real MSSQL database, even though it 
seems like it shouldn't):
{code:xml}
SELECT [product_class_id], [brand_name], GROUPING([brand_name])
FROM [foodmart].[product]
GROUP BY GROUPING SETS(([product_class_id], [brand_name]), [product_class_id])
ORDER BY
         GROUPING([brand_name]),
         3,
         CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END,
         [brand_name],
         CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END,
         [product_class_id]
{code}
Behavior where {{MssqlSqlDialect.emulateNullDirection}} simply returns {{null}} 
for {{GROUPING}} expressions (incorrect because it uses {{NULLS LAST}} syntax):
{code:xml}
...
ORDER BY
         3 NULLS LAST,
         CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END,
         [brand_name],
         CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END,
         [product_class_id]
{code}
Acceptable behavior (although the first {{{}ORDER BY{}}}-clause is effectively 
ordering by a constant, this will at least run and produce the correct results):
{code:xml}
...
ORDER BY
         CASE WHEN GROUPING([brand_name]) IS NULL THEN 0 ELSE 1 END,
         3,
         CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END,
         [brand_name],
         CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END,
         [product_class_id]
{code}


> JDBC adapter for MSSQL adds GROUPING to ORDER BY clause twice when emulating 
> NULLS LAST
> ---------------------------------------------------------------------------------------
>
>                 Key: CALCITE-5767
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5767
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>            Reporter: Will Noble
>            Assignee: Will Noble
>            Priority: Minor
>              Labels: pull-request-available
>
> JDBC adapter for MSSQL adds a call to the {{GROUPING}} function to the 
> {{ORDER BY}} clause twice when emulating {{NULLS LAST}}. This is a problem 
> because MSSQL disallows duplicate sort keys; it is caused because the MSSQL 
> dialect wrongly gives the {{GROUPING}} function special treatment when 
> emulating NULL direction.
> {{MssqlSqlDialect.emulateNullDirection}} has [special logic for {{GROUPING}} 
> calls|https://github.com/apache/calcite/blob/814ae6ec09e72544ba010f2591e06020c55b162b/core/src/main/java/org/apache/calcite/sql/dialect/MssqlSqlDialect.java#L95].
>  This seems to be an optimization attempt since {{GROUPING}} is known to 
> never return {{{}NULL{}}}, and therefore needs no null direction emulation. 
> However, this causes problems because:
> 1. MSSQL does not have the same default null direction as Calcite's 
> {{RelBuilder}} (and most other dialects).
> 2. MSSQL does not support the common {{NULLS FIRST}} or {{NULLS LAST}} syntax.
> 3. MSSQL does not allow sorting on the same field twice, even though there is 
> no theoretical issue with this.
> Each of these properties must be present for the problem to occur, so it's a 
> bit niche and specific to MSSQL. Seems like the best solution is to simply 
> eliminate the special-case treatment for {{GROUPING}} in 
> {{{}emulateNullDirection{}}}, which is currently creating problems due to 
> property #3.
> {*}More in-depth explanation{*}:
> In {{{}RelBuilder.collation{}}}, we use the "default null direction" to 
> insert rex nodes as sorting expressions, but this is only the default null 
> direction for NULLS-high dialects, i.e. *not* MSSQL. This is a problem 
> because MSSQL has special-case logic for emulating null direction of GROUPING 
> calls, whereby it effectively duplicates the expression. Really, 
> {{MssqlSqlDialect.emulateNullDirection}} probably should've been returning 
> {{null}} instead, signalling to callers that no null-direction emulation is 
> necessary because {{GROUPING}} never returns {{{}NULL{}}}, but this causes 
> another problem due to property #2 above when the null direction is 
> non-default as is caused simply by using {{RelBuilder.collation}} as 
> described above (it should be noted that this method takes rex nodes instead 
> of {{RelFieldCollation}} object, so there is no way to specify null 
> direction) because the non-default null direction is not expanded into a 
> {{CASE}} expression (MSSQL does not support {{NULLS FIRST}} or {{LAST}} 
> syntax).
> Here's a test illustrating the problem:
> Input SQL (default dialect)
> {code:xml}
> select "product_class_id", "brand_name", GROUPING("brand_name")
> from "product"
> group by GROUPING SETS (("product_class_id", "brand_name"), 
> ("product_class_id"))
> order by 3, 2, 1
> {code}
> Current behavior for unparsing as MSSQL (incorrect because it orders by the 
> same column twice; {{GROUPING([brand_name])}} and {{{}3{}}}, which will fail 
> if you try to actually run this against a real MSSQL database, even though it 
> seems like it shouldn't):
> {code:xml}
> SELECT [product_class_id], [brand_name], GROUPING([brand_name])
> FROM [foodmart].[product]
> GROUP BY GROUPING SETS(([product_class_id], [brand_name]), [product_class_id])
> ORDER BY
>          GROUPING([brand_name]),
>          3,
>          CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END,
>          [brand_name],
>          CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END,
>          [product_class_id]
> {code}
> Behavior where {{MssqlSqlDialect.emulateNullDirection}} simply returns 
> {{null}} for {{GROUPING}} expressions (incorrect because it uses {{NULLS 
> LAST}} syntax):
> {code:xml}
> ...
> ORDER BY
>          3 NULLS LAST,
>          CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END,
>          [brand_name],
>          CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END,
>          [product_class_id]
> {code}
> Acceptable behavior (although the first {{{}ORDER BY{}}}-clause is 
> effectively ordering by a constant, this will at least run and produce the 
> correct results):
> {code:xml}
> ...
> ORDER BY
>          CASE WHEN GROUPING([brand_name]) IS NULL THEN 0 ELSE 1 END,
>          3,
>          CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END,
>          [brand_name],
>          CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END,
>          [product_class_id]
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to