[
https://issues.apache.org/jira/browse/CALCITE-5767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julian Hyde updated CALCITE-5767:
---------------------------------
Description:
Calcite's MSSQL dialect should not give GROUPING special treatment when
emulating NULL direction.
{{MssqlSqlDialect.emulateNullDirection}} has [special logic for {{GROUPING}}
calls|https://github.com/apache/calcite/blob/814ae6ec09e72544ba010f2591e06020c55b162b/core/src/main/java/org/apache/calcite/sql/dialect/MssqlSqlDialect.java#L95].
This seems to be an optimization attempt since {{GROUPING}} is known to never
return {{{}NULL{}}}, and therefore needs no null direction emulation. However,
this causes problems because:
1. MSSQL does not have the same default null direction as Calcite's
{{RelBuilder}} (and most other dialects).
2. MSSQL does not support the common {{NULLS FIRST}} or {{NULLS LAST}} syntax.
3. MSSQL does not allow sorting on the same field twice, even though there is
no theoretical issue with this.
Each of these properties must be present for the problem to occur, so it's a
bit niche and specific to MSSQL. Seems like the best solution is to simply
eliminate the special-case treatment for {{GROUPING}} in
{{{}emulateNullDirection{}}}, which is currently creating problems due to
property #3.
{*}More in-depth explanation{*}:
In {{{}RelBuilder.collation{}}}, we use the "default null direction" to insert
rex nodes as sorting expressions, but this is only the default null direction
for NULLS-high dialects, i.e. *not* MSSQL. This is a problem because MSSQL has
special-case logic for emulating null direction of GROUPING calls, whereby it
effectively duplicates the expression. Really,
{{MssqlSqlDialect.emulateNullDirection}} probably should've been returning
{{null}} instead, signalling to callers that no null-direction emulation is
necessary because {{GROUPING}} never returns {{{}NULL{}}}, but this causes
another problem due to property #2 above when the null direction is non-default
as is caused simply by using {{RelBuilder.collation}} as described above (it
should be noted that this method takes rex nodes instead of
{{RelFieldCollation}} object, so there is no way to specify null direction)
because the non-default null direction is not expanded into a {{CASE}}
expression (MSSQL does not support {{NULLS FIRST}} or {{LAST}} syntax).
Here's a test illustrating the problem:
Input SQL (default dialect)
{code:xml}
select "product_class_id", "brand_name", GROUPING("brand_name")
from "product"
group by GROUPING SETS (("product_class_id", "brand_name"),
("product_class_id"))
order by 3, 2, 1
{code}
Current behavior for unparsing as MSSQL (incorrect because it orders by the
same column twice; {{GROUPING([brand_name])}} and {{{}3{}}}, which will fail if
you try to actually run this against a real MSSQL database, even though it
seems like it shouldn't):
{code:xml}
SELECT [product_class_id], [brand_name], GROUPING([brand_name])
FROM [foodmart].[product]
GROUP BY GROUPING SETS(([product_class_id], [brand_name]), [product_class_id])
ORDER BY
GROUPING([brand_name]),
3,
CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END,
[brand_name],
CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END,
[product_class_id]
{code}
Behavior where {{MssqlSqlDialect.emulateNullDirection}} simply returns {{null}}
for {{GROUPING}} expressions (incorrect because it uses {{NULLS LAST}} syntax):
{code:xml}
...
ORDER BY
3 NULLS LAST,
CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END,
[brand_name],
CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END,
[product_class_id]
{code}
Acceptable behavior (although the first {{{}ORDER BY{}}}-clause is effectively
ordering by a constant, this will at least run and produce the correct results):
{code:xml}
...
ORDER BY
CASE WHEN GROUPING([brand_name]) IS NULL THEN 0 ELSE 1 END,
3,
CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END,
[brand_name],
CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END,
[product_class_id]
{code}
was:
{{MssqlSqlDialect.emulateNullDirection}} has [special logic for {{GROUPING}}
calls|https://github.com/apache/calcite/blob/814ae6ec09e72544ba010f2591e06020c55b162b/core/src/main/java/org/apache/calcite/sql/dialect/MssqlSqlDialect.java#L95].
This seems to be an optimization attempt since {{GROUPING}} is known to never
return {{{}NULL{}}}, and therefore needs no null direction emulation. However,
this causes problems because:
1. MSSQL does not have the same default null direction as Calcite's
{{RelBuilder}} (and most other dialects).
2. MSSQL does not support the common {{NULLS FIRST}} or {{NULLS LAST}} syntax.
3. MSSQL does not allow sorting on the same field twice, even though there is
no theoretical issue with this.
Each of these properties must be present for the problem to occur, so it's a
bit niche and specific to MSSQL. Seems like the best solution is to simply
eliminate the special-case treatment for {{GROUPING}} in
{{{}emulateNullDirection{}}}, which is currently creating problems due to
property #3.
{*}More in-depth explanation{*}:
In {{{}RelBuilder.collation{}}}, we use the "default null direction" to insert
rex nodes as sorting expressions, but this is only the default null direction
for NULLS-high dialects, i.e. *not* MSSQL. This is a problem because MSSQL has
special-case logic for emulating null direction of GROUPING calls, whereby it
effectively duplicates the expression. Really,
{{MssqlSqlDialect.emulateNullDirection}} probably should've been returning
{{null}} instead, signalling to callers that no null-direction emulation is
necessary because {{GROUPING}} never returns {{{}NULL{}}}, but this causes
another problem due to property #2 above when the null direction is non-default
as is caused simply by using {{RelBuilder.collation}} as described above (it
should be noted that this method takes rex nodes instead of
{{RelFieldCollation}} object, so there is no way to specify null direction)
because the non-default null direction is not expanded into a {{CASE}}
expression (MSSQL does not support {{NULLS FIRST}} or {{LAST}} syntax).
Here's a test illustrating the problem:
Input SQL (default dialect)
{code:xml}
select "product_class_id", "brand_name", GROUPING("brand_name")
from "product"
group by GROUPING SETS (("product_class_id", "brand_name"),
("product_class_id"))
order by 3, 2, 1
{code}
Current behavior for unparsing as MSSQL (incorrect because it orders by the
same column twice; {{GROUPING([brand_name])}} and {{{}3{}}}, which will fail if
you try to actually run this against a real MSSQL database, even though it
seems like it shouldn't):
{code:xml}
SELECT [product_class_id], [brand_name], GROUPING([brand_name])
FROM [foodmart].[product]
GROUP BY GROUPING SETS(([product_class_id], [brand_name]), [product_class_id])
ORDER BY
GROUPING([brand_name]),
3,
CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END,
[brand_name],
CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END,
[product_class_id]
{code}
Behavior where {{MssqlSqlDialect.emulateNullDirection}} simply returns {{null}}
for {{GROUPING}} expressions (incorrect because it uses {{NULLS LAST}} syntax):
{code:xml}
...
ORDER BY
3 NULLS LAST,
CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END,
[brand_name],
CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END,
[product_class_id]
{code}
Acceptable behavior (although the first {{{}ORDER BY{}}}-clause is effectively
ordering by a constant, this will at least run and produce the correct results):
{code:xml}
...
ORDER BY
CASE WHEN GROUPING([brand_name]) IS NULL THEN 0 ELSE 1 END,
3,
CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END,
[brand_name],
CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END,
[product_class_id]
{code}
> JDBC adapter for MSSQL adds GROUPING to ORDER BY clause twice when emulating
> NULLS LAST
> ---------------------------------------------------------------------------------------
>
> Key: CALCITE-5767
> URL: https://issues.apache.org/jira/browse/CALCITE-5767
> Project: Calcite
> Issue Type: Bug
> Components: core
> Reporter: Will Noble
> Assignee: Will Noble
> Priority: Minor
> Labels: pull-request-available
>
> Calcite's MSSQL dialect should not give GROUPING special treatment when
> emulating NULL direction.
> {{MssqlSqlDialect.emulateNullDirection}} has [special logic for {{GROUPING}}
> calls|https://github.com/apache/calcite/blob/814ae6ec09e72544ba010f2591e06020c55b162b/core/src/main/java/org/apache/calcite/sql/dialect/MssqlSqlDialect.java#L95].
> This seems to be an optimization attempt since {{GROUPING}} is known to
> never return {{{}NULL{}}}, and therefore needs no null direction emulation.
> However, this causes problems because:
> 1. MSSQL does not have the same default null direction as Calcite's
> {{RelBuilder}} (and most other dialects).
> 2. MSSQL does not support the common {{NULLS FIRST}} or {{NULLS LAST}} syntax.
> 3. MSSQL does not allow sorting on the same field twice, even though there is
> no theoretical issue with this.
> Each of these properties must be present for the problem to occur, so it's a
> bit niche and specific to MSSQL. Seems like the best solution is to simply
> eliminate the special-case treatment for {{GROUPING}} in
> {{{}emulateNullDirection{}}}, which is currently creating problems due to
> property #3.
> {*}More in-depth explanation{*}:
> In {{{}RelBuilder.collation{}}}, we use the "default null direction" to
> insert rex nodes as sorting expressions, but this is only the default null
> direction for NULLS-high dialects, i.e. *not* MSSQL. This is a problem
> because MSSQL has special-case logic for emulating null direction of GROUPING
> calls, whereby it effectively duplicates the expression. Really,
> {{MssqlSqlDialect.emulateNullDirection}} probably should've been returning
> {{null}} instead, signalling to callers that no null-direction emulation is
> necessary because {{GROUPING}} never returns {{{}NULL{}}}, but this causes
> another problem due to property #2 above when the null direction is
> non-default as is caused simply by using {{RelBuilder.collation}} as
> described above (it should be noted that this method takes rex nodes instead
> of {{RelFieldCollation}} object, so there is no way to specify null
> direction) because the non-default null direction is not expanded into a
> {{CASE}} expression (MSSQL does not support {{NULLS FIRST}} or {{LAST}}
> syntax).
> Here's a test illustrating the problem:
> Input SQL (default dialect)
> {code:xml}
> select "product_class_id", "brand_name", GROUPING("brand_name")
> from "product"
> group by GROUPING SETS (("product_class_id", "brand_name"),
> ("product_class_id"))
> order by 3, 2, 1
> {code}
> Current behavior for unparsing as MSSQL (incorrect because it orders by the
> same column twice; {{GROUPING([brand_name])}} and {{{}3{}}}, which will fail
> if you try to actually run this against a real MSSQL database, even though it
> seems like it shouldn't):
> {code:xml}
> SELECT [product_class_id], [brand_name], GROUPING([brand_name])
> FROM [foodmart].[product]
> GROUP BY GROUPING SETS(([product_class_id], [brand_name]), [product_class_id])
> ORDER BY
> GROUPING([brand_name]),
> 3,
> CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END,
> [brand_name],
> CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END,
> [product_class_id]
> {code}
> Behavior where {{MssqlSqlDialect.emulateNullDirection}} simply returns
> {{null}} for {{GROUPING}} expressions (incorrect because it uses {{NULLS
> LAST}} syntax):
> {code:xml}
> ...
> ORDER BY
> 3 NULLS LAST,
> CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END,
> [brand_name],
> CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END,
> [product_class_id]
> {code}
> Acceptable behavior (although the first {{{}ORDER BY{}}}-clause is
> effectively ordering by a constant, this will at least run and produce the
> correct results):
> {code:xml}
> ...
> ORDER BY
> CASE WHEN GROUPING([brand_name]) IS NULL THEN 0 ELSE 1 END,
> 3,
> CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END,
> [brand_name],
> CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END,
> [product_class_id]
> {code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)