[ 
https://issues.apache.org/jira/browse/CALCITE-4998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Will Noble updated CALCITE-4998:
--------------------------------
    Description: 
I believe a bug was introduce in 
[4b34903|https://github.com/apache/calcite/commit/4b349032c17b95735b12593a65f7027d54d8b8b6].
 My understanding is not yet thorough enough to suggest an ideal fix, so here's 
documenting a problematic case.

The [only 
change|https://github.com/apache/calcite/blob/b4490fb64341ea900790d0f9f2e1043fe75cbdde/core/src/main/java/org/apache/calcite/rel/rel2sql/RelToSqlConverter.java#L436-L440]
 in that commit is to special-case generating the SQL for a projection on top 
of a sort. Here's an example input to that function ({{Result visit(Project 
e)}}) that produces incorrect results:
{code:java}
LogicalProject(orders_created_at_month_name=[$1])
  LogicalSort(sort0=[$2], dir0=[DESC-nulls-last], fetch=[1])
    LogicalAggregate(group=[{2, 3}], orders_count=[COUNT()])
      LogicalProject(orders_order_amount=[$1], orders.status=[$4], 
orders.created_at_month_num=[toDate_DATE_MONTH_NUM($2)], 
orders_created_at_month_name=[toDate_DATE_MONTH_NAME($2)], 
__pin_1248672c=[lookerFirst(null:NULL, $0, $1, $2, $3, $4)])
        LogicalFilter(condition=[=($4, _UTF-16'cancelled':VARCHAR CHARACTER SET 
"UTF-16")])
          
ExplicitlyAliasedTableScan(table=[[$$looker_root_6b90a82f-2568-4445-8137-1fa7bbd293ec$$,
 orders]])
{code}
When we run {{visitInput(e, 0)}} (as in what the current code would do in this 
situation), we get the following value for {{x}} as a result, which is clearly 
incorrect:
{code:sql}
SELECT *
FROM `orders` AS `orders`
WHERE `status` = 'cancelled'
{code}
If, however, we run {{visitInput(e, 0, Clause.SELECT)}} (as in the prior 
behavior), we get the following for {{x}}, which is correct:
{code:sql}
SELECT (EXTRACT(MONTH FROM `created_at`)) AS `orders.created_at_month_num`, 
(CASE  
  WHEN EXTRACT(MONTH FROM `created_at`) = 1 THEN 'January'  
  WHEN EXTRACT(MONTH FROM `created_at`) = 2 THEN 'February'  
  WHEN EXTRACT(MONTH FROM `created_at`) = 3 THEN 'March'  
  WHEN EXTRACT(MONTH FROM `created_at`) = 4 THEN 'April'  
  WHEN EXTRACT(MONTH FROM `created_at`) = 5 THEN 'May'  
  WHEN EXTRACT(MONTH FROM `created_at`) = 6 THEN 'June'  
  WHEN EXTRACT(MONTH FROM `created_at`) = 7 THEN 'July'  
  WHEN EXTRACT(MONTH FROM `created_at`) = 8 THEN 'August'  
  WHEN EXTRACT(MONTH FROM `created_at`) = 9 THEN 'September'  
  WHEN EXTRACT(MONTH FROM `created_at`) = 10 THEN 'October'  
  WHEN EXTRACT(MONTH FROM `created_at`) = 11 THEN 'November'  
  WHEN EXTRACT(MONTH FROM `created_at`) = 12 THEN 'December'  
END  
) AS `orders_created_at_month_name`, COUNT(*) AS `orders_count`  
FROM `orders` AS `orders`  
WHERE `status` = 'cancelled'  
GROUP BY 1, 2  
ORDER BY 3 DESC  
FETCH NEXT 1 ROWS ONLY
{code}

  was:
I believe a bug was introduce in 
[4b34903|https://github.com/apache/calcite/commit/4b349032c17b95735b12593a65f7027d54d8b8b6].
 My understanding is not yet thorough enough to suggest an ideal fix, so here's 
documenting a problematic case.

The [only 
change|https://github.com/apache/calcite/blob/b4490fb64341ea900790d0f9f2e1043fe75cbdde/core/src/main/java/org/apache/calcite/rel/rel2sql/RelToSqlConverter.java#L436-L440]
 in that commit is to special-case generating the SQL for a projection on top 
of a sort. Here's an example input to that function that produces incorrect 
results:
{code:java}
LogicalProject(orders_created_at_month_name=[$1])
  LogicalSort(sort0=[$2], dir0=[DESC-nulls-last], fetch=[1])
    LogicalAggregate(group=[{2, 3}], orders_count=[COUNT()])
      LogicalProject(orders_order_amount=[$1], orders.status=[$4], 
orders.created_at_month_num=[toDate_DATE_MONTH_NUM($2)], 
orders_created_at_month_name=[toDate_DATE_MONTH_NAME($2)], 
__pin_1248672c=[lookerFirst(null:NULL, $0, $1, $2, $3, $4)])
        LogicalFilter(condition=[=($4, _UTF-16'cancelled':VARCHAR CHARACTER SET 
"UTF-16")])
          
ExplicitlyAliasedTableScan(table=[[$$looker_root_6b90a82f-2568-4445-8137-1fa7bbd293ec$$,
 orders]])
{code}
When we run {{visitInput(e, 0)}} (as in what the current code would do in this 
situation), we get the following value for {{x}} as a result, which is clearly 
incorrect:
{code:sql}
SELECT *
FROM `orders` AS `orders`
WHERE `status` = 'cancelled'
{code}
If, however, we run {{visitInput(e, 0, Clause.SELECT)}} (as in the prior 
behavior), we get the following for {{{}x, which is correct{}}}:
{code:sql}
SELECT (EXTRACT(MONTH FROM `created_at`)) AS `orders.created_at_month_num`, 
(CASE  
  WHEN EXTRACT(MONTH FROM `created_at`) = 1 THEN 'January'  
  WHEN EXTRACT(MONTH FROM `created_at`) = 2 THEN 'February'  
  WHEN EXTRACT(MONTH FROM `created_at`) = 3 THEN 'March'  
  WHEN EXTRACT(MONTH FROM `created_at`) = 4 THEN 'April'  
  WHEN EXTRACT(MONTH FROM `created_at`) = 5 THEN 'May'  
  WHEN EXTRACT(MONTH FROM `created_at`) = 6 THEN 'June'  
  WHEN EXTRACT(MONTH FROM `created_at`) = 7 THEN 'July'  
  WHEN EXTRACT(MONTH FROM `created_at`) = 8 THEN 'August'  
  WHEN EXTRACT(MONTH FROM `created_at`) = 9 THEN 'September'  
  WHEN EXTRACT(MONTH FROM `created_at`) = 10 THEN 'October'  
  WHEN EXTRACT(MONTH FROM `created_at`) = 11 THEN 'November'  
  WHEN EXTRACT(MONTH FROM `created_at`) = 12 THEN 'December'  
END  
) AS `orders_created_at_month_name`, COUNT(*) AS `orders_count`  
FROM `orders` AS `orders`  
WHERE `status` = 'cancelled'  
GROUP BY 1, 2  
ORDER BY 3 DESC  
FETCH NEXT 1 ROWS ONLY
{code}


> Subquery Regression
> -------------------
>
>                 Key: CALCITE-4998
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4998
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>            Reporter: Will Noble
>            Priority: Major
>
> I believe a bug was introduce in 
> [4b34903|https://github.com/apache/calcite/commit/4b349032c17b95735b12593a65f7027d54d8b8b6].
>  My understanding is not yet thorough enough to suggest an ideal fix, so 
> here's documenting a problematic case.
> The [only 
> change|https://github.com/apache/calcite/blob/b4490fb64341ea900790d0f9f2e1043fe75cbdde/core/src/main/java/org/apache/calcite/rel/rel2sql/RelToSqlConverter.java#L436-L440]
>  in that commit is to special-case generating the SQL for a projection on top 
> of a sort. Here's an example input to that function ({{Result visit(Project 
> e)}}) that produces incorrect results:
> {code:java}
> LogicalProject(orders_created_at_month_name=[$1])
>   LogicalSort(sort0=[$2], dir0=[DESC-nulls-last], fetch=[1])
>     LogicalAggregate(group=[{2, 3}], orders_count=[COUNT()])
>       LogicalProject(orders_order_amount=[$1], orders.status=[$4], 
> orders.created_at_month_num=[toDate_DATE_MONTH_NUM($2)], 
> orders_created_at_month_name=[toDate_DATE_MONTH_NAME($2)], 
> __pin_1248672c=[lookerFirst(null:NULL, $0, $1, $2, $3, $4)])
>         LogicalFilter(condition=[=($4, _UTF-16'cancelled':VARCHAR CHARACTER 
> SET "UTF-16")])
>           
> ExplicitlyAliasedTableScan(table=[[$$looker_root_6b90a82f-2568-4445-8137-1fa7bbd293ec$$,
>  orders]])
> {code}
> When we run {{visitInput(e, 0)}} (as in what the current code would do in 
> this situation), we get the following value for {{x}} as a result, which is 
> clearly incorrect:
> {code:sql}
> SELECT *
> FROM `orders` AS `orders`
> WHERE `status` = 'cancelled'
> {code}
> If, however, we run {{visitInput(e, 0, Clause.SELECT)}} (as in the prior 
> behavior), we get the following for {{x}}, which is correct:
> {code:sql}
> SELECT (EXTRACT(MONTH FROM `created_at`)) AS `orders.created_at_month_num`, 
> (CASE  
>   WHEN EXTRACT(MONTH FROM `created_at`) = 1 THEN 'January'  
>   WHEN EXTRACT(MONTH FROM `created_at`) = 2 THEN 'February'  
>   WHEN EXTRACT(MONTH FROM `created_at`) = 3 THEN 'March'  
>   WHEN EXTRACT(MONTH FROM `created_at`) = 4 THEN 'April'  
>   WHEN EXTRACT(MONTH FROM `created_at`) = 5 THEN 'May'  
>   WHEN EXTRACT(MONTH FROM `created_at`) = 6 THEN 'June'  
>   WHEN EXTRACT(MONTH FROM `created_at`) = 7 THEN 'July'  
>   WHEN EXTRACT(MONTH FROM `created_at`) = 8 THEN 'August'  
>   WHEN EXTRACT(MONTH FROM `created_at`) = 9 THEN 'September'  
>   WHEN EXTRACT(MONTH FROM `created_at`) = 10 THEN 'October'  
>   WHEN EXTRACT(MONTH FROM `created_at`) = 11 THEN 'November'  
>   WHEN EXTRACT(MONTH FROM `created_at`) = 12 THEN 'December'  
> END  
> ) AS `orders_created_at_month_name`, COUNT(*) AS `orders_count`  
> FROM `orders` AS `orders`  
> WHERE `status` = 'cancelled'  
> GROUP BY 1, 2  
> ORDER BY 3 DESC  
> FETCH NEXT 1 ROWS ONLY
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

Reply via email to