[ 
https://issues.apache.org/jira/browse/CALCITE-5406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18080191#comment-18080191
 ] 

Yu Xu commented on CALCITE-5406:
--------------------------------

I move https://issues.apache.org/jira/browse/CALCITE-7517 here and currently 
there is a pr to trace this jira:https://github.com/apache/calcite/pull/4933

Let me answer the previous question first:

At the RelNode algebraic level, *DISTINCT ON* is syntactic sugar without 
introducing any new RelNode operators; however, at the SqlNode level, it is not 
pure syntactic sugar because it requires the Parser to support the new syntax, 
the Validator to support the new validation rules, and then desugaring in 
SqlToRelConverter to a combination of existing operators.
{code:java}
SqlSelect[DISTINCT ON (deptno) empno, ename ORDER BY deptno, empno]
    ↓  SqlToRelConverter
LogicalProject(EMPNO=[$0], ENAME=[$1])                    -- ③ remove 
ROW_NUMBER 
  LogicalFilter(condition=[=($3, 1)])                     -- ② only retain 
first row in a group
    LogicalProject(EMPNO=[$0], ENAME=[$1], DEPTNO=[$2],
      $f3=[ROW_NUMBER() OVER (PARTITION BY $2
                               ORDER BY $2 NULLS LAST, $0 NULLS LAST)])
      LogicalSort(sort0=[$2], sort1=[$0])                  -- ① sort first
        LogicalProject(EMPNO=[$0], ENAME=[$1], DEPTNO=[$7])
          LogicalTableScan(table=[[CATALOG, SALES, EMP]]) {code}
There are no new DistinctOn operator types. Therefore, from an algebraic 
perspective, this is purely syntactic sugar.

The introduction of the DISTINCT ON syntax offers certain advantages over the 
previous ANY_VALUE, and I'd like to add my thoughts here. 
https://issues.apache.org/jira/browse/CALCITE-2366

> Support the SELECT DISTINCT ON statement for PostgreSQL dialect
> ---------------------------------------------------------------
>
>                 Key: CALCITE-5406
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5406
>             Project: Calcite
>          Issue Type: New Feature
>    Affects Versions: 1.32.0
>            Reporter: Zhengqiang Duan
>            Assignee: Zhengqiang Duan
>            Priority: Major
>
> Hi, community, I am currently using Calcite to help ShardingSphere implement 
> distributed database features. When we tested the SELECT DISTINCT ON 
> statement in PostgreSQL(as shown in the following sql), we found that Calcite 
> currently does not support the parsing and semantic processing of the SELECT 
> DISTINCT ON statement. Considering that the SELECT DISTINCT ON statement in 
> PostgreSQL is frequently used and valuable to users, we hope to support this 
> new feature.
> {code:java}
> select distinct on (user_id) * from t_order order by user_id, order_id; {code}
> As suggested by [~asolimando]  and [~julianhyde] , first, we need to support 
> SELECT DISTINCT ON statement parsing at PostgreSQL and lenient compliance 
> level.
> Secondly, we need to investigate whether we need to rewrite in 
> SqlToRelConverter or in planner rules? 
> I will investigate the solution to support this new feature and update it in 
> jira later.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to