[ 
https://issues.apache.org/jira/browse/CALCITE-7028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18036478#comment-18036478
 ] 

Caican Cai commented on CALCITE-7028:
-------------------------------------

[~julianhyde] I'm about to begin this work. I might need to research how 
Calcite is best suited for SQL pipe syntax, as I've never really been familiar 
with Calcite's parsing module.

> Implement SQL pipe syntax 
> --------------------------
>
>                 Key: CALCITE-7028
>                 URL: https://issues.apache.org/jira/browse/CALCITE-7028
>             Project: Calcite
>          Issue Type: Bug
>            Reporter: Julian Hyde
>            Priority: Major
>
> Implement SQL pipe syntax, which was introduced as part of GoogleSQL, and 
> described in a paper "[SQL Has Problems. We Can Fix Them: Pipe Syntax In 
> SQL|https://www.vldb.org/pvldb/vol17/p4051-shute.pdf]"; by Jeff Shute and 
> others. Other vendors, including 
> [Firebolt|https://docs.firebolt.io/sql_reference/commands/queries/pipe.html] 
> and 
> [Databricks|https://docs.databricks.com/aws/en/sql/language-manual/sql-ref-syntax-qry-pipeline],
>  have announced support.
> The paper gives as an example a query in conventional syntax
> {code:java}
> SELECT c_count, COUNT(*) AS custdist
> FROM
>   ( SELECT c_custkey, COUNT(o_orderkey) c_count
>     FROM customer
>     LEFT OUTER JOIN orders ON c_custkey = o_custkey
>       AND o_comment NOT LIKE '%unusual%packages%'
>     GROUP BY c_custkey
>   ) AS c_orders
> GROUP BY c_count
> ORDER BY custdist DESC, c_count DESC;
> {code}
> and its equivalent with pipe syntax:
> {code:java}
> FROM customer
> |> LEFT OUTER JOIN orders ON c_custkey = o_custkey
>              AND o_comment NOT LIKE '%unusual%packages%'
> |> AGGREGATE COUNT(o_orderkey) c_count
>    GROUP BY c_custkey
> |> AGGREGATE COUNT(*) AS custdist
>    GROUP BY c_count
> |> ORDER BY custdist DESC, c_count DESC;
> {code}
> There are several arguments for supporting pipe syntax in Calcite. First, it 
> allows Calcite to better emulate the GoogleSQL (BigQuery), Databricks and 
> Firebolt dialects. Second, it is an attractive and intuitive syntax that 
> allows users to get queries working with fewer mistakes.
> Pipes should be fairly easy to implement. The designers made sure that it was 
> basically syntactic sugar: not adding new relational operators, keeping the 
> existing expression syntax and type system, and (as far as I can tell) 
> keeping the namespace scoping rules (for example in which clauses a table 
> alias remains visible).
> The changes in Calcite would be:
>  * Additions to the parser. Allow a sequence of various clauses (e.g. {{{}|> 
> WHERE condition{}}}) to follow a query, and give each of them an AST node. 
> Allow a query to begin with {{{}FROM{}}}.
>  * Add validation rules to the validator to deduce the type of all 
> expressions and to raise errors when a query is not semantically valid. This 
> should be fairly straightforward because the
>  * In {{{}SqlToRelConverter{}}}, generate the same {{RelNode}} relational 
> algebra as the corresponding old-style SQL syntax.
>  * I believe that each of the pipe steps has a corresponding method in 
> {{{}RelBuilder{}}}. In most but not all cases, {{SqlToRelConverter}} is 
> currently calling that method. In those cases, this task must either change 
> {{SqlToRelConverter}} to use the {{RelBuilder}} method or justify why not.
> Use the ZetaSQL tests as a guideline for what needs to be supported. (Please 
> research and add a link to this case.)
> Incremental rollout would be desirable. The first iteration could, for 
> example, support just leading {{FROM}} and the {{|> WHERE}} clause. But it 
> must include tests for parsing and validation and tests that execute queries.
> At first, the syntax would be enabled only in particular dialects, and 
> disabled in the default Calcite dialect. But I think the changes should be 
> made to the core parser (not the babel parser). We may enable pipe syntax by 
> default, later.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to