[ 
https://issues.apache.org/jira/browse/SPARK-49555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17898739#comment-17898739
 ] 

Dongjoon Hyun commented on SPARK-49555:
---------------------------------------

I added a label, `releasenotes`.

> SQL Pipe Syntax
> ---------------
>
>                 Key: SPARK-49555
>                 URL: https://issues.apache.org/jira/browse/SPARK-49555
>             Project: Spark
>          Issue Type: Umbrella
>          Components: SQL
>    Affects Versions: 4.0.0
>            Reporter: Daniel
>            Priority: Major
>              Labels: releasenotes
>
> This umbrella Jira ticket tracks the implementation of support for SQL 
> queries written in pipe syntax.
> The objective is to make queries easy to compose as a sequence of SQL 
> clauses separated by the pipe token |>, where each pipe operator is a 
> fully-defined transformation of the preceding relation. Each pipe operator 
> may refer only to the names and rows generated by the preceding pipe 
> operator; otherwise, each step is stateless.
>  * Research paper: 
> [https://research.google/pubs/sql-has-problems-we-can-fix-them-pipe-syntax-in-sql/]
>  * Open-source ZetaSQL implementation: 
> [https://github.com/google/zetasql/blob/master/docs/pipe-syntax.md]
>  * Spark prototype: https://github.com/apache/spark/pull/47837
>  
> For example, here's query 13 from TPC-H:
>  
> SELECT c_count, COUNT(*) AS custdist FROM
>   (SELECT c_custkey, COUNT(o_orderkey) c_count FROM customer
>   LEFT OUTER JOIN orders ON c_custkey = o_custkey
>   AND o_comment NOT LIKE '%unusual%packages%' GROUP BY c_custkey) AS c_orders
> GROUP BY c_count
> ORDER BY custdist DESC, c_count DESC;
>  
> With the new syntax, it becomes:
>  
> FROM customer
>  |> LEFT OUTER JOIN orders ON c_custkey = o_custkey
>     AND o_comment NOT LIKE '%unusual%packages%'
>  |> AGGREGATE COUNT(o_orderkey) c_count
>     GROUP BY c_custkey
>  |> AGGREGATE COUNT(*) AS custdist
>     GROUP BY c_count
>  |> ORDER BY custdist DESC, c_count DESC;
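
To make the "each operator transforms the preceding relation" idea concrete, here is a rough Python sketch, not Spark code. It models each pipe step of the query above as a plain function over a list of rows and checks the result against the nested query run in SQLite. The toy rows, the helper names (pipe, left_join_orders, and so on) and the use of SQLite are all assumptions made for illustration; none of them come from the Spark implementation.

```python
import sqlite3
from collections import Counter

# Hypothetical toy data standing in for the TPC-H customer and orders tables.
customers = [{"c_custkey": k} for k in (1, 2, 3, 4)]
orders = [
    {"o_orderkey": 10, "o_custkey": 1, "o_comment": "fast delivery"},
    {"o_orderkey": 11, "o_custkey": 1, "o_comment": "unusual blue packages"},
    {"o_orderkey": 12, "o_custkey": 1, "o_comment": "ok"},
    {"o_orderkey": 13, "o_custkey": 2, "o_comment": "ok"},
    {"o_orderkey": 14, "o_custkey": 3, "o_comment": "very unusual packages here"},
]

def matches_pattern(comment):
    # Emulates LIKE '%unusual%packages%': "unusual" followed later by "packages".
    i = comment.find("unusual")
    return i >= 0 and comment.find("packages", i + len("unusual")) >= 0

def left_join_orders(rows):
    # |> LEFT OUTER JOIN orders ON c_custkey = o_custkey AND o_comment NOT LIKE ...
    out = []
    for c in rows:
        hits = [o for o in orders
                if o["o_custkey"] == c["c_custkey"]
                and not matches_pattern(o["o_comment"])]
        if hits:
            out.extend({**c, **o} for o in hits)
        else:
            # No matching order: keep the customer with NULL order columns.
            out.append({**c, "o_orderkey": None, "o_custkey": None, "o_comment": None})
    return out

def count_orders_per_customer(rows):
    # |> AGGREGATE COUNT(o_orderkey) c_count GROUP BY c_custkey
    counts = {}
    for r in rows:
        counts.setdefault(r["c_custkey"], 0)
        if r["o_orderkey"] is not None:  # COUNT(col) skips NULLs
            counts[r["c_custkey"]] += 1
    return [{"c_custkey": k, "c_count": v} for k, v in counts.items()]

def count_customers_per_count(rows):
    # |> AGGREGATE COUNT(*) AS custdist GROUP BY c_count
    dist = Counter(r["c_count"] for r in rows)
    return [{"c_count": k, "custdist": v} for k, v in dist.items()]

def order_result(rows):
    # |> ORDER BY custdist DESC, c_count DESC
    return sorted(rows, key=lambda r: (-r["custdist"], -r["c_count"]))

def pipe(relation, *steps):
    # Each step sees only the relation produced by the previous step.
    for step in steps:
        relation = step(relation)
    return relation

NESTED_SQL = """
SELECT c_count, COUNT(*) AS custdist FROM
  (SELECT c_custkey, COUNT(o_orderkey) c_count FROM customer
   LEFT OUTER JOIN orders ON c_custkey = o_custkey
   AND o_comment NOT LIKE '%unusual%packages%' GROUP BY c_custkey) AS c_orders
GROUP BY c_count
ORDER BY custdist DESC, c_count DESC
"""

def run_nested_sql():
    # Runs the original nested form of the query in an in-memory SQLite database.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (c_custkey INTEGER)")
    conn.execute(
        "CREATE TABLE orders (o_orderkey INTEGER, o_custkey INTEGER, o_comment TEXT)")
    conn.executemany("INSERT INTO customer VALUES (:c_custkey)", customers)
    conn.executemany(
        "INSERT INTO orders VALUES (:o_orderkey, :o_custkey, :o_comment)", orders)
    rows = conn.execute(NESTED_SQL).fetchall()
    conn.close()
    return rows

piped = pipe(customers, left_join_orders, count_orders_per_customer,
             count_customers_per_count, order_result)
pipe_result = [(r["c_count"], r["custdist"]) for r in piped]
print(pipe_result)
print(run_nested_sql() == pipe_result)
```

The steps are listed in the same top-to-bottom order as the pipe operators, whereas the nested form has to be read from the inner subquery outward. The toy comments are all lowercase so that case-sensitivity differences in LIKE between engines do not affect the comparison.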


