dtenedor commented on code in PR #48649:
URL: https://github.com/apache/spark/pull/48649#discussion_r1833529361


##########
sql/api/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4:
##########
@@ -1502,8 +1502,11 @@ version
     ;
 
 operatorPipeRightSide
-    : selectClause
-    | whereClause
+    : selectClause windowClause?
+    // Note that the WINDOW clause is not allowed in the WHERE pipe operator, 
but we add it here in
+    // the grammar simply for purposes of catching this invalid syntax and 
throwing a specific
+    // dedicated error message.
+    | whereClause windowClause?

Review Comment:
   This is correct, this changes maintains the independent composability of the 
operators, and clarifies to which step the WINDOW clause applies.



##########
sql/core/src/test/resources/sql-tests/inputs/pipe-operators.sql:
##########
@@ -821,6 +819,84 @@ select 1 x, 2 y, 3 z
 table other
 |> aggregate b group by a;
 
+-- WINDOW operators (within SELECT): positive tests.
+---------------------------------------------------
+
+-- SELECT with a WINDOW clause.
+table windowTestData
+|> select cate, sum(val) over w
+   window w as (partition by cate order by val);
+
+-- SELECT with RANGE BETWEEN as part of the window definition.
+table windowTestData
+|> select cate, sum(val) over w
+   window w as (order by val_timestamp range between unbounded preceding and 
current row);
+
+-- SELECT with a WINDOW clause not being referred in the SELECT list.
+table windowTestData
+|> select cate, val
+    window w as (partition by cate order by val);
+
+-- multiple SELECT clauses, each with a WINDOW clause (with the same window 
definition names).
+table windowTestData
+|> select cate, val, sum(val) over w as sum_val
+   window w as (partition by cate)
+|> select cate, val, sum_val, first_value(cate) over w
+   window w as (order by val);
+
+-- SELECT with a WINDOW clause for multiple window definitions.
+table windowTestData
+|> select cate, val, sum(val) over w1, first_value(cate) over w2
+   window w1 as (partition by cate), w2 as (order by val);
+
+-- SELECT with a WINDOW clause for multiple window functions over one window 
definition
+table windowTestData
+|> select cate, val, sum(val) over w, first_value(val) over w
+   window w1 as (partition by cate order by val);
+
+-- SELECT with a WINDOW clause, using struct fields.
+(select col from st)
+|> select col.i1, sum(col.i2) over w
+   window w as (partition by col.i1 order by col.i2);
+
+table st
+|> select st.col.i1, sum(st.col.i2) over w
+   window w as (partition by st.col.i1 order by st.col.i2);
+
+table st
+|> select spark_catalog.default.st.col.i1, 
sum(spark_catalog.default.st.col.i2) over w
+   window w as (partition by spark_catalog.default.st.col.i1 order by 
spark_catalog.default.st.col.i2);
+
+-- SELECT with one WINDOW definition shadowing a column name.
+table windowTestData
+|> select cate, sum(val) over val
+   window val as (partition by cate order by val);
+
+-- WINDOW definition can be referred in the downstream SELECT clause.

Review Comment:
   @Angryrou the query in this test case should fail now, since the `window w` 
is not yet defined in the `|> select` operator. Do we need to re-generate the 
golden files?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to