Angryrou commented on code in PR #48649:
URL: https://github.com/apache/spark/pull/48649#discussion_r1833741288
##########
sql/core/src/test/resources/sql-tests/inputs/pipe-operators.sql:
##########
@@ -821,6 +819,84 @@ select 1 x, 2 y, 3 z
table other
|> aggregate b group by a;
+-- WINDOW operators (within SELECT): positive tests.
+---------------------------------------------------
+
+-- SELECT with a WINDOW clause.
+table windowTestData
+|> select cate, sum(val) over w
+ window w as (partition by cate order by val);
+
+-- SELECT with RANGE BETWEEN as part of the window definition.
+table windowTestData
+|> select cate, sum(val) over w
+ window w as (order by val_timestamp range between unbounded preceding and current row);
+
+-- SELECT with a WINDOW clause not being referred in the SELECT list.
+table windowTestData
+|> select cate, val
+ window w as (partition by cate order by val);
+
+-- multiple SELECT clauses, each with a WINDOW clause (with the same window definition names).
+table windowTestData
+|> select cate, val, sum(val) over w as sum_val
+ window w as (partition by cate)
+|> select cate, val, sum_val, first_value(cate) over w
+ window w as (order by val);
+
+-- SELECT with a WINDOW clause for multiple window definitions.
+table windowTestData
+|> select cate, val, sum(val) over w1, first_value(cate) over w2
+ window w1 as (partition by cate), w2 as (order by val);
+
+-- SELECT with a WINDOW clause for multiple window functions over one window definition
+table windowTestData
+|> select cate, val, sum(val) over w, first_value(val) over w
+ window w1 as (partition by cate order by val);
+
+-- SELECT with a WINDOW clause, using struct fields.
+(select col from st)
+|> select col.i1, sum(col.i2) over w
+ window w as (partition by col.i1 order by col.i2);
+
+table st
+|> select st.col.i1, sum(st.col.i2) over w
+ window w as (partition by st.col.i1 order by st.col.i2);
+
+table st
+|> select spark_catalog.default.st.col.i1, sum(spark_catalog.default.st.col.i2) over w
+ window w as (partition by spark_catalog.default.st.col.i1 order by spark_catalog.default.st.col.i2);
+
+-- SELECT with one WINDOW definition shadowing a column name.
+table windowTestData
+|> select cate, sum(val) over val
+ window val as (partition by cate order by val);
+
+-- WINDOW definition can be referred in the downstream SELECT clause.
Review Comment:
@cloud-fan @dtenedor Fixed! I added logic to validate that every window name
used in an `UnresolvedWindowExpression` is defined in the **same** `|> SELECT`
clause.
Previously, `UnresolvedWindowExpression` was analyzed only after all pipe
operators had been scanned. Therefore, in this test case, the window `w` in
the first `|> SELECT` clause could refer to the definition in the second
`|> SELECT`, just like the classic SQL shown below. After the code change, the
query fails with `INVALID_SQL_SYNTAX.WINDOW_REFERENCE_NOT_FOUND` when no window
clause exists, or with `INVALID_SQL_SYNTAX.UNRESOLVED_WINDOW_REFERENCE` when a
window name used in an `UnresolvedWindowExpression` is not defined in the same
`|> SELECT` clause.
```
-- pipe SQL
table windowTestData
|> select first_value(cate) over w as first_val, cate, val
|> select cate, val, sum(val) over w as sum_val
window w as (order by val)
-- classic SQL
select cate, val, sum(val) over w as sum_val
from (
select cate, val, first_value(cate) over w as first_val
from windowTestData
)
window w as (order by val);
```
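For readers who want the rule spelled out, here is a minimal sketch of the per-clause check described above. It is written in Python purely for illustration and is not the Scala code in this PR: `PipeSelect`, `WindowValidationError`, and `validate_pipe_select` are hypothetical names, and only the two error class strings come from the comment above.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class PipeSelect:
    """Hypothetical model of a single `|> SELECT` clause."""
    # Window names referenced as `... OVER <name>` in the select list.
    referenced_windows: List[str]
    # Names defined by this clause's WINDOW clause; None means no WINDOW clause.
    defined_windows: Optional[Set[str]] = None


class WindowValidationError(Exception):
    """Hypothetical error carrying the error class and the offending name."""

    def __init__(self, error_class: str, window_name: str):
        super().__init__(f"{error_class}: {window_name}")
        self.error_class = error_class
        self.window_name = window_name


def validate_pipe_select(clause: PipeSelect) -> None:
    """Check each referenced window against this clause's own definitions only."""
    for name in clause.referenced_windows:
        if clause.defined_windows is None:
            # The clause references a window but has no WINDOW clause at all.
            raise WindowValidationError(
                "INVALID_SQL_SYNTAX.WINDOW_REFERENCE_NOT_FOUND", name)
        if name not in clause.defined_windows:
            # A WINDOW clause exists, but it does not define this name.
            raise WindowValidationError(
                "INVALID_SQL_SYNTAX.UNRESOLVED_WINDOW_REFERENCE", name)


# The pipe SQL example above, clause by clause.
first = PipeSelect(referenced_windows=["w"])                          # no WINDOW clause
second = PipeSelect(referenced_windows=["w"], defined_windows={"w"})  # defines w
```

Under these assumptions, validating `second` succeeds, while validating `first` raises with the `WINDOW_REFERENCE_NOT_FOUND` class, because the definition of `w` in a later `|> SELECT` is never consulted.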
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]