This is an automated email from the ASF dual-hosted git repository.
yao pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new 959cc9504835 [SPARK-55702][SQL][FOLLOWUP] Clean up dead error code and fix flaky window filter test
959cc9504835 is described below
commit 959cc95048354dc47668d05268b388397e282fa1
Author: Wenchen Fan <[email protected]>
AuthorDate: Sat Feb 28 23:01:42 2026 +0800
[SPARK-55702][SQL][FOLLOWUP] Clean up dead error code and fix flaky window filter test
### What changes were proposed in this pull request?
Follow-up to #54501. Two cleanups:
1. **Remove dead error code**: The
`windowAggregateFunctionWithFilterNotSupportedError` method in
`QueryCompilationErrors.scala` and its `_LEGACY_ERROR_TEMP_1030` error class in
`error-conditions.json` were left behind after #54501 removed their only call
site.
2. **Fix flaky `first_value`/`last_value` test**: The window filter test
used `ORDER BY val_long` with a ROWS frame, but `val_long` has duplicate values
in the test data (e.g., three rows with `val_long=1`), making
`first_value`/`last_value` results non-deterministic. Added `val` and `cate` as
tiebreaker columns and used `NULLS LAST` so the output is both stable and
meaningful (without NULLS LAST, the first matching 'a' row has `val=NULL`,
making `first_a` always NULL).
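
The tiebreaker reasoning in item 2 can be illustrated outside Spark. A minimal Python sketch (hypothetical rows and helper names, not the actual test harness or Spark's execution): sorting by a non-unique key leaves tied rows in whatever physical order the input had, so `first_value` over that order is engine-dependent; adding tiebreaker columns with NULLs sorted last makes the order total and the result stable.

```python
# Hypothetical sketch (plain Python, not Spark): rows are (val, cate, val_long);
# val_long=1 repeats, and one 'a' row has val=None, mirroring the test data.
rows = [(2, "a", 1), (None, "a", 1), (1, "a", 1), (3, "b", 2)]
# Same rows in a different physical order (an engine may produce either).
shuffled = [(1, "a", 1), (None, "a", 1), (2, "a", 1), (3, "b", 2)]

def first_a(data, key):
    """Emulate first_value(val) FILTER (WHERE cate = 'a') over the sorted order."""
    for val, cate, _ in sorted(data, key=key):
        if cate == "a":
            return val
    return None

# ORDER BY val_long only: Python's sort is stable, so tied rows keep their
# input order, and the answer depends on that physical order (2 vs 1 here).
flaky_key = lambda r: r[2]
assert first_a(rows, flaky_key) != first_a(shuffled, flaky_key)

# ORDER BY val_long, val NULLS LAST, cate: the order is total, so the result
# is the same regardless of physical row order (and not NULL, because the
# val=None row no longer sorts first among the ties).
nulls_last = lambda v: (v is None, v)
stable_key = lambda r: (r[2], nulls_last(r[0]), r[1])
assert first_a(rows, stable_key) == first_a(shuffled, stable_key) == 1
```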
### Why are the changes needed?
1. Dead code should be cleaned up.
2. Non-deterministic tests can cause spurious failures.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Re-ran `SQLQueryTestSuite` for `window.sql` — all 4 tests pass across all
config dimensions.
### Was this patch authored or co-authored using generative AI tooling?
Yes. Cursor.
Closes #54557 from cloud-fan/follow.
Authored-by: Wenchen Fan <[email protected]>
Signed-off-by: Kent Yao <[email protected]>
---
common/utils/src/main/resources/error/error-conditions.json | 5 -----
.../org/apache/spark/sql/errors/QueryCompilationErrors.scala | 6 ------
.../test/resources/sql-tests/analyzer-results/window.sql.out | 10 +++++-----
sql/core/src/test/resources/sql-tests/inputs/window.sql | 6 +++---
sql/core/src/test/resources/sql-tests/results/window.sql.out | 12 ++++++------
5 files changed, 14 insertions(+), 25 deletions(-)
diff --git a/common/utils/src/main/resources/error/error-conditions.json b/common/utils/src/main/resources/error/error-conditions.json
index b76e3b5c8d56..6c2a648ec52e 100644
--- a/common/utils/src/main/resources/error/error-conditions.json
+++ b/common/utils/src/main/resources/error/error-conditions.json
@@ -8008,11 +8008,6 @@
"count(<targetString>.*) is not allowed. Please use count(*) or expand the columns manually, e.g. count(col1, col2)."
]
},
- "_LEGACY_ERROR_TEMP_1030" : {
- "message" : [
- "Window aggregate function with filter predicate is not supported yet."
- ]
- },
"_LEGACY_ERROR_TEMP_1031" : {
"message" : [
"It is not allowed to use a window function inside an aggregate function. Please use the inner window function in a sub-query."
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
index 8cdd734def4a..edf2dfe545c7 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
@@ -836,12 +836,6 @@ private[sql] object QueryCompilationErrors extends QueryErrorsBase with Compilat
messageParameters = Map("expression" -> expression))
}
- def windowAggregateFunctionWithFilterNotSupportedError(): Throwable = {
- new AnalysisException(
- errorClass = "_LEGACY_ERROR_TEMP_1030",
- messageParameters = Map.empty)
- }
-
def windowFunctionInsideAggregateFunctionNotAllowedError(): Throwable = {
new AnalysisException(
errorClass = "_LEGACY_ERROR_TEMP_1031",
diff --git a/sql/core/src/test/resources/sql-tests/analyzer-results/window.sql.out b/sql/core/src/test/resources/sql-tests/analyzer-results/window.sql.out
index 76c0fb1919ce..11240c52e9c8 100644
--- a/sql/core/src/test/resources/sql-tests/analyzer-results/window.sql.out
+++ b/sql/core/src/test/resources/sql-tests/analyzer-results/window.sql.out
@@ -688,17 +688,17 @@ Project [cate#x, sum(val) OVER (PARTITION BY cate ORDER BY val ASC NULLS FIRST R
-- !query
SELECT val, cate,
-first_value(val) FILTER (WHERE cate = 'a') OVER(ORDER BY val_long
+first_value(val) FILTER (WHERE cate = 'a') OVER(ORDER BY val_long NULLS LAST, val NULLS LAST, cate NULLS LAST
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS first_a,
-last_value(val) FILTER (WHERE cate = 'a') OVER(ORDER BY val_long
+last_value(val) FILTER (WHERE cate = 'a') OVER(ORDER BY val_long NULLS LAST, val NULLS LAST, cate NULLS LAST
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS last_a
-FROM testData ORDER BY val_long, cate
+FROM testData ORDER BY val_long NULLS LAST, val NULLS LAST, cate NULLS LAST
-- !query analysis
Project [val#x, cate#x, first_a#x, last_a#x]
-+- Sort [val_long#xL ASC NULLS FIRST, cate#x ASC NULLS FIRST], true
++- Sort [val_long#xL ASC NULLS LAST, val#x ASC NULLS LAST, cate#x ASC NULLS LAST], true
+- Project [val#x, cate#x, first_a#x, last_a#x, val_long#xL]
+- Project [val#x, cate#x, _w0#x, val_long#xL, first_a#x, last_a#x, first_a#x, last_a#x]
- +- Window [first_value(val#x, false) FILTER (WHERE _w0#x) windowspecdefinition(val_long#xL ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS first_a#x, last_value(val#x, false) FILTER (WHERE _w0#x) windowspecdefinition(val_long#xL ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS last_a#x], [val_long#xL ASC NULLS FIRST]
+ +- Window [first_value(val#x, false) FILTER (WHERE _w0#x) windowspecdefinition(val_long#xL ASC NULLS LAST, val#x ASC NULLS LAST, cate#x ASC NULLS LAST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS first_a#x, last_value(val#x, false) FILTER (WHERE _w0#x) windowspecdefinition(val_long#xL ASC NULLS LAST, val#x ASC NULLS LAST, cate#x ASC NULLS LAST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS last_a#x], [val_long#xL ASC NULLS [...]
+- Project [val#x, cate#x, (cate#x = a) AS _w0#x, val_long#xL]
+- SubqueryAlias testdata
+- View (`testData`, [val#x, val_long#xL, val_double#x, val_date#x, val_timestamp#x, cate#x])
diff --git a/sql/core/src/test/resources/sql-tests/inputs/window.sql b/sql/core/src/test/resources/sql-tests/inputs/window.sql
index 586fe88ac305..3a453e1c80e7 100644
--- a/sql/core/src/test/resources/sql-tests/inputs/window.sql
+++ b/sql/core/src/test/resources/sql-tests/inputs/window.sql
@@ -184,11 +184,11 @@ WINDOW w AS (PARTITION BY cate ORDER BY val);
-- window aggregate with filter predicate: first_value/last_value (imperative aggregate)
SELECT val, cate,
-first_value(val) FILTER (WHERE cate = 'a') OVER(ORDER BY val_long
+first_value(val) FILTER (WHERE cate = 'a') OVER(ORDER BY val_long NULLS LAST, val NULLS LAST, cate NULLS LAST
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS first_a,
-last_value(val) FILTER (WHERE cate = 'a') OVER(ORDER BY val_long
+last_value(val) FILTER (WHERE cate = 'a') OVER(ORDER BY val_long NULLS LAST, val NULLS LAST, cate NULLS LAST
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS last_a
-FROM testData ORDER BY val_long, cate;
+FROM testData ORDER BY val_long NULLS LAST, val NULLS LAST, cate NULLS LAST;
-- window aggregate with filter predicate: multiple aggregates with different filters
SELECT val, cate,
diff --git a/sql/core/src/test/resources/sql-tests/results/window.sql.out b/sql/core/src/test/resources/sql-tests/results/window.sql.out
index 44c3b175868d..3ee7673df641 100644
--- a/sql/core/src/test/resources/sql-tests/results/window.sql.out
+++ b/sql/core/src/test/resources/sql-tests/results/window.sql.out
@@ -669,23 +669,23 @@ b 6
-- !query
SELECT val, cate,
-first_value(val) FILTER (WHERE cate = 'a') OVER(ORDER BY val_long
+first_value(val) FILTER (WHERE cate = 'a') OVER(ORDER BY val_long NULLS LAST, val NULLS LAST, cate NULLS LAST
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS first_a,
-last_value(val) FILTER (WHERE cate = 'a') OVER(ORDER BY val_long
+last_value(val) FILTER (WHERE cate = 'a') OVER(ORDER BY val_long NULLS LAST, val NULLS LAST, cate NULLS LAST
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS last_a
-FROM testData ORDER BY val_long, cate
+FROM testData ORDER BY val_long NULLS LAST, val NULLS LAST, cate NULLS LAST
-- !query schema
struct<val:int,cate:string,first_a:int,last_a:int>
-- !query output
-NULL NULL 1 NULL
-1 b 1 NULL
+1 a 1 1
3 NULL 1 1
NULL a 1 NULL
1 a 1 1
-1 a 1 1
2 b 1 1
2 a 1 2
3 b 1 2
+1 b 1 2
+NULL NULL 1 2
-- !query
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]