Bruce Robbins created SPARK-40002:
-------------------------------------

             Summary: Limit pushed down through window using ntile function
                 Key: SPARK-40002
                 URL: https://issues.apache.org/jira/browse/SPARK-40002
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.2.2, 3.3.0
            Reporter: Bruce Robbins

Limit is pushed down through a window that uses the ntile function. This produces results that differ from Hive 2.3.9, Prestodb 0.268, and older versions of Spark (e.g., 3.1.3).

Assume this data:
{noformat}
create table t1 stored as parquet as
select *
from range(101);
{noformat}

Also assume this query:
{noformat}
select id, ntile(10) over (order by id) as nt
from t1
limit 10;
{noformat}

Spark 3.2.2, Spark 3.3.0, and master produce the following:
{noformat}
+---+---+
|id |nt |
+---+---+
|0  |1  |
|1  |2  |
|2  |3  |
|3  |4  |
|4  |5  |
|5  |6  |
|6  |7  |
|7  |8  |
|8  |9  |
|9  |10 |
+---+---+
{noformat}

However, Spark 3.1.3, Hive 2.3.9, and Prestodb 0.268 produce the following:
{noformat}
+---+---+
|id |nt |
+---+---+
|0  |1  |
|1  |1  |
|2  |1  |
|3  |1  |
|4  |1  |
|5  |1  |
|6  |1  |
|7  |1  |
|8  |1  |
|9  |1  |
+---+---+
{noformat}
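
The difference between the two result sets can be reproduced without Spark. The sketch below is a plain-Python model, not Spark's implementation: the helper `ntile` is a hypothetical stand-in that assumes the usual ntile semantics, where the rows of an ordered partition are split into buckets whose sizes differ by at most one and the larger buckets come first. Because the bucket of a row depends on the total number of rows in the partition, applying the limit before the window changes the answer.

```python
def ntile(num_rows, buckets):
    """Hypothetical helper: 1-based bucket number for each row of an
    ordered partition, with larger buckets assigned first."""
    base, extra = divmod(num_rows, buckets)
    result = []
    for bucket in range(1, buckets + 1):
        # The first `extra` buckets each receive one additional row.
        size = base + (1 if bucket <= extra else 0)
        result.extend([bucket] * size)
    return result


ids = list(range(101))  # models: select * from range(101)

# Window evaluated over all 101 rows, then limit 10.
# Bucket 1 holds 11 rows (ids 0..10), so the first ten rows all get nt = 1.
window_then_limit = list(zip(ids, ntile(len(ids), 10)))[:10]

# Limit applied first, then the window sees only 10 rows.
# Each bucket holds exactly one row, so nt runs from 1 to 10.
limited = ids[:10]
limit_then_window = list(zip(limited, ntile(len(limited), 10)))

print(window_then_limit)
print(limit_then_window)
```

In this model, `window_then_limit` corresponds to the Spark 3.1.3, Hive, and Prestodb output, and `limit_then_window` corresponds to the output reported for Spark 3.2.2, 3.3.0, and master.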