alamb commented on PR #5171: URL: https://github.com/apache/arrow-datafusion/pull/5171#issuecomment-1421451352
```sql SELECT c9, SUM(c9) OVER(ORDER BY c9 ASC ROWS BETWEEN 1 PRECEDING AND 5 FOLLOWING) as sum1, SUM(c9) OVER(ORDER BY c9 DESC ROWS BETWEEN 1 PRECEDING AND 5 FOLLOWING) as sum2 FROM aggregate_test_100 LIMIT 5 ``` I think the output of this query is "implementation defined" according to the SQL spec -- it is correct to produce any arbitrary 5 rows out because there is no `ORDER BY` on the outer query. If the user wants a specific set of rows they ned to add an explicit `ORDER BY`. ```sql SELECT c9, SUM(c9) OVER(ORDER BY c9 ASC ROWS BETWEEN 1 PRECEDING AND 5 FOLLOWING) as sum1, SUM(c9) OVER(ORDER BY c9 DESC ROWS BETWEEN 1 PRECEDING AND 5 FOLLOWING) as sum2 FROM aggregate_test_100 ORDER BY c9 ASC LIMIT 5 ``` This is a pretty bad UX choice on the part of SQL in my opinion, and causes all sorts of confusion (e.g. when test output changes from run to run!) but that is what the SQL spec says. > I don't think this is a bug. I agree. I think it is acceptable for the optimizer to rewrite the plan so the output happens to be different -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org