alamb commented on PR #5171:
URL:
https://github.com/apache/arrow-datafusion/pull/5171#issuecomment-1421451352
```sql
SELECT
c9,
SUM(c9) OVER(ORDER BY c9 ASC ROWS BETWEEN 1 PRECEDING AND 5 FOLLOWING)
as sum1,
SUM(c9) OVER(ORDER BY c9 DESC ROWS BETWEEN 1 PRECEDING AND 5 FOLLOWING)
as sum2
FROM aggregate_test_100
LIMIT 5
```
I think the output of this query is "implementation defined" according to
the SQL spec -- it is correct to produce any arbitrary 5 rows out because there
is no `ORDER BY` on the outer query. If the user wants a specific set of rows
they ned to add an explicit `ORDER BY`.
```sql
SELECT
c9,
SUM(c9) OVER(ORDER BY c9 ASC ROWS BETWEEN 1 PRECEDING AND 5 FOLLOWING)
as sum1,
SUM(c9) OVER(ORDER BY c9 DESC ROWS BETWEEN 1 PRECEDING AND 5 FOLLOWING)
as sum2
FROM aggregate_test_100
ORDER BY c9 ASC
LIMIT 5
```
This is a pretty bad UX choice on the part of SQL in my opinion, and causes
all sorts of confusion (e.g. when test output changes from run to run!) but
that is what the SQL spec says.
> I don't think this is a bug.
I agree. I think it is acceptable for the optimizer to rewrite the plan so
the output happens to be different
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]