alamb commented on code in PR #6734:
URL: https://github.com/apache/arrow-datafusion/pull/6734#discussion_r1238698423
##########
datafusion/core/tests/sqllogictests/test_files/groupby.slt:
##########
@@ -2076,18 +2076,18 @@ Projection: annotated_data_infinite2.a,
annotated_data_infinite2.b, FIRST_VALUE(
----TableScan: annotated_data_infinite2 projection=[a, b, c]
physical_plan
ProjectionExec: expr=[a@0 as a, b@1 as b,
FIRST_VALUE(annotated_data_infinite2.c) ORDER BY [annotated_data_infinite2.a
DESC NULLS FIRST]@2 as first_c]
---AggregateExec: mode=Single, gby=[a@0 as a, b@1 as b],
aggr=[FIRST_VALUE(annotated_data_infinite2.c)], ordering_mode=FullyOrdered
+--AggregateExec: mode=Single, gby=[a@0 as a, b@1 as b],
aggr=[LAST_VALUE(annotated_data_infinite2.c)], ordering_mode=FullyOrdered
----CsvExec: file_groups={1 group:
[[WORKSPACE_ROOT/datafusion/core/tests/data/window_2.csv]]}, projection=[a, b,
c], infinite_source=true, output_ordering=[a@0 ASC NULLS LAST, b@1 ASC NULLS
LAST, c@2 ASC NULLS LAST], has_header=true
query III
SELECT a, b, FIRST_VALUE(c ORDER BY a DESC) as first_c
FROM annotated_data_infinite2
GROUP BY a, b
----
-0 0 0
-0 1 25
-1 2 50
-1 3 75
+0 0 24
Review Comment:
I was just looking at the output (not the plan) -- I see now that the `ORDER
BY` is on `a` but the value is `c`.
Since the query groups by `a, b`, every row in a group that `FIRST_VALUE` is
evaluated on has the same value of `a`, so the `ORDER BY a` does not
distinguish between them and `FIRST_VALUE(c)` is effectively arbitrary.
When I printed out the values in `annotated_data_infinite2`, it became clearer
to me that the output of this query is "undefined" in the sense that any of
the values of `c` within a group is acceptable (I wonder if this test will
therefore be unstable 🤔). Maybe we can somehow make the query more
representative in the future:
```
query III
select a, b, c from annotated_data_infinite2 order by a, b, c;
----
0 0 0
0 0 1
0 0 2
0 0 3
0 0 4
0 0 5
0 0 6
0 0 7
0 0 8
0 0 9
0 0 10
0 0 11
0 0 12
0 0 13
0 0 14
0 0 15
0 0 16
0 0 17
0 0 18
0 0 19
0 0 20
0 0 21
0 0 22
0 0 23
0 0 24
0 1 25
0 1 26
0 1 27
0 1 28
0 1 29
0 1 30
0 1 31
0 1 32
0 1 33
0 1 34
0 1 35
0 1 36
0 1 37
0 1 38
0 1 39
0 1 40
0 1 41
0 1 42
0 1 43
0 1 44
0 1 45
0 1 46
0 1 47
0 1 48
0 1 49
1 2 50
1 2 51
1 2 52
1 2 53
1 2 54
1 2 55
1 2 56
1 2 57
1 2 58
1 2 59
1 2 60
1 2 61
1 2 62
1 2 63
1 2 64
1 2 65
1 2 66
1 2 67
1 2 68
1 2 69
1 2 70
1 2 71
1 2 72
1 2 73
1 2 74
1 3 75
1 3 76
1 3 77
1 3 78
1 3 79
1 3 80
1 3 81
1 3 82
1 3 83
1 3 84
1 3 85
1 3 86
1 3 87
1 3 88
1 3 89
1 3 90
1 3 91
1 3 92
1 3 93
1 3 94
1 3 95
1 3 96
1 3 97
1 3 98
1 3 99
```
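To make the ambiguity concrete, here is a small Python sketch (not DataFusion code). It rebuilds rows matching the output printed above, assuming `a = c // 50` and `b = c // 25`, a pattern inferred from those rows rather than read from `window_2.csv`. The helper name `candidates_for_first_value` is hypothetical; it lists every `c` that a `FIRST_VALUE(c ORDER BY <key> DESC)` could legitimately return for each `(a, b)` group.

```python
from collections import defaultdict

# Rebuild the 100 rows shown above: c runs 0..99.
# Assumption: a = c // 50 and b = c // 25, inferred from the printed output.
rows = [(c // 50, c // 25, c) for c in range(100)]


def candidates_for_first_value(rows, order_key):
    """For each (a, b) group, collect every c whose row ties for first place
    when the group is sorted by order_key in descending order."""
    groups = defaultdict(list)
    for a, b, c in rows:
        groups[(a, b)].append((a, b, c))

    result = {}
    for key, members in groups.items():
        best = max(order_key(r) for r in members)
        # Every row that ties on the sort key is an equally valid "first" row.
        result[key] = sorted(r[2] for r in members if order_key(r) == best)
    return result


# ORDER BY a DESC: all rows in a group share the same a, so they all tie.
by_a = candidates_for_first_value(rows, order_key=lambda r: r[0])
# ORDER BY c DESC: c is unique within a group, so there is one winner.
by_c = candidates_for_first_value(rows, order_key=lambda r: r[2])

for key in sorted(by_a):
    print(key, "ORDER BY a DESC:", len(by_a[key]), "valid answers;",
          "ORDER BY c DESC:", by_c[key])
```

Under these assumptions, both the old expected value (0) and the new one (24) for group `(0, 0)` are among the valid answers when ordering by `a`, whereas ordering by `c` leaves a single valid answer per group. Ordering by a column that is not a group key might therefore be one way to make the test more representative.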