korowa commented on code in PR #7385: URL: https://github.com/apache/arrow-datafusion/pull/7385#discussion_r1304469204
##########
datafusion/sqllogictest/test_files/aggregate.slt:
##########
@@ -1271,14 +1271,31 @@
 NULL 4 29 1.260869565217 123 -117 23
 NULL 5 -194 -13.857142857143 118 -101 14
 NULL NULL 781 7.81 125 -117 100
-# TODO this querys output is non determinisitic (the order of the elements
-# differs run to run
+# TODO: array_agg_distinct output is non-determinisitic -- rewrite with array_sort(list_sort)
+# unnest is also not available, so manually unnesting via CROSS JOIN
+# additional count(1) forces array_agg_distinct instead of array_agg over aggregated by c2 data
 #
 # csv_query_array_agg_distinct
-# query T
-# SELECT array_agg(distinct c2) FROM aggregate_test_100
-# ----
-# [4, 2, 3, 5, 1]
+query III
+WITH indices AS (

Review Comment:
   The actual aggregation input is `aggregate_test_100.c2`, which is read in the [subquery](https://github.com/apache/arrow-datafusion/pull/7385/files#diff-a60d69cf923e18b0532e7ba3470c4b728e6b00230b3ea242f3fbe95e899c5ffeR1289). The indices are needed only to check the query output: I wasn't able to find a better way to compare a non-deterministic array. More details are in the [thread](https://github.com/apache/arrow-datafusion/pull/7385#discussion_r1303147504) above.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
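
The index-based check described in the review comment can be sketched outside SQL. This is only an illustration in Python, not the query from the PR: `unnest_by_index` is a hypothetical helper that mimics a CROSS JOIN between the aggregated array and a table of 1-based indices, and `run_b` is an invented ordering. Only `[4, 2, 3, 5, 1]` comes from the old expected output in the diff.

```python
from itertools import product


def unnest_by_index(arrays, indices):
    """Emulate CROSS JOIN of array rows with an index table.

    Produces one element per (array, index) pair, using 1-based indices
    and skipping indices that fall outside the array.
    """
    return [arr[i - 1] for arr, i in product(arrays, indices) if 1 <= i <= len(arr)]


# Two possible results of array_agg(DISTINCT c2); element order may differ run to run.
run_a = [[4, 2, 3, 5, 1]]  # ordering taken from the old expected output
run_b = [[1, 5, 2, 4, 3]]  # hypothetical ordering from another run

indices = [1, 2, 3, 4, 5]  # plays the role of the `indices` CTE

# Once the elements are unnested into rows, sorting makes the comparison deterministic.
flat_a = sorted(unnest_by_index(run_a, indices))
flat_b = sorted(unnest_by_index(run_b, indices))
```

Both runs reduce to the same sorted rows, so the check no longer depends on the order in which the distinct aggregate emits its elements.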
