jayzhan211 commented on PR #10713: URL: https://github.com/apache/datafusion/pull/10713#issuecomment-2137673821
> @jayzhan211 I'm working on a change, but can you help me understand the semantics here:
>
> ```
> # csv_query_distinct_variance
> query R
> SELECT var(distinct c2) FROM aggregate_test_100
> ----
> 2.5
>
> statement error DataFusion error: This feature is not implemented: VAR\(DISTINCT\) aggregations are not available
> SELECT var(c2), var(distinct c2) FROM aggregate_test_100
> ```
>
> Why should the first query succeed but not the second one? Feel free to point me to any SQL / datafusion doc.

I think it is because of the optimizer rule `SingleDistinctToGroupBy`. This rule converts the distinct aggregate into a group by, so the first query is no longer `distinct`. You can try adding `explain` to see the optimized logical plan.
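
To make the rewrite concrete, here is a rough Python sketch of the idea (not DataFusion code). It assumes `var` is the sample variance and uses made-up values for `c2`, not the real `aggregate_test_100` data. The helper names `var_distinct` and `var_via_group_by` are hypothetical. The point is only that a distinct aggregate over a column gives the same answer as a plain aggregate over the output of a `GROUP BY` on that column, which is why the optimized plan no longer needs a distinct variance.

```python
from statistics import variance  # sample variance (assumed to match SQL var)

# Hypothetical stand-in for column c2; not the real aggregate_test_100 data.
c2 = [1, 2, 2, 3, 3, 3, 4, 5, 5]


def var_distinct(values):
    """Direct form: var(DISTINCT c2), deduplicating inside the aggregate."""
    return variance(sorted(set(values)))


def var_via_group_by(values):
    """Rewritten form: a plain var over the keys produced by GROUP BY c2."""
    groups = {}
    for v in values:
        # Inner step: GROUP BY c2 keeps exactly one row per distinct key.
        groups.setdefault(v, None)
    # Outer step: an ordinary (non-distinct) var over the group keys.
    return variance(list(groups))


if __name__ == "__main__":
    print(var_distinct(c2))      # distinct aggregate
    print(var_via_group_by(c2))  # group-by rewrite, same value
    print(variance(c2))          # plain var over all rows, a different value
```

Both helpers agree on this input, while the plain variance over all rows differs, so the two forms are interchangeable only for the distinct aggregate.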