jayzhan211 commented on PR #10713:
URL: https://github.com/apache/datafusion/pull/10713#issuecomment-2137673821

   > @jayzhan211 I'm working on a change, but can you help me understand the semantics here:
   > 
   > ```
   > # csv_query_distinct_variance
   > query R
   > SELECT var(distinct c2) FROM aggregate_test_100
   > ----
   > 2.5
   > 
   > statement error DataFusion error: This feature is not implemented: VAR\(DISTINCT\) aggregations are not available
   > SELECT var(c2), var(distinct c2) FROM aggregate_test_100
   > ```
   > 
   > Why should the first query succeed but not the second one? Feel free to point me to any SQL / datafusion doc.
   
   I think it is because of the optimizer rule `SingleDistinctToGroupBy`. This
rule converts the distinct aggregate into a group by, so the first query is no
longer `distinct` by the time it runs. You can try adding `explain` to see the
optimized logical plan.
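
To illustrate why the rewritten form gives the same answer, here is a minimal Python sketch of the idea, not DataFusion code. The column values and both helper names are hypothetical; the real contents of `aggregate_test_100` are not shown in this thread.

```python
from itertools import groupby
from statistics import variance

# Hypothetical stand-in for column c2 (assumed values, not the real table).
c2 = [1, 2, 2, 3, 3, 3, 4, 5, 5, 1]


def var_distinct(values):
    """Direct form: sample variance over the distinct values."""
    return variance(sorted(set(values)))


def var_after_group_by(values):
    """Rewritten form: group by the column first, then take a plain
    (non-distinct) sample variance over the group keys."""
    group_keys = [key for key, _ in groupby(sorted(values))]
    return variance(group_keys)


print(var_distinct(c2), var_after_group_by(c2))
```

Both forms agree because grouping by the column already leaves one row per distinct value, so the aggregate on top no longer needs to be `distinct`.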


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

