Re: [PR] Convert variance sample to udaf [datafusion]

via GitHub Wed, 29 May 2024 08:18:38 -0700


jayzhan211 commented on PR #10713:
URL: https://github.com/apache/datafusion/pull/10713#issuecomment-2137673821


   > @jayzhan211 I'm working on a change, but can you help me understand the semantics here:
   > 
   > ```
   > # csv_query_distinct_variance
   > query R
   > SELECT var(distinct c2) FROM aggregate_test_100
   > ----
   > 2.5
   > 
   > statement error DataFusion error: This feature is not implemented: VAR\(DISTINCT\) aggregations are not available
   > SELECT var(c2), var(distinct c2) FROM aggregate_test_100
   > ```
   > 
   > Why should the first query succeed but not the second one? Feel free to point me to any SQL / datafusion doc.
   
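   I think it is because of the optimizer rule `SingleDistinctToGroupBy`. This rule 
converts a distinct aggregate into a group by, so the first query is no longer 
`distinct` by the time it is executed. The second query mixes `var(c2)` with 
`var(distinct c2)`, so presumably the rule does not fire there and the 
`VAR(DISTINCT)` error surfaces. You can try adding `explain` to see the optimized 
logical plan.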
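
   To illustrate the idea behind the rewrite, here is a minimal sketch. It uses SQLite from the Python standard library rather than DataFusion, and the table `t`, its rows, and the `var_samp` aggregate are all made up for the example. It only shows that a single distinct aggregate gives the same answer as a plain aggregate over a grouped subquery, which is roughly what I understand `SingleDistinctToGroupBy` to do. It is not the plan DataFusion actually produces.

```python
import sqlite3
import statistics


class VarSamp:
    """Sample variance aggregate for SQLite (hypothetical stand-in for var)."""

    def __init__(self):
        self.values = []

    def step(self, value):
        # Ignore NULLs, as SQL aggregates usually do.
        if value is not None:
            self.values.append(float(value))

    def finalize(self):
        # Sample variance is undefined for fewer than two values.
        if len(self.values) < 2:
            return None
        return statistics.variance(self.values)


con = sqlite3.connect(":memory:")
con.create_aggregate("var_samp", 1, VarSamp)

# Made-up data: the values 1..5 with some duplicates.
con.execute("CREATE TABLE t (c2 INTEGER)")
con.executemany(
    "INSERT INTO t VALUES (?)",
    [(v,) for v in [1, 1, 2, 3, 3, 4, 5, 5]],
)

# Original form: a single distinct aggregate.
distinct_result = con.execute(
    "SELECT var_samp(DISTINCT c2) FROM t"
).fetchone()[0]

# Rewritten form: deduplicate with GROUP BY first, then aggregate normally.
rewritten_result = con.execute(
    "SELECT var_samp(c2) FROM (SELECT c2 FROM t GROUP BY c2)"
).fetchone()[0]

# Non-distinct aggregate over all rows, for comparison.
plain_result = con.execute("SELECT var_samp(c2) FROM t").fetchone()[0]

assert distinct_result == rewritten_result
print(distinct_result, rewritten_result, plain_result)
```

   Once a query has both `var(c2)` and `var(distinct c2)`, a single inner group by on `c2` would lose the duplicates that the non-distinct aggregate needs, so the simple rewrite no longer works as-is.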


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


