ming535 commented on issue #2723:
URL: 
https://github.com/apache/arrow-datafusion/issues/2723#issuecomment-1168142381

   @yjshen Hi, do you know why the `state_fields` of the `Distinct*` 
aggregates, for example `DistinctCount`, use `DataType::List`? I have run a 
few examples, and all of them actually use a `List` of length `1`.
   
   My understanding is that `DistinctCount` is used in queries like `SELECT 
COUNT(DISTINCT c1) from foo;`.
   
   The logical plan is `Aggregate: groupBy=[[]], aggr=[[COUNT(DISTINCT 
#foo.c1)]]`. The `aggr` in the plan is translated into `DistinctCount` 
(when `SingleDistinctToGroupBy` is disabled). Since `COUNT` is not valid when 
there are multiple columns, I don't understand why `DistinctCount`'s 
`state_fields` is a vector of `DataType::List` rather than just a vector of 
`DataType`.
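
   To make the question concrete, here is a minimal sketch of a distinct-count 
accumulator whose intermediate state is one list of values per input column. 
It is a toy model using only the standard library, not DataFusion's code: 
`ToyDistinctCount` and its methods are hypothetical names, and `Vec<Vec<i64>>` 
merely stands in for a vector of `DataType::List` state fields. It assumes the 
state has to carry every distinct value seen so far, so that partial results 
from different partitions can be merged without double counting.

```rust
use std::collections::HashSet;

/// Toy stand-in for a distinct-count accumulator (hypothetical type,
/// not part of DataFusion).
#[derive(Default)]
struct ToyDistinctCount {
    seen: HashSet<i64>,
}

impl ToyDistinctCount {
    /// Record the values of one input batch.
    fn update(&mut self, batch: &[i64]) {
        self.seen.extend(batch.iter().copied());
    }

    /// Intermediate state: the outer Vec has one entry per input column
    /// (so its length is 1 here); the inner Vec lists the distinct values
    /// seen so far, sorted only to make the output deterministic.
    fn state(&self) -> Vec<Vec<i64>> {
        let mut values: Vec<i64> = self.seen.iter().copied().collect();
        values.sort_unstable();
        vec![values]
    }

    /// Merge the intermediate state produced by another partition.
    fn merge(&mut self, state: &[Vec<i64>]) {
        for column in state {
            self.seen.extend(column.iter().copied());
        }
    }

    /// Final result: the number of distinct values across all partitions.
    fn evaluate(&self) -> usize {
        self.seen.len()
    }
}
```

   In this sketch, a plain count per partition could not be merged correctly, 
because a value that appears in two partitions would be counted twice; passing 
the values themselves avoids that. Whether this is the actual reason behind 
the `List` state in `DistinctCount` is exactly what I am asking.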


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
