alamb opened a new issue, #13263:
URL: https://github.com/apache/datafusion/issues/13263

   ### Is your feature request related to a problem or challenge?
   
   In https://github.com/apache/datafusion/pull/12269 @jayzhan211 made 
significant improvements to how group values are stored in multi-column 
aggregations. 
   
   Specifically for queries like
   
   ```sql
   SELECT ... FROM ... GROUP BY col1, ... colN
   ```
   
   The improvement relies on implementing specialized versions of 
[`GroupColumn`](https://github.com/apache/datafusion/blob/a6586ccfffb569461e357817529ecf6647c6d62e/datafusion/physical-plan/src/aggregates/group_values/group_column.rs#L53)
 for the types of `col1` through `colN`.
   
   We have implemented some of the primitive types and Strings/StringViews now, 
but we have not implemented all types.
   
   This means queries like 
   
   
   ```sql
   SELECT ... FROM ... GROUP BY int_col, date_col
   ```
   
   will fall back to the slower (but general) `GroupValuesRows`: 
https://github.com/apache/datafusion/blob/a6586ccfffb569461e357817529ecf6647c6d62e/datafusion/physical-plan/src/aggregates/group_values/row.rs#L40-L41
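
To make the fallback concrete, here is a minimal, self-contained sketch of the kind of check involved. It is not DataFusion's actual code: `ColumnType`, `GroupValuesKind`, `has_specialized_group_column`, and `choose_group_values` are hypothetical names, and the set of types marked as supported is only an assumption based on the description above.

```rust
/// Hypothetical stand-in for a column's data type (not arrow's `DataType`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ColumnType {
    Int32,
    Int64,
    Utf8,
    Utf8View,
    Date32,
}

/// Hypothetical label for which multi-column group storage gets picked.
#[derive(Debug, PartialEq, Eq)]
enum GroupValuesKind {
    /// One specialized column store per GROUP BY column (the fast path).
    ColumnWise,
    /// General row-based storage (the slower fallback).
    Rows,
}

/// Assumption for illustration only: which types have a specialized store.
/// `Date32` is deliberately left out to mirror the `date_col` example.
fn has_specialized_group_column(t: ColumnType) -> bool {
    matches!(
        t,
        ColumnType::Int32 | ColumnType::Int64 | ColumnType::Utf8 | ColumnType::Utf8View
    )
}

/// The fast path is only usable when every GROUP BY column is supported;
/// a single unsupported column forces the fallback for the whole query.
fn choose_group_values(group_by: &[ColumnType]) -> GroupValuesKind {
    if group_by.iter().copied().all(has_specialized_group_column) {
        GroupValuesKind::ColumnWise
    } else {
        GroupValuesKind::Rows
    }
}
```

Under these assumptions, `choose_group_values(&[ColumnType::Int32, ColumnType::Date32])` picks the row-based fallback, which is why adding support for the remaining types matters even when only one grouping column is affected.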
   
   ### Describe the solution you'd like
   
   Implement `GroupColumn` for all primitive types.
   
   You can see how to do this here:
   
   
https://github.com/apache/datafusion/blob/e4bd57918b1725f37f3f500c503c07b1c1bf90bf/datafusion/physical-plan/src/aggregates/group_values/mod.rs#L117-L121
   
   and then make sure there are tests for each of those types in queries that 
group on multiple columns.
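
For readers new to this code, the following is a rough, self-contained sketch of what a per-column store for a primitive type does and how a multi-column test might exercise it. It does not use the real `GroupColumn` trait: `SimpleGroupColumn`, `PrimitiveGroupColumn`, and `group_two_columns` are hypothetical, inputs are plain slices rather than Arrow arrays, and groups are found with a linear scan where a real implementation would use a hash table.

```rust
/// Hypothetical, simplified stand-in for a per-column group store.
trait SimpleGroupColumn {
    type Value;
    /// Does the stored value for `group_idx` equal `input[row]`?
    fn equal_to(&self, group_idx: usize, input: &[Option<Self::Value>], row: usize) -> bool;
    /// Store `input[row]` as the value of a new group.
    fn append_val(&mut self, input: &[Option<Self::Value>], row: usize);
    /// Number of groups stored so far.
    fn len(&self) -> usize;
}

/// One generic implementation can serve every primitive value type.
struct PrimitiveGroupColumn<T> {
    values: Vec<Option<T>>,
}

impl<T> PrimitiveGroupColumn<T> {
    fn new() -> Self {
        Self { values: Vec::new() }
    }
}

impl<T: Copy + PartialEq> SimpleGroupColumn for PrimitiveGroupColumn<T> {
    type Value = T;

    fn equal_to(&self, group_idx: usize, input: &[Option<T>], row: usize) -> bool {
        // `None == None` is true, so NULLs land in the same group.
        self.values[group_idx] == input[row]
    }

    fn append_val(&mut self, input: &[Option<T>], row: usize) {
        self.values.push(input[row]);
    }

    fn len(&self) -> usize {
        self.values.len()
    }
}

/// Assigns a group index to each row of an (int, date) pair of columns.
/// Dates are modeled as `i32` days since the epoch, like Arrow's Date32.
fn group_two_columns(ints: &[Option<i64>], dates: &[Option<i32>]) -> Vec<usize> {
    assert_eq!(ints.len(), dates.len(), "columns must have equal length");
    let mut int_col = PrimitiveGroupColumn::<i64>::new();
    let mut date_col = PrimitiveGroupColumn::<i32>::new();
    let mut groups = Vec::with_capacity(ints.len());
    for row in 0..ints.len() {
        // Linear scan for brevity; a row matches a group only if every column matches.
        let existing = (0..int_col.len())
            .find(|&g| int_col.equal_to(g, ints, row) && date_col.equal_to(g, dates, row));
        let group = match existing {
            Some(g) => g,
            None => {
                int_col.append_val(ints, row);
                date_col.append_val(dates, row);
                int_col.len() - 1
            }
        };
        groups.push(group);
    }
    groups
}
```

A test for a newly supported type would then group on that type together with at least one other column, include NULLs and repeated values, and check that rows are assigned to the expected groups.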
   
   ### Describe alternatives you've considered
   
   _No response_
   
   ### Additional context
   
   Here is an example of how this was done for Strings: 
https://github.com/apache/datafusion/pull/12809


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org