brancz opened a new issue, #19782:
URL: https://github.com/apache/datafusion/issues/19782

   ### Is your feature request related to a problem or challenge?
   
   It's currently not possible to aggregate (`GROUP BY`) on `ListView` arrays.
   
   <details>
   
   <summary>SQL Logic Test that uses `ListView` in a group by</summary>
   
   ```rust
   # Licensed to the Apache Software Foundation (ASF) under one
   # or more contributor license agreements.  See the NOTICE file
   # distributed with this work for additional information
   # regarding copyright ownership.  The ASF licenses this file
   # to you under the Apache License, Version 2.0 (the
   # "License"); you may not use this file except in compliance
   # with the License.  You may obtain a copy of the License at
   
   #   http://www.apache.org/licenses/LICENSE-2.0
   
   # Unless required by applicable law or agreed to in writing,
   # software distributed under the License is distributed on an
   # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
   # KIND, either express or implied.  See the License for the
   # specific language governing permissions and limitations
   # under the License.
   
   #############
   ## ListView Aggregation Tests
   #############
   
   ### Setup: Create test tables with ListView arrays
   
   statement ok
   CREATE TABLE list_view_agg_test AS
   SELECT
       id,
       group_col,
       arrow_cast(make_array(val1, val2, val3), 'ListView(Int64)') as list_view_col
   FROM (VALUES
       (1, 'A', 10, 20, 30),
       (2, 'A', 40, 50, 60),
       (3, 'B', 70, 80, 90),
       (4, 'B', 100, 110, 120),
       (5, 'C', 1, 2, 3)
   ) AS t(id, group_col, val1, val2, val3);
   
   ### Test: GROUP BY on ListView column
   
   query ?I rowsort
   SELECT list_view_col, COUNT(*) FROM list_view_agg_test GROUP BY list_view_col;
   ----
   [1, 2, 3] 1
   [10, 20, 30] 1
   [100, 110, 120] 1
   [40, 50, 60] 1
   [70, 80, 90] 1
   
   ### Cleanup
   
   statement ok
   DROP TABLE list_view_agg_test;
   ```
   
   </details>
   
   The current error it returns is:
   ```
   1. query failed: DataFusion error: Arrow error: Not yet implemented: Row format support not yet implemented for: [SortField { options: SortOptions { descending: false, nulls_first: true }, data_type: ListView(Field { data_type: Int64, nullable: true }) }]
   [SQL] SELECT list_view_col, COUNT(*) FROM list_view_agg_test GROUP BY list_view_col;
   at /Users/brancz/src/github.com/apache/datafusion/datafusion/sqllogictest/test_files/list_view_aggregation.slt:40
   ```
   
   This suggests that the first step is to add `ListView` row format support in `arrow-row`, but I expect that, at the very least, we'll also need support in DataFusion's hash utils.
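
To make the hashing side concrete, here is a minimal std-only sketch of grouping `ListView`-style rows by their logical contents, reading each row through its offset and size without copying values. `ListViewColumn`, `hash_row`, and `group_counts` are hypothetical names for illustration, not arrow-rs or DataFusion APIs, and the sketch ignores nulls and nested types.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for a ListView<Int64> column (NOT the arrow-rs type).
/// Row `i` is `values[offsets[i]..offsets[i] + sizes[i]]`; views may overlap
/// and may appear in any order.
struct ListViewColumn {
    values: Vec<i64>,
    offsets: Vec<usize>,
    sizes: Vec<usize>,
}

impl ListViewColumn {
    fn len(&self) -> usize {
        self.offsets.len()
    }

    /// Borrow the elements of row `i` without copying them.
    fn row(&self, i: usize) -> &[i64] {
        let start = self.offsets[i];
        &self.values[start..start + self.sizes[i]]
    }
}

/// Hash a row by its logical contents, so that equal lists hash equally
/// regardless of where they live in the values buffer.
fn hash_row(col: &ListViewColumn, i: usize) -> u64 {
    let mut hasher = DefaultHasher::new();
    col.row(i).hash(&mut hasher);
    hasher.finish()
}

/// COUNT(*) ... GROUP BY on the column.
/// Returns `(first row index of the group, row count)` in first-seen order.
fn group_counts(col: &ListViewColumn) -> Vec<(usize, usize)> {
    // hash -> indices into `groups` that share that hash
    let mut buckets: HashMap<u64, Vec<usize>> = HashMap::new();
    let mut groups: Vec<(usize, usize)> = Vec::new();
    for i in 0..col.len() {
        let candidates = buckets.entry(hash_row(col, i)).or_default();
        // Resolve hash collisions by comparing the actual slices.
        let existing = candidates
            .iter()
            .copied()
            .find(|&g| col.row(groups[g].0) == col.row(i));
        match existing {
            Some(g) => groups[g].1 += 1,
            None => {
                candidates.push(groups.len());
                groups.push((i, 1));
            }
        }
    }
    groups
}
```

For example, with `values = [10, 20, 30, 1, 2, 3]`, `offsets = [0, 3, 0]`, and `sizes = [3, 3, 3]`, rows 0 and 2 view the same elements and end up in one group with a count of 2.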
   
   ### Describe the solution you'd like
   
   Support `GROUP BY` on `ListView` columns.
   
   ### Describe alternatives you've considered
   
   Casting from `ListView` to `List`, but that defeats the purpose of making as few copies as possible.
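
To illustrate where the copies come from, here is a rough std-only sketch of what a `ListView` to `List` conversion has to do in the general case. A `List` needs monotonically increasing offsets over contiguous child values, whereas `ListView` rows can be out of order or overlapping, so the child values get rewritten. `ListColumn` and `list_view_to_list` are hypothetical names, not the arrow-rs cast kernel, and nulls are ignored.

```rust
/// Hypothetical stand-in for a List<Int64> column (NOT the arrow-rs type).
/// Row `i` is `values[offsets[i]..offsets[i + 1]]`, so `offsets` has one more
/// entry than there are rows and must be monotonically increasing.
struct ListColumn {
    values: Vec<i64>,
    offsets: Vec<usize>,
}

/// Materialize ListView-style (offset, size) rows into the List layout.
/// Every element of every row is copied into a fresh contiguous buffer.
fn list_view_to_list(values: &[i64], offsets: &[usize], sizes: &[usize]) -> ListColumn {
    let mut out_values = Vec::with_capacity(sizes.iter().sum());
    let mut out_offsets = Vec::with_capacity(offsets.len() + 1);
    out_offsets.push(0);
    for (&start, &size) in offsets.iter().zip(sizes) {
        // This copy is the cost that grouping directly on ListView would avoid.
        out_values.extend_from_slice(&values[start..start + size]);
        out_offsets.push(out_values.len());
    }
    ListColumn {
        values: out_values,
        offsets: out_offsets,
    }
}
```

Note that overlapping views make the output larger than the input: three rows viewing a six-element buffer can produce eight copied elements.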
   
   ### Additional context
   
   You currently need to run the SLT on top of https://github.com/apache/datafusion/pull/19355 since arrow-rs 57.2 implemented a number of features for ListView.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

