swgillespie opened a new issue, #7824:
URL: https://github.com/apache/arrow-datafusion/issues/7824

   ### Describe the bug
   
   If you load a Parquet file that has a column of type `Map`, you can't write
   a query that indexes into it (for example `ints['bytes']`), because the
   resulting `GetIndexedField` expression is rejected at planning time. This
   appears to be because `GetIndexedField` supports only structs and lists,
   not maps.
   
   ### To Reproduce
   
   ```
   DataFusion CLI v31.0.0
   ❯ create external table test stored as parquet location '../scratch';
   0 rows in set. Query took 0.014 seconds.
   
   ❯ show columns from test;
   
+---------------+--------------+------------+-------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------+
   | table_catalog | table_schema | table_name | column_name | data_type
   
                                                                                
                  | is_nullable |
   
+---------------+--------------+------------+-------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------+
   | datafusion    | public       | test       | ints        | Map(Field { 
name: "entries", data_type: Struct([Field { name: "key", data_type: Utf8, 
nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { 
name: "value", data_type: Int64, nullable: false, dict_id: 0, dict_is_ordered: 
false, metadata: {} }]), nullable: false, dict_id: 0, dict_is_ordered: false, 
metadata: {} }, false) | NO          |
   | datafusion    | public       | test       | strings     | Map(Field { 
name: "entries", data_type: Struct([Field { name: "key", data_type: Utf8, 
nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { 
name: "value", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: 
false, metadata: {} }]), nullable: false, dict_id: 0, dict_is_ordered: false, 
metadata: {} }, false)  | NO          |
   | datafusion    | public       | test       | timestamp   | Utf8
   
                                                                                
                  | NO          |
   
+---------------+--------------+------------+-------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------+
   ❯ select avg(ints['bytes']), strings['method'] from test group by strings['method'];
   Error during planning: The expression to get an indexed field is only valid for `List` or `Struct` types, got Map(Field { name: "entries", data_type: Struct([Field { name: "key", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "value", data_type: Int64, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }]), nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }, false)
   ```
   
   ### Expected behavior
   
   I would expect the above query 
   
   ```sql
   SELECT avg(ints['bytes']), strings['method']
   FROM test 
   GROUP BY strings['method'];
   ```
   to work and produce a result set with two columns.
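
To make the expected semantics concrete, here is a minimal Python sketch (not DataFusion code) of what the query would compute if map indexing were supported. The sample rows, the `map_get` helper, and the choice to return NULL (`None`) for a missing key are all assumptions made for illustration; none of them come from the actual data or from DataFusion's implementation.

```python
from collections import defaultdict

# Hypothetical sample rows. Each Map value is modeled as a list of
# (key, value) entries, mirroring the "entries" struct in the schema above.
rows = [
    {"ints": [("bytes", 100), ("status", 200)], "strings": [("method", "GET")]},
    {"ints": [("bytes", 300)], "strings": [("method", "GET")]},
    {"ints": [("bytes", 50)], "strings": [("method", "POST")]},
    {"ints": [("status", 404)], "strings": [("path", "/")]},
]


def map_get(entries, key):
    """Return the value of the first entry whose key matches, else None.

    Returning None (SQL NULL) for a missing key is an assumed semantic.
    """
    for k, v in entries:
        if k == key:
            return v
    return None


def avg_bytes_by_method(rows):
    """Emulate: SELECT avg(ints['bytes']), strings['method'] ... GROUP BY strings['method']."""
    groups = defaultdict(list)
    for row in rows:
        method = map_get(row["strings"], "method")
        value = map_get(row["ints"], "bytes")
        groups[method]  # make sure the group exists even if value is NULL
        if value is not None:  # avg() skips NULL inputs
            groups[method].append(value)
    return {m: (sum(v) / len(v) if v else None) for m, v in groups.items()}


result = avg_bytes_by_method(rows)
```

Under these assumptions, rows whose `strings` map has no `method` key fall into a NULL group, and `avg` skips rows whose `ints` map has no `bytes` key.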
   

