[GitHub] [arrow-datafusion] ahmedriza commented on issue #3617: Feature request for support for struct and arry data types

via GitHub Wed, 15 Feb 2023 03:34:36 -0800


ahmedriza commented on issue #3617:
URL: https://github.com/apache/arrow-datafusion/issues/3617#issuecomment-1431227784


   If we take the Parquet file provided by @kesavkolla, it has the following 
column, `text`, whose schema is:
   ```
    |-- text: struct (nullable = true)
    |    |-- id: string (nullable = true)
    |    |-- extension: array (nullable = true)
    |    |    |-- element: string (containsNull = true)
    |    |-- status: string (nullable = true)
    |    |-- div: string (nullable = true)
   ```
   and sample data:
   ```
   
   +------------------------------------------------------------+
   | text                                                       |
   +------------------------------------------------------------+
   |                                                            |
   | {"id": null, "extension": null, "status": "generated", ...
   ```
   I tried the following SQL to select one of the fields in the `struct`:
   ```
   ctx.register_parquet("t", "t.parquet", ParquetReadOptions::default()).await?;
   ctx.sql("select t.text['id'] from t").await?;
   ```
   
   However, this resulted in the following error:
   ```
   Error: Arrow error: External error: Execution error: Job zlH3pzz failed: 
Error planning job zlH3pzz: DataFusionError(Internal("physical_plan::to_proto() 
unsupported expression GetIndexedFieldExpr { arg: Column { name: \"text\", 
index: 0 }, key: Utf8(\"id\") }"))
   ```
   
   Looking at `datafusion/proto/src/physical_plan/to_proto.rs`, it does appear 
that `GetIndexedFieldExpr` is not supported there at present. Or perhaps I have 
made a mistake in my SQL?
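
   To illustrate the kind of failure reported above, here is a minimal, 
self-contained sketch of a serializer that only handles some expression 
variants. It is not the actual DataFusion code: the `Expr`, `ProtoExpr`, 
`InternalError` types and this `to_proto` function are hypothetical stand-ins, 
and only the error message prefix is taken from the error shown above.

```rust
// Hypothetical, heavily simplified stand-ins for physical expressions.
// These are NOT the real DataFusion types.
#[derive(Debug)]
enum Expr {
    Column { name: String, index: usize },
    GetIndexedFieldExpr { arg: Box<Expr>, key: String },
}

// Hypothetical stand-in for the serialized (protobuf) representation.
#[derive(Debug, PartialEq)]
enum ProtoExpr {
    Column { name: String, index: u32 },
}

// Hypothetical stand-in for an internal planning error.
#[derive(Debug, PartialEq)]
struct InternalError(String);

// Serialize an expression, rejecting any variant without a mapping.
fn to_proto(expr: &Expr) -> Result<ProtoExpr, InternalError> {
    match expr {
        Expr::Column { name, index } => Ok(ProtoExpr::Column {
            name: name.clone(),
            index: *index as u32,
        }),
        // Every variant without a serialized form ends up here.
        other => Err(InternalError(format!(
            "physical_plan::to_proto() unsupported expression {:?}",
            other
        ))),
    }
}

fn main() {
    // Roughly the expression that `t.text['id']` plans to.
    let expr = Expr::GetIndexedFieldExpr {
        arg: Box::new(Expr::Column {
            name: "text".to_string(),
            index: 0,
        }),
        key: "id".to_string(),
    };
    match to_proto(&expr) {
        Ok(proto) => println!("serialized: {:?}", proto),
        Err(err) => println!("error: {}", err.0),
    }
}
```

   In this sketch the plain column serializes fine, while the indexed field 
access falls through to the catch-all arm. If the real code is structured 
similarly, the SQL itself would be valid and the failure would come only from 
the missing serialization case.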


