ahmedriza commented on issue #3617:
URL:
https://github.com/apache/arrow-datafusion/issues/3617#issuecomment-1431227784
If we take the Parquet file provided by @kesavkolla, it contains the following
column, `text`, whose Parquet schema is:
```
|-- text: struct (nullable = true)
| |-- id: string (nullable = true)
| |-- extension: array (nullable = true)
| | |-- element: string (containsNull = true)
| |-- status: string (nullable = true)
| |-- div: string (nullable = true)
```
and sample data:
```
+-----------------------------------------------------------+
| text                                                      |
+-----------------------------------------------------------+
| {"id": null, "extension": null, "status": "generated", ...
```
I tried the following SQL to select one of the fields in the `struct`:
```
ctx.register_parquet("t", "t.parquet", ParquetReadOptions::default()).await?;
ctx.sql("select t.text['id'] from t").await?;
```
However, this resulted in the following error:
```
Error: Arrow error: External error: Execution error: Job zlH3pzz failed:
Error planning job zlH3pzz: DataFusionError(Internal("physical_plan::to_proto()
unsupported expression GetIndexedFieldExpr { arg: Column { name: \"text\",
index: 0 }, key: Utf8(\"id\") }"))
```
Looking at `datafusion/proto/src/physical_plan/to_proto.rs`, it appears that
serializing `GetIndexedFieldExpr` is not supported at present. Or perhaps I
have made a mistake in my SQL?
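
To illustrate the kind of failure I suspect, here is a minimal, self-contained sketch. It is not the actual DataFusion code: `PhysicalExpr`, `ProtoExpr`, `InternalError` and this `to_proto` are hypothetical, simplified stand-ins. The assumption is that the real conversion only handles a fixed set of expression types and returns an internal error for anything else, which would explain why the query plans but then fails when the plan is serialized.

```rust
/// Hypothetical, heavily simplified stand-in for a physical expression tree.
#[derive(Debug, Clone, PartialEq)]
enum PhysicalExpr {
    Column { name: String, index: usize },
    Literal(String),
    // Models `text['id']`: look up field `key` inside the struct produced by `arg`.
    GetIndexedField { arg: Box<PhysicalExpr>, key: String },
}

/// Hypothetical stand-in for the serialized (protobuf) form.
/// Note that it has no variant for indexed field access.
#[derive(Debug, Clone, PartialEq)]
enum ProtoExpr {
    Column { name: String, index: u32 },
    Literal(String),
}

/// Hypothetical stand-in for an internal planning error.
#[derive(Debug, PartialEq)]
struct InternalError(String);

/// Converts an expression to its serialized form.
/// Any expression type without an explicit arm falls through to the error case.
fn to_proto(expr: &PhysicalExpr) -> Result<ProtoExpr, InternalError> {
    match expr {
        PhysicalExpr::Column { name, index } => Ok(ProtoExpr::Column {
            name: name.clone(),
            index: *index as u32,
        }),
        PhysicalExpr::Literal(value) => Ok(ProtoExpr::Literal(value.clone())),
        // Catch-all arm: the expression is valid, but there is no way to serialize it.
        other => Err(InternalError(format!(
            "physical_plan::to_proto() unsupported expression {:?}",
            other
        ))),
    }
}
```

In this sketch, converting a plain `Column` succeeds, while converting a `GetIndexedField` wrapping that column returns an error. If the real code behaves similarly, the SQL itself would be fine and the fix would be to add serialization support for the missing expression type.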