ahmedriza opened a new issue, #5310:
URL: https://github.com/apache/arrow-datafusion/issues/5310

   **Describe the bug**
   Given a nested list, indexing works correctly only when the index is between 
1 and the size of the list (indexing is 1-based).  If the index is 0 or larger 
than the size of the list, the query fails with an error similar to the following:
   ```
   Error: Arrow error: Invalid argument error: column types must match schema types, expected Float64 but found List(Field { name: "item", data_type: Float64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }) at column index 1
   ```
   
   **To Reproduce**
   Use the attached Parquet file, 
[list.parquet.gz](https://github.com/apache/arrow-datafusion/files/10759801/list.parquet.gz).
   
   This Parquet file contains a single row of data as follows:
   ```
   +------------------+---+
   |a                 |id |
   +------------------+---+
   |[1.71, 2.71, 3.71]|1  |
   +------------------+---+
   ```
   
   Example code that demonstrates the bug (after uncompressing the file):
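   ```rust
   use datafusion::error::Result;
   use datafusion::prelude::*;
   
   #[tokio::main]
   async fn main() -> Result<()> {
       let ctx = SessionContext::new();
       ctx.register_parquet("t", "list.parquet", ParquetReadOptions::default()).await?;
       // Index 0 is out of range (list indexing is 1-based) and triggers the error.
       let df = ctx.sql("select id, a[0] from t").await?;
       df.show().await?;
       Ok(())
   }
   ```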
   
   **Expected behavior**
   We expect to get a null value when the index is out of range. For example, 
the above code should produce the following output:
   ```
   +----+--------+
   | id | t.a[0] |
   +----+--------+
   | 1  |        |
   +----+--------+
   ```
   
   
   **Additional context**
   
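   Valid indices should return the corresponding element, and invalid indices 
should return nulls.  Example:
   ```rust
   use datafusion::error::Result;
   use datafusion::prelude::*;
   
   #[tokio::main]
   async fn main() -> Result<()> {
       let ctx = SessionContext::new();
       ctx.register_parquet("t", "list.parquet", ParquetReadOptions::default()).await?;
       // Indices 1..=3 are valid for this row; 0 and 100 are out of range.
       let df = ctx.sql("select id, a[0], a[1], a[2], a[3], a[100] from t").await?;
       df.show().await?;
       Ok(())
   }
   ```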
   This should produce the following output:
   ```
   +----+--------+--------+--------+--------+----------+
   | id | t.a[0] | t.a[1] | t.a[2] | t.a[3] | t.a[100] |
   +----+--------+--------+--------+--------+----------+
   | 1  |        | 1.71   | 2.71   | 3.71   |          |
   +----+--------+--------+--------+--------+----------+
   ```
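
The requested semantics can be sketched in plain Rust, independent of DataFusion. The `element_at` function below is a hypothetical helper written only for illustration; it is not a DataFusion API and says nothing about how the fix would be implemented internally. It assumes 1-based indexing, as in the expected output above, and additionally assumes that negative indices also yield null, which the issue does not address.

```rust
use std::convert::TryFrom;

/// Hypothetical helper (not part of DataFusion) modelling the behaviour
/// requested in this issue: indices are 1-based, and index 0 or any index
/// past the end of the list yields a null (`None`) instead of an error.
fn element_at(list: &[f64], index: i64) -> Option<f64> {
    // Convert the 1-based SQL index to a 0-based offset without overflowing.
    let offset = index.checked_sub(1)?;
    // Index 0 becomes offset -1, which fails the conversion and yields `None`.
    // Treating other negative indices the same way is an assumption.
    let offset = usize::try_from(offset).ok()?;
    // `get` returns `None` when the offset is past the end of the slice.
    list.get(offset).copied()
}
```

For the row `[1.71, 2.71, 3.71]`, this sketch returns `None` for indices 0 and 100 and the stored values for indices 1 through 3, which matches the table above.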
   

