Cheappie opened a new issue, #2161:
URL: https://github.com/apache/arrow-datafusion/issues/2161
**Describe the bug**
I get an index-out-of-bounds panic when Parquet pruning is enabled.
file: metadata.rs:212:10
struct: RowGroupMetaData
accessed field: columns
error: thread 'tokio-runtime-worker' panicked at 'index out of bounds: the len is 1 but the index is 1'
**To Reproduce**
Create two Parquet files with different fields in their schemas. I put 4 numbers
into each file.
```
file: sample1.parquet
message schema {
  REQUIRED INT32 a;
}

file: sample2.parquet
message schema {
  REQUIRED INT32 b;
}
```
code:
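```
// Import paths are assumed for the DataFusion version in use; adjust if they differ.
use std::sync::Arc;

use datafusion::datasource::file_format::parquet::{ParquetFormat, DEFAULT_PARQUET_EXTENSION};
use datafusion::datasource::listing::ListingOptions;
use datafusion::error::Result;
use datafusion::prelude::*;

#[tokio::main]
async fn main() -> Result<()> {
    // Create a local execution context
    let mut ctx = ExecutionContext::new();

    // Configure listing options, with Parquet pruning enabled
    let file_format = ParquetFormat::default().with_enable_pruning(true);
    let listing_options = ListingOptions {
        file_extension: DEFAULT_PARQUET_EXTENSION.to_owned(),
        format: Arc::new(file_format),
        table_partition_cols: vec![],
        collect_stat: false,
        target_partitions: 1,
    };

    // Register the directory holding sample1.parquet and sample2.parquet
    ctx.register_listing_table(
        "FANCY_TABLE",
        "file:///absolute-path/table/",
        listing_options,
        None,
    ).await.unwrap();

    // The predicate references columns from both files
    let df = ctx
        .sql("SELECT * FROM FANCY_TABLE where a > 2 or b > 2")
        .await?;
    df.show().await?;
    Ok(())
}
```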
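One possible reading of the panic, sketched with the standard library only. This is an assumption, not something confirmed in this report: if pruning looked up a column's metadata by its position in the merged table schema (`[a, b]`), then `sample2.parquet`, which holds a single column, would be indexed at position 1 and fail with exactly "the len is 1 but the index is 1". `RowGroupMeta`, `column_by_table_index`, and `column_by_name` are hypothetical stand-ins for illustration, not the parquet crate's API.

```rust
/// Hypothetical stand-in for one file's row group metadata
/// (not the parquet crate's `RowGroupMetaData`).
struct RowGroupMeta {
    columns: Vec<&'static str>,
}

/// Unchecked lookup by table-schema position.
/// Panics when the file has fewer columns than the merged table schema.
fn column_by_table_index(rg: &RowGroupMeta, table_index: usize) -> &'static str {
    rg.columns[table_index]
}

/// Defensive lookup by column name.
/// Returns `None` when the file does not contain the column.
fn column_by_name(rg: &RowGroupMeta, name: &str) -> Option<usize> {
    rg.columns.iter().position(|c| *c == name)
}

fn main() {
    // Merged table schema, assumed to be [a, b] across both files.
    let table_schema = ["a", "b"];
    let sample1 = RowGroupMeta { columns: vec!["a"] };
    let sample2 = RowGroupMeta { columns: vec!["b"] };

    let idx_a = table_schema.iter().position(|c| *c == "a").unwrap();
    let idx_b = table_schema.iter().position(|c| *c == "b").unwrap();

    // Column a sits at position 0 in both the table schema and sample1, so this works.
    println!("sample1[{}] = {}", idx_a, column_by_table_index(&sample1, idx_a));

    // column_by_table_index(&sample2, idx_b) would panic: sample2 has one column,
    // but b sits at position 1 in the merged schema.
    println!("table index of b: {}", idx_b);

    // A name-based lookup resolves the file-local position instead.
    println!("sample2 index of b: {:?}", column_by_name(&sample2, "b"));
    println!("sample2 index of a: {:?}", column_by_name(&sample2, "a"));
}
```

Under that assumption, resolving columns by name per file and treating a missing column as "cannot prune" would avoid the out-of-bounds access.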
**Expected behavior**
The query executes without any issues.
When pruning is disabled, the query runs fine and returns a result.