alamb commented on code in PR #16086:
URL: https://github.com/apache/datafusion/pull/16086#discussion_r2098034125
##########
datafusion/datasource-parquet/src/opener.rs:
##########
@@ -178,7 +182,7 @@ impl FileOpener for ParquetOpener {
// Build predicates for this specific file
let (pruning_predicate, page_pruning_predicate) =
build_pruning_predicates(
predicate.as_ref(),
- &physical_file_schema,
+ &logical_file_schema,
Review Comment:
This is the actual change (from the physical to the logical schema). I am calling
this out because it took me a while to spot it (at first I thought this was
only a name change).
##########
datafusion/datasource-parquet/src/opener.rs:
##########
@@ -55,8 +55,9 @@ pub(super) struct ParquetOpener {
pub limit: Option<usize>,
/// Optional predicate to apply during the scan
pub predicate: Option<Arc<dyn PhysicalExpr>>,
- /// Schema of the output table
- pub table_schema: SchemaRef,
+ /// Schema of the output table without partition columns.
Review Comment:
I verified that when the code changes are reverted, these tests fail:
```shell
cargo test --all-features -p datafusion -- parquet
...
---- datasource::physical_plan::parquet::tests::evolved_schema_column_type_filter_ints stdout ----
thread 'datasource::physical_plan::parquet::tests::evolved_schema_column_type_filter_ints' panicked at datafusion/core/src/datasource/physical_plan/parquet.rs:927:9:
assertion `left == right` failed
  left: 1
 right: 0
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

---- datasource::physical_plan::parquet::tests::evolved_schema_column_type_filter_strings stdout ----
thread 'datasource::physical_plan::parquet::tests::evolved_schema_column_type_filter_strings' panicked at datafusion/core/src/datasource/physical_plan/parquet.rs:885:9:
assertion `left == right` failed
  left: 1
 right: 0

failures:
    datasource::physical_plan::parquet::tests::evolved_schema_column_type_filter_ints
    datasource::physical_plan::parquet::tests::evolved_schema_column_type_filter_strings
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]