JiJiTang opened a new pull request #28319:
URL: https://github.com/apache/spark/pull/28319
[SPARK-31364][SQL] Benchmark Parquet Nested Field Predicate Pushdown
### What changes were proposed in this pull request?
Adds a benchmark suite for nested predicate pushdown on Parquet files.
It compares performance with nested predicate pushdown disabled vs. enabled
in the following query scenarios:
1. With the predicate pushed down, the Parquet reader can filter out all
row groups without loading them.
2. With the predicate pushed down, the Parquet reader loads only one of the
row groups.
3. With the predicate pushed down, the Parquet reader cannot filter out any
row group. This scenario checks whether enabling nested predicate pushdown
introduces too much overhead.
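
The three scenarios can be illustrated with a small, self-contained sketch. This is not code from the benchmark suite and does not use Spark or Parquet APIs. It is a toy model that assumes each row group carries min/max statistics for one nested column and that the predicate is an equality filter. The names `RowGroup`, `row_groups_to_load`, and `rows_read` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RowGroup:
    """Hypothetical stand-in for a row group with min/max stats on one column."""
    min_value: int
    max_value: int
    num_rows: int


def row_groups_to_load(groups: List[RowGroup], target: int) -> List[RowGroup]:
    """Keep only row groups whose [min, max] range could contain `target`."""
    return [g for g in groups if g.min_value <= target <= g.max_value]


def rows_read(groups: List[RowGroup], target: int, pushdown_enabled: bool) -> int:
    """Count the rows a reader would load for an equality filter on `target`."""
    # Without pushdown, every row group is loaded and filtered afterwards.
    candidates = row_groups_to_load(groups, target) if pushdown_enabled else groups
    return sum(g.num_rows for g in candidates)


# Three row groups with disjoint value ranges (assumed layout for illustration).
disjoint = [RowGroup(0, 99, 100), RowGroup(100, 199, 100), RowGroup(200, 299, 100)]
# Three row groups whose ranges all overlap, so statistics cannot prune anything.
overlapping = [RowGroup(0, 299, 100), RowGroup(0, 299, 100), RowGroup(0, 299, 100)]

scenario_1 = rows_read(disjoint, 1000, pushdown_enabled=True)    # all groups skipped
scenario_2 = rows_read(disjoint, 150, pushdown_enabled=True)     # one group loaded
scenario_3 = rows_read(overlapping, 150, pushdown_enabled=True)  # nothing skipped
baseline = rows_read(disjoint, 150, pushdown_enabled=False)      # pushdown disabled
```

In this model, scenario 3 reads as many rows as the baseline, so any extra time measured there with pushdown enabled would be overhead from evaluating the pushed-down filter.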
### Why are the changes needed?
No benchmark currently exists for evaluating the performance of predicate
pushdown on nested fields.
### Does this PR introduce any user-facing change?
No
### How was this patch tested?
By running the benchmark and reporting the results.