xudong963 opened a new pull request, #22406:
URL: https://github.com/apache/datafusion/pull/22406
## Which issue does this PR close?
- Part of #22189.
## Rationale for this change
`ParquetFileMetrics::new` registers many per-file metrics with the same
`filename` label. Before this PR, each metric built its own owned filename
label with `filename.to_string()`, which repeatedly copied the same dynamic
string during parquet scan setup.
This PR keeps parquet metrics eagerly registered, so
`ExecutionPlan::metrics()` visibility during execution is unchanged, while
reducing repeated label string allocation and copying.
## What changes are included in this PR?
- Store owned `Label` name/value strings behind `Arc<str>` internally, while
keeping borrowed static label strings allocation-free.
- Reuse one cloned `filename` label across the per-file parquet metrics in
`ParquetFileMetrics::new`.
- Add a metrics test confirming borrowed and owned label values remain equal
and display the same way.
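
As a rough illustration of the idea in the list above, here is a minimal, self-contained sketch. It is not the actual DataFusion code: the `LabelStr` enum, the field layout, and the `name=value` display format are assumptions made for the example. It shows how an owned label value can sit behind `Arc<str>` so that cloning a label only bumps a reference count, while a `&'static str` label never allocates, and how both forms can still compare and display identically.

```rust
use std::fmt;
use std::sync::Arc;

/// Hypothetical label string (not DataFusion's real type): either a
/// borrowed static string or an owned string shared behind `Arc<str>`.
#[derive(Debug, Clone)]
enum LabelStr {
    Static(&'static str),
    Shared(Arc<str>),
}

impl LabelStr {
    fn as_str(&self) -> &str {
        match self {
            LabelStr::Static(s) => *s,
            LabelStr::Shared(s) => &**s,
        }
    }
}

// Equality is by string content, so borrowed and owned values compare equal.
impl PartialEq for LabelStr {
    fn eq(&self, other: &Self) -> bool {
        self.as_str() == other.as_str()
    }
}
impl Eq for LabelStr {}

impl From<&'static str> for LabelStr {
    fn from(s: &'static str) -> Self {
        LabelStr::Static(s) // no allocation
    }
}

impl From<String> for LabelStr {
    fn from(s: String) -> Self {
        LabelStr::Shared(Arc::from(s)) // one allocation, shared by all clones
    }
}

/// Hypothetical name/value label built from two `LabelStr` values.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Label {
    name: LabelStr,
    value: LabelStr,
}

impl Label {
    fn new(name: impl Into<LabelStr>, value: impl Into<LabelStr>) -> Self {
        Self { name: name.into(), value: value.into() }
    }
}

impl fmt::Display for Label {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Display format chosen for illustration only.
        write!(f, "{}={}", self.name.as_str(), self.value.as_str())
    }
}

fn main() {
    // Build the dynamic filename label once...
    let filename = String::from("part-0.parquet");
    let shared = Label::new("filename", filename);

    // ...then clone it for each per-file metric; no string bytes are copied.
    let per_metric: Vec<Label> = (0..3).map(|_| shared.clone()).collect();

    // The original plus three clones all point at the same allocation.
    if let LabelStr::Shared(arc) = &shared.value {
        assert_eq!(Arc::strong_count(arc), 4);
    }

    // Borrowed and owned label values remain equal and display the same way.
    let borrowed = Label::new("filename", "part-0.parquet");
    assert_eq!(shared, borrowed);
    assert_eq!(shared.to_string(), borrowed.to_string());
    println!("{}", per_metric[0]);
}
```

The design trade-off in this sketch is that constructing an owned label still pays for one allocation, but every later clone is a reference-count increment rather than a fresh `String` copy.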
## Are these changes tested?
Yes.
```text
cargo fmt --all
cargo test -p datafusion-physical-expr-common metrics::tests
cargo test -p datafusion-datasource-parquet --lib
cargo clippy -p datafusion-physical-expr-common -p datafusion-datasource-parquet --lib -- -D warnings
cargo clippy --all-targets --all-features -- -D warnings
git diff --check
```
I also ran a local targeted microbenchmark for repeated
`ParquetFileMetrics::new` construction:
```text
origin/main, 50k iterations x 9 samples:
median = 66.223 ms
rerun median = 67.423 ms
this PR, 50k iterations x 9 samples:
median = 59.283 ms
```
That is roughly a 10-12% reduction in elapsed time for this targeted metric
construction path.
## Are there any user-facing changes?
No. Metric registration timing and displayed label values are unchanged.