alamb opened a new pull request, #22071: URL: https://github.com/apache/datafusion/pull/22071
## Which issue does this PR close?

- Related to #14896
- Related to #21120

## Rationale for this change

The "Statistics V2" framework introduced in #14699 (`Distribution` enum, `PhysicalExpr::evaluate_statistics`/`propagate_statistics`, `ExprStatisticsGraph`, and supporting helpers in `datafusion-expr-common::statistics`) was always intended as the foundation for replacing `Precision`-based column statistics with richer probabilistic distributions (see #14896). In practice it has never been wired into the optimizer or any execution operator.

PR #14699 was merged on 2025-02-24, ~15 months ago. Since then, the only commits touching `datafusion/expr-common/src/statistics.rs` and `datafusion/physical-expr/src/statistics/` have been mechanical (Rust-edition migrations, lint-driven refactors, renames of unrelated `Interval` constants, removal of `as_any`). No operator or planner has been taught to call `evaluate_statistics` / `propagate_statistics` or construct a `Distribution` outside of the framework's own tests.

Meanwhile, #21120 lays out a different, simpler direction: a pluggable `ExpressionAnalyzer` chain-of-responsibility for expression-level statistics. That issue explicitly describes the V2 distribution-based API as "significantly more complex to implement and adopt" and proposes that distribution-based estimation, if useful, be plugged in later as a custom analyzer rather than as a `PhysicalExpr` trait surface.

Rather than continue carrying an unused public framework that we don't intend to build on, this PR deprecates it so downstream users don't start building on top of it before it is removed.

## What changes are included in this PR?

This PR adds `#[deprecated(since = "54.0.0", ...)]` attributes to the public abstractions introduced in #14699:

- the `Distribution` enum and its variant structs
- the `evaluate_statistics` / `propagate_statistics` trait methods on `PhysicalExpr`
- `ExprStatisticsGraph` / `ExprStatisticsGraphNode`
- the supporting helper functions

It also updates internal callers to silence the resulting deprecation warnings (with `#[expect(deprecated)]` or module-level `#![allow(deprecated)]` where appropriate). The deprecation note on each item points to #21120 for the new direction.

No behavior changes; the V2 code paths still compile and run, so any out-of-tree consumer that has already adopted them sees a deprecation warning rather than a breakage.

## Are these changes tested?

No new tests; the existing tests for the deprecated items continue to pass.

## Are there any user-facing changes?

The public API items listed above are now marked `#[deprecated]`. Downstream code that uses them will see a compiler warning pointing to #21120, but will continue to compile and run unchanged. The deprecated items will be removed in a future release.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
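
For readers unfamiliar with the mechanism, here is a minimal sketch of the deprecate-and-silence pattern described above. The names `ToyUniform`, `toy_mean`, and `midpoint` are hypothetical stand-ins and are not DataFusion items; the note text is likewise illustrative rather than the wording used in this PR.

```rust
// Hypothetical stand-ins for illustration only; these are NOT DataFusion types.

/// A toy "distribution" marked deprecated with a `since` version and a
/// `note` that points readers at the replacement direction.
#[deprecated(since = "54.0.0", note = "unused framework; see issue #21120")]
pub struct ToyUniform {
    pub low: f64,
    pub high: f64,
}

/// A deprecated helper. Uses of deprecated items inside an item that is
/// itself deprecated do not trigger the lint.
#[deprecated(since = "54.0.0", note = "unused framework; see issue #21120")]
pub fn toy_mean(d: &ToyUniform) -> f64 {
    (d.low + d.high) / 2.0
}

/// An internal caller that still needs the old code path. The attribute
/// silences the warning for this function only. On Rust 1.81 or later,
/// `#[expect(deprecated)]` can be used instead; it additionally warns once
/// the function no longer uses anything deprecated.
#[allow(deprecated)]
pub fn midpoint(low: f64, high: f64) -> f64 {
    let d = ToyUniform { low, high };
    toy_mean(&d)
}
```

An out-of-tree caller that uses `ToyUniform` or `toy_mean` directly, without such an attribute, would still compile but would get a deprecation warning carrying the note text. Placing `#![allow(deprecated)]` at the top of a module has the same silencing effect for everything in that module.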
