crepererum commented on PR #13293:
URL: https://github.com/apache/datafusion/pull/13293#issuecomment-2470333151
> > I don't think we get a mean value from parquet for example. So that
> > would be a rather opinionated assumption. Also note that this is somewhat hard
> > or even impossible to calculate for some types (e.g. strings)
>
> Then do we need 4 states? -- Both bounds and estimation, only bounds, only
> estimation, and neither one

yeah, if you want to have bounds AND a point estimator, then you need a
larger state space, something like:
```rust
struct Precision<T> {
    /// Actual values are very close to the given point.
    ///
    /// This can be treated as an over-simplified normal distribution.
    point_estimation: Option<T>,
    /// Lower bound for an open/half-open/closed range.
    ///
    /// If given, the bound is INCLUSIVE. The bounds may be
    /// overestimated (i.e. the actual lower value may be larger)
    /// but if provided, all values are included in this range.
    lower: Option<T>,
    /// Upper bound for an open/half-open/closed range.
    ///
    /// If given, the bound is INCLUSIVE. The bounds may be
    /// overestimated (i.e. the actual upper value may be smaller)
    /// but if provided, all values are included in this range.
    upper: Option<T>,
}
```
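
To make the "4 states" question concrete, here is a rough sketch of how the combinations fall out of the three optional fields. The `Knowledge` enum and the `knowledge` / `may_contain` helpers are hypothetical names used only for illustration; they are not part of DataFusion or of this PR, and treating "at least one bound is set" as "has bounds" is an assumption.

```rust
/// Same shape as the struct sketched above (doc comments trimmed).
#[derive(Debug, Clone, PartialEq)]
struct Precision<T> {
    point_estimation: Option<T>,
    lower: Option<T>,
    upper: Option<T>,
}

/// Hypothetical classification of what a `Precision` tells us.
#[derive(Debug, PartialEq)]
enum Knowledge {
    BoundsAndEstimate,
    BoundsOnly,
    EstimateOnly,
    Nothing,
}

impl<T: PartialOrd> Precision<T> {
    /// Assumption: a single known bound (half-open range) counts as "has bounds".
    fn knowledge(&self) -> Knowledge {
        let has_bounds = self.lower.is_some() || self.upper.is_some();
        match (has_bounds, self.point_estimation.is_some()) {
            (true, true) => Knowledge::BoundsAndEstimate,
            (true, false) => Knowledge::BoundsOnly,
            (false, true) => Knowledge::EstimateOnly,
            (false, false) => Knowledge::Nothing,
        }
    }

    /// Returns `false` only if the INCLUSIVE bounds rule `value` out.
    /// A missing bound places no restriction on that side.
    fn may_contain(&self, value: &T) -> bool {
        let above_lower = self.lower.as_ref().map_or(true, |lo| value >= lo);
        let below_upper = self.upper.as_ref().map_or(true, |hi| value <= hi);
        above_lower && below_upper
    }
}
```

With this shape, the point estimate never affects `may_contain`: only the bounds can prove that a value is absent, while the estimate would only feed cost heuristics.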