discord9 opened a new pull request, #22906:
URL: https://github.com/apache/datafusion/pull/22906
## Which issue does this PR close?
- Closes #.
## Rationale for this change
This is a draft follow-up to the timestamp precision narrowing discussion in
cast predicate simplification.
The previous cast unwrap path could only rewrite predicates by moving the
original comparison operator from `CAST(expr AS target_type) OP literal` to
`expr OP casted_literal`. That shape is not correct for many-to-one casts such
as timestamp precision narrowing, where the source-domain preimage of one
target timestamp value is a range rather than a singleton.
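For example, `CAST(ts_ns AS Timestamp(ms)) > TimestampMillisecond(1000)`
should not become `ts_ns > TimestampNanosecond(1000000000)`: every `ts_ns`
strictly between 1000000000 and 1001000000 still casts to 1000 ms, so it fails
the original predicate but passes the rewritten one. The correct source-domain
boundary is the start of the next timestamp bucket, so the predicate becomes
`ts_ns >= TimestampNanosecond(1001000000)`.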
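The difference between the two rewrites can be checked with a minimal sketch that models the cast as truncating integer division on raw `i64` values. The function names below are hypothetical and are not part of DataFusion; the sketch assumes a non-negative literal and ignores overflow.

```rust
/// Nanoseconds per millisecond.
const NS_PER_MS: i64 = 1_000_000;

/// Models `CAST(ts_ns AS Timestamp(ms))` as integer division
/// (Rust's `/` on integers truncates toward zero).
fn cast_ns_to_ms(ts_ns: i64) -> i64 {
    ts_ns / NS_PER_MS
}

/// The original predicate: `CAST(ts_ns AS Timestamp(ms)) > lit_ms`.
fn original_gt(ts_ns: i64, lit_ms: i64) -> bool {
    cast_ns_to_ms(ts_ns) > lit_ms
}

/// Naive rewrite that keeps the operator: `ts_ns > lit_ms * 1_000_000`.
/// Incorrect for values inside the literal's own bucket.
fn naive_gt(ts_ns: i64, lit_ms: i64) -> bool {
    ts_ns > lit_ms * NS_PER_MS
}

/// Preimage-based rewrite for a non-negative literal:
/// `ts_ns >= (lit_ms + 1) * 1_000_000`, the start of the next bucket.
fn preimage_gt(ts_ns: i64, lit_ms: i64) -> bool {
    ts_ns >= (lit_ms + 1) * NS_PER_MS
}
```

For `ts_ns = 1_000_000_001` and a literal of 1000 ms, `original_gt` and `preimage_gt` both return `false`, while `naive_gt` returns `true`.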
## What changes are included in this PR?
- Adds a shared `CastPredicatePreimage` abstraction in
  `datafusion-expr-common`:
  - `Exact(ScalarValue)` for singleton source-domain preimages.
  - `Range(Interval)` for half-open source-domain intervals.
- Adds shared cast predicate preimage helpers for logical and physical
rewrites.
- Implements timestamp precision narrowing preimages using half-open buckets
with truncation-toward-zero semantics, including negative timestamp values.
- Replaces the logical optimizer's cast unwrap module with a cast preimage
module.
- Updates the physical simplifier to reuse the same shared cast preimage
helper.
- Keeps existing exact literal cast rewrites, including exact
string-to-integer equality rewrites with round-trip checks.
- Updates the optimizer integration snapshot for timestamp precision
narrowing to use the correct source-domain boundary.
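As a rough illustration of the half-open bucket idea, the sketch below computes the source-domain preimage of one target value on raw `i64` values. The `Preimage` enum and `narrowing_preimage` function are hypothetical stand-ins, not the PR's actual `CastPredicatePreimage` API, and the handling of zero and negative values is derived here from the truncation-toward-zero assumption rather than taken from the PR's code.

```rust
/// Hypothetical stand-in for a cast preimage over raw i64 timestamps.
#[derive(Debug, PartialEq)]
enum Preimage {
    /// Exactly one source value maps to the target value.
    Exact(i64),
    /// All source values in the half-open interval `[lower, upper)` map to it.
    Range { lower: i64, upper: i64 },
}

/// Preimage of `target` under `source / factor` with truncation toward zero.
/// `factor` is the number of source units per target unit (assumed >= 1).
/// Returns `None` if a bound does not fit in i64.
fn narrowing_preimage(target: i64, factor: i64) -> Option<Preimage> {
    if factor == 1 {
        // No precision is lost, so the preimage is a single value.
        return Some(Preimage::Exact(target));
    }
    let scaled = target.checked_mul(factor)?;
    let (lower, upper) = if target > 0 {
        // Positive bucket: [target * f, (target + 1) * f)
        (scaled, scaled.checked_add(factor)?)
    } else if target < 0 {
        // Negative bucket: ((target - 1) * f, target * f], written half-open
        // as [(target - 1) * f + 1, target * f + 1)
        (scaled.checked_sub(factor)?.checked_add(1)?, scaled.checked_add(1)?)
    } else {
        // Truncation toward zero maps (-f, f) to 0, so this bucket is wider.
        (1 - factor, factor)
    };
    Some(Preimage::Range { lower, upper })
}
```

Under these assumptions, a target of 1000 ms with a factor of 1_000_000 yields `[1_000_000_000, 1_001_000_000)`, and a target of -1 ms yields `[-1_999_999, -999_999)`. Using checked arithmetic lets a caller skip the rewrite when a bound would overflow.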
## Are these changes tested?
Yes. This PR adds/updates tests for:
- exact cast predicate preimages,
- string-to-integer equality preimages with round-trip rejection such as
`'0123'`,
- timestamp precision narrowing range preimages for positive and negative
timestamp values,
- logical optimizer rewrites for timestamp precision narrowing,
- physical simplifier rewrites for timestamp precision narrowing.
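The round-trip rejection for a literal such as `'0123'` can be sketched as follows. The helper name `exact_int_literal` is hypothetical and only mirrors the idea described in this PR: a string literal is replaced by an integer literal only when formatting the parsed integer reproduces the original string exactly.

```rust
/// Returns the integer form of `lit` only if the conversion round-trips,
/// i.e. formatting the parsed value gives back exactly the same string.
/// Hypothetical helper for illustration, not a DataFusion function.
fn exact_int_literal(lit: &str) -> Option<i64> {
    // Parsing accepts forms like "0123" or "+5" that do not round-trip.
    let parsed: i64 = lit.parse().ok()?;
    // Keep the rewrite only when the canonical form equals the input.
    (parsed.to_string() == lit).then_some(parsed)
}
```

With this check, `"123"` and `"-7"` are accepted, while `"0123"` and `"+5"` are rejected because they parse successfully but format back to a different string.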
Validated locally with:
```bash
cargo test -p datafusion-expr-common cast_predicate_preimage
cargo test -p datafusion-optimizer cast_preimage
cargo test -p datafusion-physical-expr timestamp_precision_narrowing_range_preimage
```
## Are there any user-facing changes?
No user-facing API changes are intended. This is an optimizer
correctness/refactoring change for cast predicate rewrites.