Alessandro Solimando created CALCITE-7204:
---------------------------------------------

             Summary: Add support for lossless cast detection for DATETIME types
                 Key: CALCITE-7204
                 URL: https://issues.apache.org/jira/browse/CALCITE-7204
             Project: Calcite
          Issue Type: Improvement
          Components: core
    Affects Versions: 1.40.0
            Reporter: Alessandro Solimando


_RexUtil.isLosslessCast_ currently doesn't support date/time types at all and 
conservatively treats every such cast as lossy, leading to missed optimization 
opportunities and potentially suboptimal plans.

This ticket aims to add support for the types of the DATETIME family.

A proposal of what to handle, which should be re-verified precisely by the 
implementer:
 * *TIME(p) -> TIME(p')*
lossless iff p' >= p (widening fractional-second precision)
(Reverse is not guaranteed: narrowing can round away sub-second units)
 * *TIMESTAMP(p) → TIMESTAMP(p')* (without time zone): lossless iff p' >= p
 * *DATE → TIMESTAMP(p)* (without time zone)
lossless (the round-trip {{DATE -> TIMESTAMP -> DATE}} always recovers the 
original date: the first cast adds padding like {{00:00:00[.000…]}}, which the 
second cast truncates)
 * *TIME(p) → TIMESTAMP(p')* (without time zone)
lossless iff p' >= p (the round-trip {{TIME -> TIMESTAMP -> TIME}} preserves the 
time component; the synthetic date part is discarded by the second cast)
 * *TIMESTAMP WITH LOCAL TIME ZONE (TSLTZ)*
 ** {*}TSLTZ(p) -> TSLTZ(p'){*}: lossless iff p' >= p

 ** {*}TSLTZ <=> TIMESTAMP (without TZ){*}: *conservatively not lossless* 
(semantics differ: instant vs local wall-time, the DST/offset transitions can 
change wall-time on round-trip)

 * Optional / out-of-scope (separate ticket/tickets):

 ** *DATE/TIME/TIMESTAMP <=> CHARACTER*

 ** *INTERVAL* types:

 *** YEAR-MONTH intervals: widening fields/precision is lossless

 *** DAY-SECOND intervals: widening fractional-second precision and/or field 
range is lossless
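
The in-scope rules proposed above (everything before the optional items) can be summarized as a small decision function. This is only a sketch for discussion: {{DatetimeLosslessCastSketch}}, its {{Kind}} enum and the {{isLossless}} signature are hypothetical names, not Calcite's API, and identity casts such as DATE -> DATE are assumed to be handled elsewhere, so they are deliberately left out. Anything not listed in the proposal returns false.

```java
// Hypothetical stand-in for the proposed rules; NOT RexUtil.isLosslessCast itself.
final class DatetimeLosslessCastSketch {
  /** Simplified type kinds; TIMESTAMP means "without time zone". */
  enum Kind { DATE, TIME, TIMESTAMP, TIMESTAMP_LTZ }

  /**
   * Returns true only for the casts the proposal lists as lossless.
   * The precision arguments are ignored when the corresponding kind is DATE.
   */
  static boolean isLossless(Kind from, int fromPrecision, Kind to, int toPrecision) {
    // "p' >= p": the target keeps at least as many fractional-second digits.
    boolean widening = toPrecision >= fromPrecision;
    switch (from) {
      case DATE:
        // DATE -> TIMESTAMP(p) is lossless for any p.
        return to == Kind.TIMESTAMP;
      case TIME:
        // TIME(p) -> TIME(p') and TIME(p) -> TIMESTAMP(p') need p' >= p.
        return (to == Kind.TIME || to == Kind.TIMESTAMP) && widening;
      case TIMESTAMP:
        // TIMESTAMP <=> TSLTZ is conservatively rejected.
        return to == Kind.TIMESTAMP && widening;
      case TIMESTAMP_LTZ:
        return to == Kind.TIMESTAMP_LTZ && widening;
      default:
        // Be conservative: a false positive would affect correctness.
        return false;
    }
  }
}
```

For example, {{isLossless(Kind.TIME, 3, Kind.TIMESTAMP, 6)}} is accepted, while {{isLossless(Kind.TIMESTAMP_LTZ, 3, Kind.TIMESTAMP, 3)}} is rejected regardless of precision.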

Type precision: for integer types we had surprising effects (see 
[here|https://github.com/apache/calcite/pull/4557#discussion_r2379703479]), 
so take special care in handling it and verify assumptions precisely. When in 
doubt, be conservative: it's critical that the method never returns false 
positives, as they immediately affect correctness.
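
In the spirit of verifying assumptions precisely, the round-trip arguments in the proposal can be sanity-checked outside Calcite with plain {{java.time}}. This is an analogy only: the class and method names are hypothetical, the synthetic date (1970-01-01) is an arbitrary choice, and {{java.time}} behaviour does not prove anything about Calcite's own cast semantics, which still need to be checked separately.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.ZoneId;
import java.time.temporal.ChronoUnit;

// Illustration with java.time only; not a substitute for tests against Calcite.
final class DatetimeRoundTripSketch {
  /** DATE -> TIMESTAMP -> DATE: the midnight padding is dropped again. */
  static boolean dateRoundTrips(LocalDate d) {
    return d.atStartOfDay().toLocalDate().equals(d);
  }

  /** TIME -> TIMESTAMP -> TIME, using an arbitrary synthetic date. */
  static boolean timeRoundTrips(LocalTime t) {
    return t.atDate(LocalDate.of(1970, 1, 1)).toLocalTime().equals(t);
  }

  /** Narrows to whole seconds (precision 0) and compares with the original. */
  static boolean survivesNarrowingToSeconds(LocalTime t) {
    return t.truncatedTo(ChronoUnit.SECONDS).equals(t);
  }

  /** Wall time -> instant -> wall time, both interpreted in the given zone. */
  static boolean wallTimeRoundTrips(LocalDateTime local, ZoneId zone) {
    return LocalDateTime.ofInstant(local.atZone(zone).toInstant(), zone).equals(local);
  }
}
```

A wall time that falls into a DST gap, such as 2024-03-31T02:30 in Europe/Rome, does not survive the last round-trip because {{java.time}} shifts it forward by the length of the gap. That is the kind of behaviour behind treating TSLTZ <=> TIMESTAMP as not lossless.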

Tests and impact:
 * The newly supported cases must (at least) be covered appropriately in 
_RexLosslessCastTest_ (positive and negative tests, see CALCITE-7174 for an 
example)
 * When existing plans change due to further simplifications/rule firing, group 
the changes by "patterns" and provide a justification for non-trivial cases

Note: if implementing all this at once is too much, we can break it into 
multiple tickets (for instance, TZ-aware cases can become a separate ticket, in 
which case it's fine to detect them and return false for now)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
