Alessandro Solimando created CALCITE-7204:
---------------------------------------------
Summary: Add support for lossless cast detection for DATETIME types
Key: CALCITE-7204
URL: https://issues.apache.org/jira/browse/CALCITE-7204
Project: Calcite
Issue Type: Improvement
Components: core
Affects Versions: 1.40.0
Reporter: Alessandro Solimando
_RexUtil.isLosslessCast_ currently does not support date/time types at all and
treats every such cast as lossy, leading to missed opportunities and
potentially suboptimal planning.
This ticket aims to add support for the DATETIME family of types.
A proposal of what to handle, which should be re-verified precisely by the
implementer:
* *TIME(p) → TIME(p')*
lossless iff p' >= p (widening fractional-second precision)
(The reverse is not guaranteed: narrowing can round away sub-second digits)
* *TIMESTAMP(p) → TIMESTAMP(p')* (without time zone): lossless iff p' >= p
* *DATE → TIMESTAMP(p)* (without time zone)
lossless (round-trip {{DATE -> TIMESTAMP -> DATE}} always recovers the original
date: the first cast adds padding like {{00:00:00[.000…]}}, which the second
cast truncates)
* *TIME(p) → TIMESTAMP(p')* (without time zone)
lossless iff p' >= p (round-trip {{TIME -> TIMESTAMP -> TIME}} preserves the
time component; the synthetic date part is discarded by the second cast)
* *TIMESTAMP WITH LOCAL TIME ZONE (TSLTZ)*
** {*}TSLTZ(p) -> TSLTZ(p'){*}: lossless iff p' >= p
** {*}TSLTZ <=> TIMESTAMP (without TZ){*}: *conservatively not lossless*
(semantics differ: instant vs local wall time; DST/offset transitions can
change the wall time on a round trip)
* Optional / out-of-scope (separate ticket/tickets):
** *DATE/TIME/TIMESTAMP <=> CHARACTER*
** *INTERVAL* types:
*** YEAR-MONTH intervals: widening fields/precision is lossless
*** DAY-SECOND intervals: widening fractional-second precision and/or field
range is lossless
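The in-scope rules above can be summarized in a small standalone sketch. It is only an illustration of the proposal, not the Calcite implementation: the class name, the {{Kind}} enum and the {{isLossless}} signature are hypothetical and do not use any Calcite API, and the rules themselves still need to be re-verified by the implementer. Every combination not listed in the proposal conservatively returns false.

```java
/** Hypothetical, standalone model of the proposed rules; not Calcite API. */
final class LosslessDatetimeCastSketch {

  /** Hypothetical type kinds, standing in for the DATETIME family types. */
  enum Kind { DATE, TIME, TIMESTAMP, TIMESTAMP_LTZ }

  /**
   * Returns true only for the casts the proposal lists as lossless.
   * Precisions are fractional-second precisions and are ignored for DATE.
   */
  static boolean isLossless(Kind source, int sourcePrecision,
      Kind target, int targetPrecision) {
    boolean widening = targetPrecision >= sourcePrecision;
    switch (source) {
      case DATE:
        // DATE -> TIMESTAMP(p): the cast only pads with 00:00:00[.000...].
        return target == Kind.TIMESTAMP;
      case TIME:
        // TIME(p) -> TIME(p') and TIME(p) -> TIMESTAMP(p'), iff p' >= p.
        return (target == Kind.TIME || target == Kind.TIMESTAMP) && widening;
      case TIMESTAMP:
        // TIMESTAMP(p) -> TIMESTAMP(p') iff p' >= p; never to TSLTZ.
        return target == Kind.TIMESTAMP && widening;
      case TIMESTAMP_LTZ:
        // TSLTZ(p) -> TSLTZ(p') iff p' >= p; never to plain TIMESTAMP.
        return target == Kind.TIMESTAMP_LTZ && widening;
      default:
        // Anything else: be conservative to avoid false positives.
        return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(isLossless(Kind.TIME, 3, Kind.TIME, 6));
    System.out.println(isLossless(Kind.TIMESTAMP, 6, Kind.TIMESTAMP, 3));
    System.out.println(isLossless(Kind.DATE, 0, Kind.TIMESTAMP, 0));
    System.out.println(isLossless(Kind.TIMESTAMP_LTZ, 3, Kind.TIMESTAMP, 3));
  }
}
```

The sketch deliberately defaults to false, matching the requirement that the method must never report a lossy cast as lossless.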
Type precision: for integer types we saw surprising effects (see
[here|https://github.com/apache/calcite/pull/4557#discussion_r2379703479]).
Take special care in handling precision and verify every assumption precisely.
When in doubt, be conservative: it is critical that the method never returns
false positives, because they immediately affect correctness.
Tests and impact:
* The newly supported cases must (at least) be covered appropriately in
_RexLosslessCastTest_ (positive and negative tests, see CALCITE-7174 for an
example)
* When existing plans change due to further simplifications/rule firing, group
the changes by "pattern" and provide a justification for non-trivial cases
Note: if implementing all of this at once is too much, we can break it into
multiple tickets (for instance, the TZ-aware cases can become a separate ticket,
in which case it is fine to detect them and return false for now)
--
This message was sent by Atlassian Jira
(v8.20.10#820010)