parthchandra opened a new pull request, #4008: URL: https://github.com/apache/datafusion-comet/pull/4008
## Which issue does this PR close?

Part of https://github.com/apache/datafusion-comet/issues/286
Part of https://github.com/apache/datafusion-comet/issues/378

## Rationale for this change

We currently fall back to Spark for `timestamp_ntz` casts.

## What changes are included in this PR?

Add native support for casting to and from `TimestampNTZType` (timestamp without timezone).

**Implemented cast directions:**

- TimestampNTZ -> String (timezone-independent)
- TimestampNTZ -> Date (timezone-independent)
- TimestampNTZ -> Timestamp (session-TZ dependent)
- Date -> TimestampNTZ (timezone-independent)
- Timestamp -> TimestampNTZ (session-TZ dependent)

**Not yet implemented:**

- String -> TimestampNTZ (marked `Incompatible`, tracked in #378)

### Key implementation details

- **Timezone-independent casts** (NTZ↔Date, NTZ→String): Pure arithmetic on epoch microseconds; session timezone has no effect on results.
- **Timezone-dependent casts** (NTZ↔Timestamp): Interprets/produces local datetimes in the session timezone. Uses the `resolve_local_datetime()` helper to handle DST ambiguity (fall-back) and gaps (spring-forward), matching Spark's `ZonedDateTime` semantics.

## How are these changes tested?
### Cast-specific test coverage

| Cast | Test method | Timezone coverage | Notes |
|------|------------|-------------------|-------|
| Date → NTZ | `cast DateType to TimestampNTZType` | 17 representative zones | Includes half-hour (Kolkata +5:30), quarter-hour (Kathmandu +5:45, Chatham +12:45) offsets |
| Timestamp → NTZ | `cast TimestampType to TimestampNTZType` | 17 zones | Exercises DST transitions (Sao Paulo, Sydney, New York) |
| NTZ → String | `cast TimestampNTZType to StringType` | N/A (TZ-independent) | |
| NTZ → Date | `cast TimestampNTZType to DateType` | 17 zones | |
| NTZ → Timestamp | `cast TimestampNTZType to TimestampType` | 17 zones | |
| String → NTZ | `cast StringType to TimestampNTZType` | - | ignored; not yet implemented |

### SQL integration tests (`cast_timestamp_ntz.sql`)

- NTZ → String, Date, Timestamp
- Date → NTZ, Timestamp → NTZ
- Literal casts (e.g. `CAST(TIMESTAMP_NTZ'2020-01-01 12:34:56.789' AS string)`)

### Test data

`generateTimestampNTZ()` reuses `generateTimestampLiterals()`, which covers epoch, modern dates, DST-transition dates, and sub-second precision values.

### Timezone diversity

The `representativeTimezones` list (17 zones) was chosen to cover:

- Standard offsets: UTC, UTC+8 (Shanghai), UTC+9 (Tokyo), UTC-5/-4 (New York)
- Half-hour offsets: Asia/Kolkata (UTC+5:30)
- Quarter-hour offsets: Asia/Kathmandu (UTC+5:45), Pacific/Chatham (UTC+12:45)
- DST-transitioning zones: New York, Sydney, London, Sao Paulo
- Non-DST zones: Dubai, Cairo, Johannesburg

### ANSI mode

Each `castTimestampTest` invocation tests both `ANSI_ENABLED=false` (null on invalid input) and `ANSI_ENABLED=true` (exception on invalid input), plus `try_cast()`.
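### Reference semantics sketch (illustrative)

For readers who want to see the two cast families from the implementation notes side by side, the following is a minimal JVM-side sketch, not the native code in this PR. It assumes that the `ZonedDateTime` semantics referred to above are those of `java.time.LocalDateTime.atZone` (earlier offset during a fall-back overlap, shift forward by the gap length during a spring-forward gap). The class name `NtzCastSketch` and all method names are hypothetical and do not correspond to Comet's `resolve_local_datetime()` or to any Spark API.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

// Hypothetical illustration only; names are not taken from Comet or Spark.
class NtzCastSketch {
    static final long MICROS_PER_SECOND = 1_000_000L;
    static final long MICROS_PER_DAY = 86_400L * MICROS_PER_SECOND;

    // NTZ -> Date: timezone-independent. Floor division keeps pre-epoch
    // values on the correct (earlier) day instead of truncating toward zero.
    static long ntzMicrosToEpochDay(long ntzMicros) {
        return Math.floorDiv(ntzMicros, MICROS_PER_DAY);
    }

    // Date -> NTZ: timezone-independent. Midnight of the given epoch day.
    static long epochDayToNtzMicros(long epochDay) {
        return Math.multiplyExact(epochDay, MICROS_PER_DAY);
    }

    // Interpret NTZ microseconds as a wall-clock value with no zone attached.
    static LocalDateTime microsToLocal(long micros) {
        long seconds = Math.floorDiv(micros, MICROS_PER_SECOND);
        int nanos = (int) (Math.floorMod(micros, MICROS_PER_SECOND) * 1_000L);
        return LocalDateTime.ofEpochSecond(seconds, nanos, ZoneOffset.UTC);
    }

    static long localToMicros(LocalDateTime local) {
        return local.toEpochSecond(ZoneOffset.UTC) * MICROS_PER_SECOND
                + local.getNano() / 1_000;
    }

    static long instantToMicros(Instant instant) {
        return instant.getEpochSecond() * MICROS_PER_SECOND
                + instant.getNano() / 1_000;
    }

    // NTZ -> Timestamp: session-TZ dependent. atZone() resolves an ambiguous
    // local time to the earlier offset and moves a nonexistent local time
    // later by the length of the gap.
    static long ntzToTimestampMicros(long ntzMicros, ZoneId sessionZone) {
        return instantToMicros(microsToLocal(ntzMicros).atZone(sessionZone).toInstant());
    }

    // Timestamp -> NTZ: session-TZ dependent. Render the instant as a local
    // wall-clock value in the session zone, then drop the zone.
    static long timestampToNtzMicros(long tsMicros, ZoneId sessionZone) {
        Instant instant = Instant.ofEpochSecond(
                Math.floorDiv(tsMicros, MICROS_PER_SECOND),
                Math.floorMod(tsMicros, MICROS_PER_SECOND) * 1_000L);
        return localToMicros(LocalDateTime.ofInstant(instant, sessionZone));
    }

    public static void main(String[] args) {
        ZoneId ny = ZoneId.of("America/New_York");
        // 02:30 on 2021-03-14 does not exist in New York (spring-forward gap).
        long gap = localToMicros(LocalDateTime.of(2021, 3, 14, 2, 30));
        // 01:30 on 2021-11-07 occurs twice in New York (fall-back overlap).
        long overlap = localToMicros(LocalDateTime.of(2021, 11, 7, 1, 30));
        System.out.println(ntzToTimestampMicros(gap, ny));
        System.out.println(ntzToTimestampMicros(overlap, ny));
        System.out.println(ntzMicrosToEpochDay(-1L));
    }
}
```

Under these assumptions the gap case resolves to 03:30 EDT and the overlap case to 01:30 EDT, which is the behaviour a native implementation would need to reproduce to stay compatible with JVM results. The timezone-independent helpers never take a `ZoneId`, which is why varying the session timezone in tests should not change their output.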
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
