parthchandra commented on code in PR #3865:
URL: https://github.com/apache/datafusion-comet/pull/3865#discussion_r3060900883
##########
native/spark-expr/src/utils.rs:
##########
@@ -174,6 +174,19 @@ fn datetime_cast_err(value: i64) -> ArrowError {
))
}
+fn resolve_local_datetime(tz: &Tz, local_datetime: NaiveDateTime) -> DateTime<Tz> {
+    match tz.from_local_datetime(&local_datetime) {
+        LocalResult::Single(dt) => dt,
+        LocalResult::Ambiguous(dt, _) => dt,
+        LocalResult::None => {
+            // Interpret nonexistent local time by shifting from one hour earlier.
+            let shift = TimeDelta::hours(1);
Review Comment:
Consider doing something like `parse_timestamp_to_micros` in string.rs.
Something like:
```rust
LocalResult::None => {
    let probe = local_datetime - chrono::Duration::hours(3);
    let pre_offset = match tz.from_local_datetime(&probe) {
        LocalResult::Single(dt) => dt.offset().fix(),
        LocalResult::Ambiguous(dt, _) => dt.offset().fix(),
        // Cannot determine the offset: use a safe fallback rather than propagate an error.
        LocalResult::None => {
            // Extreme edge case; fall back to the UTC interpretation.
            return local_datetime.and_utc().with_timezone(tz);
        }
    };
    let offset_secs = pre_offset.local_minus_utc() as i64;
    let utc_naive = local_datetime - chrono::Duration::seconds(offset_secs);
    utc_naive.and_utc().with_timezone(tz)
}
```
This eliminates the unwrap(), handles non-standard DST correctly, and reuses
a pattern that already has test coverage in Comet.
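To make the suggestion concrete, here is a self-contained toy sketch of the probe-and-reinterpret pattern (plain Rust, no chrono dependency, so it differs from the actual Comet code): a hypothetical zone whose clocks jump forward one hour at local 01:00, so local times 01:00-01:59 do not exist. All names and the minutes-since-epoch encoding are illustrative only.

```rust
// Times are minutes since an arbitrary epoch; the toy zone springs
// forward at UTC minute 60, skipping local minutes 60..120.
#[derive(Debug, PartialEq)]
enum LocalResult {
    Single(i64), // unambiguous UTC minute
    None,        // local time falls in the spring-forward gap
}

// Toy zone: offset is 0 minutes before UTC minute 60, 60 minutes after.
fn offset_at_utc(utc_min: i64) -> i64 {
    if utc_min < 60 { 0 } else { 60 }
}

// A local time maps to UTC t iff local_min == t + offset_at_utc(t).
fn from_local(local_min: i64) -> LocalResult {
    if local_min < 60 {
        LocalResult::Single(local_min) // offset 0 applies
    } else if local_min < 120 {
        LocalResult::None // the skipped hour
    } else {
        LocalResult::Single(local_min - 60) // offset 60 applies
    }
}

// The suggested resolution: probe a few hours earlier to learn the
// pre-transition offset, then reinterpret the gap time with that offset.
fn resolve_local(local_min: i64) -> i64 {
    match from_local(local_min) {
        LocalResult::Single(utc) => utc,
        LocalResult::None => {
            let probe = local_min - 180;
            let pre_offset = match from_local(probe) {
                LocalResult::Single(utc) => offset_at_utc(utc),
                LocalResult::None => 0, // toy fallback: treat as UTC
            };
            local_min - pre_offset
        }
    }
}

fn main() {
    // 01:30 local (minute 90) is in the gap; the pre-gap offset is 0,
    // so it resolves to UTC minute 90, the same instant as local 02:30
    // after the jump. This works for any gap width, not just one hour.
    assert_eq!(resolve_local(90), 90);
    assert_eq!(resolve_local(30), 30); // before the gap: unambiguous
    assert_eq!(resolve_local(150), 90); // 02:30 local after the jump
    println!("gap time 01:30 resolves to UTC minute {}", resolve_local(90));
}
```

Because the offset is read from the zone itself rather than assumed to be one hour, the same logic handles zones with 30- or 45-minute transitions.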
##########
spark/src/test/scala/org/apache/comet/CometTemporalExpressionSuite.scala:
##########
@@ -489,4 +489,22 @@ class CometTemporalExpressionSuite extends CometTestBase with AdaptiveSparkPlanH
dummyDF.selectExpr("unix_date(cast(NULL as date))"))
}
}
+
+  test("cast TimestampNTZ to Timestamp - DST edge cases") {
+    val data = Seq(
+      Row(java.time.LocalDateTime.parse("2024-03-31T01:30:00")), // Spring forward (Europe/London)
+      Row(java.time.LocalDateTime.parse("2024-10-27T01:30:00")) // Fall back (Europe/London)
+    )
+    val schema = StructType(Seq(StructField("ts_ntz", DataTypes.TimestampNTZType, true)))
+    spark
+      .createDataFrame(spark.sparkContext.parallelize(data), schema)
+      .createOrReplaceTempView("dst_tbl")
+
+    withSQLConf(
+      SQLConf.SESSION_LOCAL_TIMEZONE.key -> "Europe/London",
+      "spark.comet.expression.Cast.allowIncompatible" -> "true") {
Review Comment:
Can you add a comment explaining why `allowIncompatible` is required here
(because the TimestampNTZ cast is classified as Incompatible due to issue #378/#3179)?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
For additional commands, e-mail: [email protected]