Maxim Gekk created SPARK-30730:
----------------------------------

             Summary: Wrong results of `convertTz` for different session and system time zones
                 Key: SPARK-30730
                 URL: https://issues.apache.org/jira/browse/SPARK-30730
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.0.0
            Reporter: Maxim Gekk


Currently, DateTimeUtils.convertTz() assumes that timestamp strings are cast 
to TimestampType using the JVM system time zone, but in fact the session time 
zone defined by the SQL config *spark.sql.session.timeZone* is used in casting. 
This leads to wrong results from from_utc_timestamp and to_utc_timestamp when 
the session time zone differs from the JVM system time zone. The issue can be 
reproduced with the following code:
{code}
  test("to_utc_timestamp in various system and session time zones") {
    val localTs = "2020-02-04T22:42:10"
    val defaultTz = TimeZone.getDefault
    try {
      DateTimeTestUtils.outstandingTimezonesIds.foreach { systemTz =>
        TimeZone.setDefault(DateTimeUtils.getTimeZone(systemTz))
        DateTimeTestUtils.outstandingTimezonesIds.foreach { sessionTz =>
          withSQLConf(
            SQLConf.DATETIME_JAVA8API_ENABLED.key -> "true",
            SQLConf.SESSION_LOCAL_TIMEZONE.key -> sessionTz) {

            DateTimeTestUtils.outstandingTimezonesIds.foreach { toTz =>
              val instant = LocalDateTime
                .parse(localTs)
                .atZone(DateTimeUtils.getZoneId(toTz))
                .toInstant
              val df = Seq(localTs).toDF("localTs")
              val res = df.select(to_utc_timestamp(col("localTs"), toTz)).first().apply(0)
              if (instant != res) {
                println(s"system = $systemTz session = $sessionTz to = $toTz")
              }
            }
          }
        }
      }
    } finally {
      // Always restore the JVM default time zone, not only after a failure.
      TimeZone.setDefault(defaultTz)
    }
  }
{code}
{code}
system = UTC session = PST to = UTC
system = UTC session = PST to = PST
system = UTC session = PST to = CET
system = UTC session = PST to = Africa/Dakar
system = UTC session = PST to = America/Los_Angeles
system = UTC session = PST to = Antarctica/Vostok
system = UTC session = PST to = Asia/Hong_Kong
system = UTC session = PST to = Europe/Amsterdam
...
{code}
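
The mismatch described above can be illustrated without Spark: the same zone-less timestamp string denotes different instants depending on which zone is used to interpret it. The following is only a minimal sketch using plain java.time. The helper `toInstant` and the object name are hypothetical, it does not call Spark's cast or DateTimeUtils.convertTz(), and the two zones are chosen purely for illustration.

```scala
import java.time.{Duration, Instant, LocalDateTime, ZoneId}

object SessionVsSystemZone {
  // Hypothetical helper: interpret a zone-less timestamp string
  // as wall-clock time in the given zone.
  def toInstant(localTs: String, zone: ZoneId): Instant =
    LocalDateTime.parse(localTs).atZone(zone).toInstant

  def main(args: Array[String]): Unit = {
    val localTs = "2020-02-04T22:42:10"
    // Assumed zones for illustration: JVM system zone vs. session zone.
    val systemTz = ZoneId.of("UTC")
    val sessionTz = ZoneId.of("America/Los_Angeles")

    val viaSystem = toInstant(localTs, systemTz)
    val viaSession = toInstant(localTs, sessionTz)

    println(s"interpreted in system zone:  $viaSystem")  // 2020-02-04T22:42:10Z
    println(s"interpreted in session zone: $viaSession") // 2020-02-05T06:42:10Z
    println(s"difference: ${Duration.between(viaSystem, viaSession)}") // PT8H
  }
}
```

If a conversion routine assumes the first interpretation while the cast actually used the second, every result is off by the offset between the two zones, which is consistent with the failures listed above for system = UTC and session = PST.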


