Maxim Gekk created SPARK-31449:
----------------------------------

             Summary: Is there a difference between JDK and Spark's time zone offset calculation
                 Key: SPARK-31449
                 URL: https://issues.apache.org/jira/browse/SPARK-31449
             Project: Spark
          Issue Type: Question
          Components: SQL
    Affects Versions: 2.4.5
            Reporter: Maxim Gekk


Spark 2.4 calculates the time zone offset for a wall-clock timestamp using 
`DateTimeUtils.getOffsetFromLocalMillis()` (see 
https://github.com/apache/spark/blob/branch-2.4/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala#L1088-L1118):
{code:scala}
  private[sql] def getOffsetFromLocalMillis(millisLocal: Long, tz: TimeZone): Long = {
    var guess = tz.getRawOffset
    // the actual offset should be calculated based on milliseconds in UTC
    val offset = tz.getOffset(millisLocal - guess)
    if (offset != guess) {
      guess = tz.getOffset(millisLocal - offset)
      if (guess != offset) {
        // fallback to do the reverse lookup using java.sql.Timestamp
        // this should only happen near the start or end of DST
        val days = Math.floor(millisLocal.toDouble / MILLIS_PER_DAY).toInt
        val year = getYear(days)
        val month = getMonth(days)
        val day = getDayOfMonth(days)

        var millisOfDay = (millisLocal % MILLIS_PER_DAY).toInt
        if (millisOfDay < 0) {
          millisOfDay += MILLIS_PER_DAY.toInt
        }
        val seconds = (millisOfDay / 1000L).toInt
        val hh = seconds / 3600
        val mm = seconds / 60 % 60
        val ss = seconds % 60
        val ms = millisOfDay % 1000
        val calendar = Calendar.getInstance(tz)
        calendar.set(year, month - 1, day, hh, mm, ss)
        calendar.set(Calendar.MILLISECOND, ms)
        guess = (millisLocal - calendar.getTimeInMillis()).toInt
      }
    }
    guess
  }
{code}

Meanwhile, the JDK's GregorianCalendar calls `ZoneInfo.getOffsetsByWall()` when the zone is a ZoneInfo, 
and otherwise falls back to a lookup based on the raw offset, see 
https://github.com/AdoptOpenJDK/openjdk-jdk8u/blob/aa318070b27849f1fe00d14684b2a40f7b29bf79/jdk/src/share/classes/java/util/GregorianCalendar.java#L2795-L2801:
{code:java}
            if (zone instanceof ZoneInfo) {
                ((ZoneInfo)zone).getOffsetsByWall(millis, zoneOffsets);
            } else {
                int gmtOffset = isFieldSet(fieldMask, ZONE_OFFSET) ?
                                    internalGet(ZONE_OFFSET) : zone.getRawOffset();
                zone.getOffsets(millis - gmtOffset, zoneOffsets);
            }
{code}

We need to investigate whether the two approaches ever produce different results.
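
One possible way to start that investigation is sketched below. It is not Spark code. The class `OffsetComparison` and its methods are hypothetical names introduced for illustration. `sparkOffset` is a Java port of the two-step guess from `getOffsetFromLocalMillis()`, and `calendarOffset` delegates wall-clock resolution to `GregorianCalendar` through the public `java.util` API only (for ZoneInfo-backed zones this goes through the `getOffsetsByWall` branch quoted above). The sketch assumes dates after the Gregorian cutover and extracts date fields with a UTC calendar instead of Spark's `getYear`/`getMonth`/`getDayOfMonth`, so it only approximates the Spark 2.4 fallback.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

class OffsetComparison {
    private static final TimeZone UTC = TimeZone.getTimeZone("UTC");

    /** Encodes wall-clock fields as "local millis", i.e. as if they were UTC. */
    static long localMillis(int year, int month, int day, int hour, int minute) {
        Calendar c = new GregorianCalendar(UTC);
        c.clear();
        c.set(year, month - 1, day, hour, minute, 0);
        return c.getTimeInMillis();
    }

    /** Lets GregorianCalendar resolve the wall-clock fields in the given zone. */
    static int calendarOffset(long millisLocal, TimeZone tz) {
        // Split the local millis into fields (assumption: modern, post-cutover dates).
        Calendar fields = new GregorianCalendar(UTC);
        fields.setTimeInMillis(millisLocal);
        Calendar local = new GregorianCalendar(tz);
        local.clear();
        local.set(fields.get(Calendar.YEAR), fields.get(Calendar.MONTH),
                  fields.get(Calendar.DAY_OF_MONTH), fields.get(Calendar.HOUR_OF_DAY),
                  fields.get(Calendar.MINUTE), fields.get(Calendar.SECOND));
        local.set(Calendar.MILLISECOND, fields.get(Calendar.MILLISECOND));
        return (int) (millisLocal - local.getTimeInMillis());
    }

    /** Port of the two-step guess, with the calendar lookup as the fallback. */
    static int sparkOffset(long millisLocal, TimeZone tz) {
        int guess = tz.getRawOffset();
        // The actual offset is looked up from (approximate) UTC milliseconds.
        int offset = tz.getOffset(millisLocal - guess);
        if (offset != guess) {
            guess = tz.getOffset(millisLocal - offset);
            if (guess != offset) {
                // The two lookups disagree, expected only near a DST transition.
                guess = calendarOffset(millisLocal, tz);
            }
        }
        return guess;
    }

    /** Counts local timestamps in [from, to) where the two approaches disagree. */
    static long countMismatches(TimeZone tz, long from, long to, long stepMillis) {
        long mismatches = 0;
        for (long t = from; t < to; t += stepMillis) {
            if (sparkOffset(t, tz) != calendarOffset(t, tz)) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        TimeZone tz = TimeZone.getTimeZone("America/Los_Angeles");
        long from = localMillis(2020, 1, 1, 0, 0);
        long to = localMillis(2021, 1, 1, 0, 0);
        // Scan one year of wall-clock time in 15-minute steps.
        System.out.println("mismatches: " + countMismatches(tz, from, to, 15 * 60 * 1000L));
    }
}
```

Running the scan across all IDs returned by `TimeZone.getAvailableIDs()` and a wider range of years would show whether the early exit of the two-step guess (when the first lookup already equals the raw offset, or when the second lookup confirms the first) ever returns an offset that the calendar-based resolution would not.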



