[ https://issues.apache.org/jira/browse/SPARK-31449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Maxim Gekk updated SPARK-31449:
-------------------------------
    Issue Type: Improvement  (was: Question)

> Investigate the difference between JDK and Spark's time zone offset calculation
> -------------------------------------------------------------------------------
>
>                 Key: SPARK-31449
>                 URL: https://issues.apache.org/jira/browse/SPARK-31449
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.4.5
>            Reporter: Maxim Gekk
>            Priority: Major
>
> Spark 2.4 calculates time zone offsets from a wall-clock timestamp using
> `DateTimeUtils.getOffsetFromLocalMillis()` (see
> https://github.com/apache/spark/blob/branch-2.4/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala#L1088-L1118):
> {code:scala}
>   private[sql] def getOffsetFromLocalMillis(millisLocal: Long, tz: TimeZone): Long = {
>     var guess = tz.getRawOffset
>     // the actual offset should be calculated based on milliseconds in UTC
>     val offset = tz.getOffset(millisLocal - guess)
>     if (offset != guess) {
>       guess = tz.getOffset(millisLocal - offset)
>       if (guess != offset) {
>         // fallback to do the reverse lookup using java.sql.Timestamp
>         // this should only happen near the start or end of DST
>         val days = Math.floor(millisLocal.toDouble / MILLIS_PER_DAY).toInt
>         val year = getYear(days)
>         val month = getMonth(days)
>         val day = getDayOfMonth(days)
>         var millisOfDay = (millisLocal % MILLIS_PER_DAY).toInt
>         if (millisOfDay < 0) {
>           millisOfDay += MILLIS_PER_DAY.toInt
>         }
>         val seconds = (millisOfDay / 1000L).toInt
>         val hh = seconds / 3600
>         val mm = seconds / 60 % 60
>         val ss = seconds % 60
>         val ms = millisOfDay % 1000
>         val calendar = Calendar.getInstance(tz)
>         calendar.set(year, month - 1, day, hh, mm, ss)
>         calendar.set(Calendar.MILLISECOND, ms)
>         guess = (millisLocal - calendar.getTimeInMillis()).toInt
>       }
>     }
>     guess
>   }
> {code}
> Meanwhile, JDK's GregorianCalendar uses special methods of ZoneInfo, see
> https://github.com/AdoptOpenJDK/openjdk-jdk8u/blob/aa318070b27849f1fe00d14684b2a40f7b29bf79/jdk/src/share/classes/java/util/GregorianCalendar.java#L2795-L2801:
> {code:java}
>   if (zone instanceof ZoneInfo) {
>       ((ZoneInfo)zone).getOffsetsByWall(millis, zoneOffsets);
>   } else {
>       int gmtOffset = isFieldSet(fieldMask, ZONE_OFFSET) ?
>                           internalGet(ZONE_OFFSET) : zone.getRawOffset();
>       zone.getOffsets(millis - gmtOffset, zoneOffsets);
>   }
> {code}
> Need to investigate whether there are any differences in results between the two approaches.
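
One way to start the investigation is a small standalone harness that runs both strategies over the same wall-clock values and prints any disagreement. The sketch below is an illustration only, not Spark or JDK code: the class `OffsetComparison` and its methods `localMillis`, `calendarOffset` and `sparkStyleOffset` are hypothetical names. It assumes AD dates only, and it decomposes the wall-clock value with a UTC `GregorianCalendar` where Spark uses its own `getYear`/`getMonth`/`getDayOfMonth` helpers, so results for very old dates may not match Spark exactly.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

final class OffsetComparison {

    /** Wall-clock fields to "local millis": the reading interpreted as if it were UTC. */
    static long localMillis(int year, int month, int day, int hour, int minute) {
        return LocalDateTime.of(year, month, day, hour, minute)
                .toInstant(ZoneOffset.UTC).toEpochMilli();
    }

    /** JDK path: let GregorianCalendar resolve the wall-clock fields in the given zone. */
    static long calendarOffset(long millisLocal, TimeZone tz) {
        // Assumption: field decomposition via a UTC calendar (AD dates only).
        Calendar utc = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        utc.setTimeInMillis(millisLocal);
        Calendar local = new GregorianCalendar(tz);
        local.clear();
        local.set(utc.get(Calendar.YEAR), utc.get(Calendar.MONTH),
                utc.get(Calendar.DAY_OF_MONTH), utc.get(Calendar.HOUR_OF_DAY),
                utc.get(Calendar.MINUTE), utc.get(Calendar.SECOND));
        local.set(Calendar.MILLISECOND, utc.get(Calendar.MILLISECOND));
        return millisLocal - local.getTimeInMillis();
    }

    /** Java port of the two-guess lookup from the Spark 2.4 snippet quoted in the issue. */
    static long sparkStyleOffset(long millisLocal, TimeZone tz) {
        int guess = tz.getRawOffset();
        int offset = tz.getOffset(millisLocal - guess);
        if (offset != guess) {
            guess = tz.getOffset(millisLocal - offset);
            if (guess != offset) {
                // The two guesses disagree: fall back to the calendar, as Spark does.
                return calendarOffset(millisLocal, tz);
            }
        }
        return guess;
    }

    public static void main(String[] args) {
        TimeZone tz = TimeZone.getTimeZone("America/Los_Angeles");
        long start = localMillis(2020, 1, 1, 0, 0);
        long end = localMillis(2021, 1, 1, 0, 0);
        int mismatches = 0;
        // Step through the year in 30-minute wall-clock increments.
        for (long t = start; t < end; t += 30L * 60 * 1000) {
            long spark = sparkStyleOffset(t, tz);
            long jdk = calendarOffset(t, tz);
            if (spark != jdk) {
                mismatches++;
                System.out.println(t + ": spark-style=" + spark + " calendar=" + jdk);
            }
        }
        System.out.println("mismatches: " + mismatches);
    }
}
```

Away from DST transitions both methods should return the zone's ordinary offset, so the interesting inputs are wall-clock values inside gaps and overlaps, and zones with historical changes to the raw offset. Changing the zone ID and the date range in `main` would extend the scan to those cases.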