[ 
https://issues.apache.org/jira/browse/SPARK-31449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maxim Gekk updated SPARK-31449:
-------------------------------
    Issue Type: Improvement  (was: Question)

> Investigate the difference between JDK and Spark's time zone offset calculation
> -------------------------------------------------------------------------------
>
>                 Key: SPARK-31449
>                 URL: https://issues.apache.org/jira/browse/SPARK-31449
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.4.5
>            Reporter: Maxim Gekk
>            Priority: Major
>
> Spark 2.4 calculates the time zone offset from a wall-clock timestamp using 
> `DateTimeUtils.getOffsetFromLocalMillis()` (see 
> https://github.com/apache/spark/blob/branch-2.4/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala#L1088-L1118):
> {code:scala}
>   private[sql] def getOffsetFromLocalMillis(millisLocal: Long, tz: TimeZone): Long = {
>     var guess = tz.getRawOffset
>     // the actual offset should be calculated based on milliseconds in UTC
>     val offset = tz.getOffset(millisLocal - guess)
>     if (offset != guess) {
>       guess = tz.getOffset(millisLocal - offset)
>       if (guess != offset) {
>         // fallback to do the reverse lookup using java.sql.Timestamp
>         // this should only happen near the start or end of DST
>         val days = Math.floor(millisLocal.toDouble / MILLIS_PER_DAY).toInt
>         val year = getYear(days)
>         val month = getMonth(days)
>         val day = getDayOfMonth(days)
>         var millisOfDay = (millisLocal % MILLIS_PER_DAY).toInt
>         if (millisOfDay < 0) {
>           millisOfDay += MILLIS_PER_DAY.toInt
>         }
>         val seconds = (millisOfDay / 1000L).toInt
>         val hh = seconds / 3600
>         val mm = seconds / 60 % 60
>         val ss = seconds % 60
>         val ms = millisOfDay % 1000
>         val calendar = Calendar.getInstance(tz)
>         calendar.set(year, month - 1, day, hh, mm, ss)
>         calendar.set(Calendar.MILLISECOND, ms)
>         guess = (millisLocal - calendar.getTimeInMillis()).toInt
>       }
>     }
>     guess
>   }
> {code}
> Meanwhile, JDK's GregorianCalendar uses special methods of ZoneInfo, see 
> https://github.com/AdoptOpenJDK/openjdk-jdk8u/blob/aa318070b27849f1fe00d14684b2a40f7b29bf79/jdk/src/share/classes/java/util/GregorianCalendar.java#L2795-L2801:
> {code:java}
>             if (zone instanceof ZoneInfo) {
>                 ((ZoneInfo)zone).getOffsetsByWall(millis, zoneOffsets);
>             } else {
>                 int gmtOffset = isFieldSet(fieldMask, ZONE_OFFSET) ?
>                                     internalGet(ZONE_OFFSET) : zone.getRawOffset();
>                 zone.getOffsets(millis - gmtOffset, zoneOffsets);
>             }
> {code}
> We need to investigate whether the two approaches produce different results, in particular for wall-clock timestamps near DST transitions.
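
One possible starting point for that investigation is a small harness that computes the offset both ways for a range of wall-clock timestamps and counts disagreements. The sketch below is an assumption-laden illustration, not Spark code: the class `OffsetComparison` and its method names are hypothetical, the Spark-style method is a hand port of the Scala snippet above, and the fallback derives the date fields from a UTC `GregorianCalendar` instead of Spark's `getYear`/`getMonth`/`getDayOfMonth` helpers, which should only be treated as equivalent for modern Gregorian dates. It also uses `new GregorianCalendar(tz)` rather than `Calendar.getInstance(tz)` so the result does not depend on the default locale.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

class OffsetComparison {
    private static final TimeZone UTC = TimeZone.getTimeZone("UTC");

    // Offset obtained by letting GregorianCalendar resolve the wall-clock fields in tz.
    static int calendarOffset(long millisLocal, TimeZone tz) {
        // Read the wall-clock fields by interpreting millisLocal as if it were UTC.
        GregorianCalendar utc = new GregorianCalendar(UTC);
        utc.setTimeInMillis(millisLocal);
        GregorianCalendar cal = new GregorianCalendar(tz);
        cal.clear();
        cal.set(utc.get(Calendar.YEAR), utc.get(Calendar.MONTH),
                utc.get(Calendar.DAY_OF_MONTH), utc.get(Calendar.HOUR_OF_DAY),
                utc.get(Calendar.MINUTE), utc.get(Calendar.SECOND));
        cal.set(Calendar.MILLISECOND, utc.get(Calendar.MILLISECOND));
        return (int) (millisLocal - cal.getTimeInMillis());
    }

    // Hand port of the Spark 2.4 logic quoted above (hypothetical name).
    static int sparkStyleOffset(long millisLocal, TimeZone tz) {
        int guess = tz.getRawOffset();
        // The actual offset should be looked up using milliseconds in UTC.
        int offset = tz.getOffset(millisLocal - guess);
        if (offset != guess) {
            guess = tz.getOffset(millisLocal - offset);
            if (guess != offset) {
                // Near a transition: fall back to the calendar-based reverse lookup.
                guess = calendarOffset(millisLocal, tz);
            }
        }
        return guess;
    }

    // Counts wall-clock timestamps in [fromLocal, toLocal) where the two results differ.
    static int countMismatches(TimeZone tz, long fromLocal, long toLocal, long stepMillis) {
        int mismatches = 0;
        for (long t = fromLocal; t < toLocal; t += stepMillis) {
            if (sparkStyleOffset(t, tz) != calendarOffset(t, tz)) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        TimeZone la = TimeZone.getTimeZone("America/Los_Angeles");
        long start = 1583539200000L;             // wall clock 2020-03-07 00:00:00
        long end = start + 3L * 24 * 3600 * 1000; // three days, spanning the DST start
        int n = countMismatches(la, start, end, 15L * 60 * 1000);
        System.out.println("Mismatches around DST start: " + n);
    }
}
```

Running the same loop over other zones and over the autumn transition (where a wall-clock time occurs twice) would show whether the fast path of the Spark-style lookup ever picks a different offset than the calendar does.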



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

