MaxGekk commented on a change in pull request #28067: [WIP][SPARK-31297][SQL] Speed up dates rebasing
URL: https://github.com/apache/spark/pull/28067#discussion_r400120317
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala
##########
@@ -1033,6 +1033,40 @@ object DateTimeUtils {
instantToMicros(localDateTime.atZone(ZoneId.systemDefault).toInstant)
}
+ /**
+ * Rebases days since the epoch from an original to a target calendar, for instance
+ * from a hybrid (Julian + Gregorian) to Proleptic Gregorian calendar.
+ *
+ * It finds the latest switch day which is less than or equal to `days`, and adds the
+ * difference in days associated with the switch day to the given `days`. The function
+ * is based on a linear search that starts from the most recent switch day, so modern
+ * dates need fewer comparisons.
+ *
+ * @param switchDays The days on which the difference in days between the original and
+ * target calendar changed.
+ * @param diffs The differences in days between calendars.
+ * @param days The number of days since the epoch 1970-01-01 to be rebased to the
+ * target calendar.
+ * @return The rebased day
+ */
+ private def rebaseDays(switchDays: Array[Int], diffs: Array[Int], days: Int): Int = {
+ var i = switchDays.length - 1
+ while (i >= 0 && days < switchDays(i)) {
+ i -= 1
+ }
+ val rebased = days + diffs(if (i < 0) 0 else i)
+ rebased
+ }
+
+ // The differences in days between Julian and Proleptic Gregorian dates.
+ // The diff at the index `i` is applicable for all days in the date interval:
+ // [julianGregDiffSwitchDay(i), julianGregDiffSwitchDay(i+1))
Review comment:
Dates before `0001-01-01` are outside the supported range, so the current
implementation just returns a constant diff of 2 days for them.
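
To make the fallback concrete, here is a minimal standalone sketch of the same backward linear search. The object name `RebaseSketch` and the tables `toySwitchDays` and `toyDiffs` are hypothetical toy values chosen for illustration, not the tables used in this PR. It only shows that any `days` earlier than the first switch day picks up the first diff.

```scala
object RebaseSketch {
  // Same backward linear search as in the diff: walk from the most recent
  // switch day until one that is less than or equal to `days` is found.
  def rebaseDays(switchDays: Array[Int], diffs: Array[Int], days: Int): Int = {
    var i = switchDays.length - 1
    while (i >= 0 && days < switchDays(i)) {
      i -= 1
    }
    // i < 0 means `days` precedes every switch day, so the first diff is reused.
    days + diffs(if (i < 0) 0 else i)
  }

  // Hypothetical toy tables (assumption for illustration only, NOT Spark's values).
  // toyDiffs(i) applies to days in [toySwitchDays(i), toySwitchDays(i + 1)).
  val toySwitchDays: Array[Int] = Array(-300, -200, -100)
  val toyDiffs: Array[Int] = Array(2, 1, 0)

  def main(args: Array[String]): Unit = {
    Seq(0, -150, -200, -250, -1000).foreach { d =>
      println(s"$d -> ${rebaseDays(toySwitchDays, toyDiffs, d)}")
    }
  }
}
```

With these toy tables, a day on or after the last switch day is found on the first comparison, while a day before the first switch day falls through the whole loop and still gets the first diff, which is the constant-diff behavior described above.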