This is an automated email from the ASF dual-hosted git repository. wenchen pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 2249653  [SPARK-31328][SQL] Fix rebasing of overlapped local timestamps during daylight saving time
2249653 is described below

commit 22496535c96a6d210640d240f338840e9f655c49
Author: Maxim Gekk <max.g...@gmail.com>
AuthorDate: Fri Apr 3 04:35:31 2020 +0000

[SPARK-31328][SQL] Fix rebasing of overlapped local timestamps during daylight saving time

### What changes were proposed in this pull request?
1. Fix the `rebaseGregorianToJulianMicros()` function in `DateTimeUtils` by passing the daylight saving offset associated with the input `micros` to the constructed instance of `GregorianCalendar`. The problem is in `cal.getTimeInMillis`, which returns the earliest instant in the case of local date-time overlaps, see https://github.com/AdoptOpenJDK/openjdk-jdk8u/blob/master/jdk/src/share/classes/java/util/GregorianCalendar.java#L2783-L2786 . I fixed the issue by keeping the standard zone o [...]
2. Fix `rebaseJulianToGregorianMicros()` by adjusting the resulting zoned date-time if `DST_OFFSET` is zero, which means the input date-time has passed an autumn daylight savings cutover. In that case, I take the later of the 2 overlapped timestamps. Otherwise, I return the zoned date-time without any modification, because that is equivalent to calling the `withEarlierOffsetAtOverlap()` method, so the call can be skipped.

### Why are the changes needed?
This fixes the bug of losing the DST offset information when rebasing timestamps via local date-time.
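The overlap behind this bug can be reproduced with plain `java.time`. The following is a minimal sketch, not code from the patch: the object `OverlapSketch` and its helper names are hypothetical, and `dstAwareRoundTrip` only loosely mirrors the idea of using the DST offset of the original instant to pick between the two candidate offsets.

```scala
import java.time.{Instant, LocalDateTime, ZoneId}

// Illustrative only: `OverlapSketch` and its members are hypothetical names,
// they do not exist in Spark.
object OverlapSketch {
  val zone: ZoneId = ZoneId.of("America/Los_Angeles")

  // Two distinct instants around the autumn cutover on 2019-11-03.
  val beforeCutover: Instant = Instant.parse("2019-11-03T08:00:00Z") // 01:00 at -07:00
  val afterCutover: Instant = Instant.parse("2019-11-03T09:00:00Z")  // 01:00 at -08:00

  def toLocal(instant: Instant): LocalDateTime =
    instant.atZone(zone).toLocalDateTime

  // Going through a local date-time alone loses the offset:
  // `atZone` resolves an overlap to the earlier (summer time) offset.
  def naiveRoundTrip(instant: Instant): Instant =
    toLocal(instant).atZone(zone).toInstant

  // Assumption of this sketch: a zero DST amount at the original instant
  // means the later (winter time) offset is the right one at an overlap.
  // Outside an overlap, `withLaterOffsetAtOverlap` changes nothing.
  def dstAwareRoundTrip(instant: Instant): Instant = {
    val zdt = toLocal(instant).atZone(zone)
    val dst = zone.getRules.getDaylightSavings(instant)
    if (dst.isZero) zdt.withLaterOffsetAtOverlap().toInstant else zdt.toInstant
  }

  def main(args: Array[String]): Unit = {
    println(toLocal(beforeCutover) == toLocal(afterCutover)) // true
    println(naiveRoundTrip(afterCutover))                    // 2019-11-03T08:00:00Z
    println(dstAwareRoundTrip(afterCutover))                 // 2019-11-03T09:00:00Z
  }
}
```

The naive round trip shifts the second instant back by one hour, which is the symptom described in this commit message; the DST-aware variant returns both instants unchanged.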
For example, there are 2 different timestamps in the `America/Los_Angeles` time zone: `2019-11-03T01:00:00-07:00` and `2019-11-03T01:00:00-08:00`, though they are mapped to the same local date-time `2019-11-03T01:00`, see
<img width="456" alt="Screen Shot 2020-04-02 at 10 19 24" src="https://user-images.githubusercontent.com/1580697/78245697-95a7da00-74f0-11ea-9eba-c08138851cb3.png">

Currently, the UTC timestamp `2019-11-03T09:00:00Z` is converted to `2019-11-03T01:00:00-08:00`, then to the local timestamp `2019-11-03T01:00:00` (in the original calendar, for instance the Proleptic Gregorian calendar), and back to the UTC timestamp `2019-11-03T08:00:00Z` (in the hybrid calendar, which is Gregorian for this timestamp). That's wrong because the local timestamp must be converted back to the original timestamp `2019-11-03T09:00:00Z`.

### Does this PR introduce any user-facing change?
Yes

### How was this patch tested?
- Added a test to `DateTimeUtilsSuite` which checks that rebased micros are the same as the input during DST. The result must be the same if Java 8 and 7 time API functions return the same time zone offsets.
- Run the following code to check that there is no difference between rebased and original micros for modern timestamps:
```scala
test("rebasing differences") {
  withDefaultTimeZone(getZoneId("America/Los_Angeles")) {
    val start = instantToMicros(LocalDateTime.of(1, 1, 1, 0, 0, 0)
      .atZone(getZoneId("America/Los_Angeles"))
      .toInstant)
    val end = instantToMicros(LocalDateTime.of(2030, 1, 1, 0, 0, 0)
      .atZone(getZoneId("America/Los_Angeles"))
      .toInstant)

    var micros = start
    var diff = Long.MaxValue
    var counter = 0
    while (micros < end) {
      val rebased = rebaseGregorianToJulianMicros(micros)
      val curDiff = rebased - micros
      if (curDiff != diff) {
        counter += 1
        diff = curDiff
        val ldt = microsToInstant(micros).atZone(getZoneId("America/Los_Angeles")).toLocalDateTime
        println(s"local date-time = $ldt diff = ${diff / MICROS_PER_MINUTE} minutes")
      }
      micros += 30 * MICROS_PER_MINUTE
    }
    println(s"counter = $counter")
  }
}
```
```
local date-time = 0001-01-01T00:00 diff = -2872 minutes
local date-time = 0100-03-01T00:00 diff = -1432 minutes
local date-time = 0200-03-01T00:00 diff = 7 minutes
local date-time = 0300-03-01T00:00 diff = 1447 minutes
local date-time = 0500-03-01T00:00 diff = 2887 minutes
local date-time = 0600-03-01T00:00 diff = 4327 minutes
local date-time = 0700-03-01T00:00 diff = 5767 minutes
local date-time = 0900-03-01T00:00 diff = 7207 minutes
local date-time = 1000-03-01T00:00 diff = 8647 minutes
local date-time = 1100-03-01T00:00 diff = 10087 minutes
local date-time = 1300-03-01T00:00 diff = 11527 minutes
local date-time = 1400-03-01T00:00 diff = 12967 minutes
local date-time = 1500-03-01T00:00 diff = 14407 minutes
local date-time = 1582-10-15T00:00 diff = 7 minutes
local date-time = 1883-11-18T12:22:58 diff = 0 minutes
counter = 15
```
The code is not added to `DateTimeUtilsSuite` because it takes > 30 seconds.
- By running the updated benchmark `DateTimeRebaseBenchmark` via the command:
```
SPARK_GENERATE_BENCHMARK_FILES=1 build/sbt "sql/test:runMain org.apache.spark.sql.execution.benchmark.DateTimeRebaseBenchmark"
```
in the environment:

| Item | Description |
| ---- | ----|
| Region | us-west-2 (Oregon) |
| Instance | r3.xlarge |
| AMI | ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20190722.1 (ami-06f2f779464715dc5) |
| Java | OpenJDK 1.8.0_242-8u242/11.0.6+10 |

Closes #28101 from MaxGekk/fix-local-date-overlap.

Lead-authored-by: Maxim Gekk <max.g...@gmail.com>
Co-authored-by: Max Gekk <max.g...@gmail.com>
Signed-off-by: Wenchen Fan <wenc...@databricks.com>
(cherry picked from commit 820bb9985a76a567b79ffb01bbd2d32788e1dba0)
Signed-off-by: Wenchen Fan <wenc...@databricks.com>
---
 .../spark/sql/catalyst/util/DateTimeUtils.scala    | 18 ++++-
 .../sql/catalyst/util/DateTimeUtilsSuite.scala     | 16 ++++
 .../DateTimeRebaseBenchmark-jdk11-results.txt      | 88 +++++++++++-----------
 .../benchmarks/DateTimeRebaseBenchmark-results.txt | 88 +++++++++++-----------
 4 files changed, 120 insertions(+), 90 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala
index 44cabe2..3512e3b 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala
@@ -997,13 +997,18 @@ object DateTimeUtils {
    * @return The rebased microseconds since the epoch in Julian calendar.
    */
   def rebaseGregorianToJulianMicros(micros: Long): Long = {
-    val ldt = microsToInstant(micros).atZone(ZoneId.systemDefault).toLocalDateTime
+    val instant = microsToInstant(micros)
+    val zoneId = ZoneId.systemDefault
+    val ldt = instant.atZone(zoneId).toLocalDateTime
     val cal = new Calendar.Builder()
       // `gregory` is a hybrid calendar that supports both
       // the Julian and Gregorian calendar systems
       .setCalendarType("gregory")
       .setDate(ldt.getYear, ldt.getMonthValue - 1, ldt.getDayOfMonth)
       .setTimeOfDay(ldt.getHour, ldt.getMinute, ldt.getSecond)
+      // Local time-line can overlaps, such as at an autumn daylight savings cutover.
+      // This setting selects the original local timestamp mapped to the given `micros`.
+      .set(Calendar.DST_OFFSET, zoneId.getRules.getDaylightSavings(instant).toMillis.toInt)
       .build()
     fromMillis(cal.getTimeInMillis) + ldt.get(ChronoField.MICRO_OF_SECOND)
   }
@@ -1036,7 +1041,16 @@ object DateTimeUtils {
       cal.get(Calendar.SECOND),
       (Math.floorMod(micros, MICROS_PER_SECOND) * NANOS_PER_MICROS).toInt)
       .plusDays(cal.get(Calendar.DAY_OF_MONTH) - 1)
-    instantToMicros(localDateTime.atZone(ZoneId.systemDefault).toInstant)
+    val zonedDateTime = localDateTime.atZone(ZoneId.systemDefault)
+    // Zero DST offset means that local clocks have switched to the winter time already.
+    // So, clocks go back one hour. We should correct zoned date-time and change
+    // the zone offset to the later of the two valid offsets at a local time-line overlap.
+    val adjustedZdt = if (cal.get(Calendar.DST_OFFSET) == 0) {
+      zonedDateTime.withLaterOffsetAtOverlap()
+    } else {
+      zonedDateTime
+    }
+    instantToMicros(adjustedZdt.toInstant)
   }

   /**
diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala
index 652abe7..f9c15f3 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala
@@ -821,4 +821,20 @@ class DateTimeUtilsSuite extends SparkFunSuite with Matchers with SQLHelper {
       days += 1
     }
   }
+
+  test("SPARK-31328: rebasing overlapped timestamps during daylight saving time") {
+    Seq(
+      LA.getId -> Seq("2019-11-03T08:00:00Z", "2019-11-03T08:30:00Z", "2019-11-03T09:00:00Z"),
+      "Europe/Amsterdam" ->
+        Seq("2019-10-27T00:00:00Z", "2019-10-27T00:30:00Z", "2019-10-27T01:00:00Z")
+    ).foreach { case (tz, ts) =>
+      withDefaultTimeZone(getZoneId(tz)) {
+        ts.foreach { str =>
+          val micros = instantToMicros(Instant.parse(str))
+          assert(rebaseGregorianToJulianMicros(micros) === micros)
+          assert(rebaseJulianToGregorianMicros(micros) === micros)
+        }
+      }
+    }
+  }
 }
diff --git a/sql/core/benchmarks/DateTimeRebaseBenchmark-jdk11-results.txt b/sql/core/benchmarks/DateTimeRebaseBenchmark-jdk11-results.txt
index 01b0639..570f7be 100644
--- a/sql/core/benchmarks/DateTimeRebaseBenchmark-jdk11-results.txt
+++ b/sql/core/benchmarks/DateTimeRebaseBenchmark-jdk11-results.txt
@@ -6,49 +6,49 @@ OpenJDK 64-Bit Server VM 11.0.6+10-post-Ubuntu-1ubuntu118.04.1 on Linux 4.15.0-1
 Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
 Save dates to parquet: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
 ------------------------------------------------------------------------------------------------------------------------
-after 1582, noop 9299 9299 0 10.8 93.0 1.0X
-before 1582, noop 9220
9220 0 10.8 92.2 1.0X
-after 1582, rebase off 20390 20390 0 4.9 203.9 0.5X
-after 1582, rebase on 20378 20378 0 4.9 203.8 0.5X
-before 1582, rebase off 20069 20069 0 5.0 200.7 0.5X
-before 1582, rebase on 20637 20637 0 4.8 206.4 0.5X
+after 1582, noop 9392 9392 0 10.6 93.9 1.0X
+before 1582, noop 9324 9324 0 10.7 93.2 1.0X
+after 1582, rebase off 20975 20975 0 4.8 209.7 0.4X
+after 1582, rebase on 20016 20016 0 5.0 200.2 0.5X
+before 1582, rebase off 20088 20088 0 5.0 200.9 0.5X
+before 1582, rebase on 20310 20310 0 4.9 203.1 0.5X

 OpenJDK 64-Bit Server VM 11.0.6+10-post-Ubuntu-1ubuntu118.04.1 on Linux 4.15.0-1063-aws
 Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
 Load dates from parquet: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
 ------------------------------------------------------------------------------------------------------------------------
-after 1582, vec off, rebase off 12927 13017 78 7.7 129.3 1.0X
-after 1582, vec off, rebase on 13127 13176 50 7.6 131.3 1.0X
-after 1582, vec on, rebase off 3725 3779 91 26.8 37.3 3.5X
-after 1582, vec on, rebase on 5134 5221 99 19.5 51.3 2.5X
-before 1582, vec off, rebase off 13049 13061 16 7.7 130.5 1.0X
-before 1582, vec off, rebase on 13877 13916 51 7.2 138.8 0.9X
-before 1582, vec on, rebase off 3702 3736 56 27.0 37.0 3.5X
-before 1582, vec on, rebase on 5567 5637 78 18.0 55.7 2.3X
+after 1582, vec off, rebase off 13371 13463 154 7.5 133.7 1.0X
+after 1582, vec off, rebase on 13482 13533 57 7.4 134.8 1.0X
+after 1582, vec on, rebase off 3713 3781 96 26.9 37.1 3.6X
+after 1582, vec on, rebase on 5153 5173 29 19.4 51.5 2.6X
+before 1582, vec off, rebase off 12939 12998 97 7.7 129.4 1.0X
+before 1582, vec off, rebase on 14160 14255 85 7.1 141.6 0.9X
+before 1582, vec on, rebase off 3748 3776 28 26.7 37.5 3.6X
+before 1582, vec on, rebase on 5532 5575 54 18.1 55.3 2.4X

 OpenJDK 64-Bit Server VM 11.0.6+10-post-Ubuntu-1ubuntu118.04.1 on Linux 4.15.0-1063-aws
 Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
 Save timestamps to parquet: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
 ------------------------------------------------------------------------------------------------------------------------
-after 1582, noop 2988 2988 0 33.5 29.9 1.0X
-before 1582, noop 3000 3000 0 33.3 30.0 1.0X
-after 1582, rebase off 16163 16163 0 6.2 161.6 0.2X
-after 1582, rebase on 68399 68399 0 1.5 684.0 0.0X
-before 1582, rebase off 16921 16921 0 5.9 169.2 0.2X
-before 1582, rebase on 74425 74425 0 1.3 744.3 0.0X
+after 1582, noop 2795 2795 0 35.8 27.9 1.0X
+before 1582, noop 2806 2806 0 35.6 28.1 1.0X
+after 1582, rebase off 16113 16113 0 6.2 161.1 0.2X
+after 1582, rebase on 70198 70198 0 1.4 702.0 0.0X
+before 1582, rebase off 16690 16690 0 6.0 166.9 0.2X
+before 1582, rebase on 75706 75706 0 1.3 757.1 0.0X

 OpenJDK 64-Bit Server VM 11.0.6+10-post-Ubuntu-1ubuntu118.04.1 on Linux 4.15.0-1063-aws
 Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
 Load timestamps from parquet: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
 ------------------------------------------------------------------------------------------------------------------------
-after 1582, vec off, rebase off 15147 15258 97 6.6 151.5 1.0X
-after 1582, vec off, rebase on 45035 45101 60 2.2 450.3 0.3X
-after 1582, vec on, rebase off 4934 5012 100 20.3 49.3 3.1X
-after 1582, vec on, rebase on 34263 34360 88 2.9 342.6 0.4X
-before 1582, vec off, rebase off 15177 15220 37 6.6 151.8 1.0X
-before 1582, vec off, rebase on 46754 46761 12 2.1 467.5 0.3X
-before 1582, vec on, rebase off 4892 4956 61 20.4 48.9 3.1X
-before 1582, vec on, rebase on 35989 36014 22 2.8 359.9 0.4X
+after 1582, vec off, rebase off 15631 15753 111 6.4 156.3 1.0X
+after 1582, vec off, rebase on 45834 46027 193 2.2 458.3 0.3X
+after 1582, vec on, rebase off 4883 4964 70 20.5 48.8 3.2X
+after 1582, vec on, rebase on 34514 34563 63 2.9 345.1 0.5X
+before 1582, vec off, rebase off 15253 15354 104 6.6 152.5 1.0X
+before 1582, vec off, rebase on 47353 47412 59 2.1 473.5 0.3X
+before 1582, vec on, rebase off 4848 4894 69 20.6 48.5 3.2X
+before 1582, vec on, rebase on 36125 36143 22 2.8 361.3 0.4X


 ================================================================================================
@@ -59,36 +59,36 @@ OpenJDK 64-Bit Server VM 11.0.6+10-post-Ubuntu-1ubuntu118.04.1 on Linux 4.15.0-1
 Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
 Save dates to ORC: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
 ------------------------------------------------------------------------------------------------------------------------
-after 1582, noop 9295 9295 0 10.8 93.0 1.0X
-before 1582, noop 9352 9352 0 10.7 93.5 1.0X
-after 1582 17112 17112 0 5.8 171.1 0.5X
-before 1582 17979 17979 0 5.6 179.8 0.5X
+after 1582, noop 9160 9160 0 10.9 91.6 1.0X
+before 1582, noop 9235 9235 0 10.8 92.4 1.0X
+after 1582 17154 17154 0 5.8 171.5 0.5X
+before 1582 17545 17545 0 5.7 175.5 0.5X

 OpenJDK 64-Bit Server VM 11.0.6+10-post-Ubuntu-1ubuntu118.04.1 on Linux 4.15.0-1063-aws
 Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
 Load dates from ORC: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
 ------------------------------------------------------------------------------------------------------------------------
-after 1582, vec off 20874 20905 38 4.8 208.7 1.0X
-after 1582, vec on 3813 3844 28 26.2 38.1 5.5X
-before 1582, vec off 25912 25949 38 3.9 259.1 0.8X
-before 1582, vec on 4322 4343 19 23.1 43.2 4.8X
+after 1582, vec off 21024 21146 205 4.8 210.2 1.0X
+after 1582, vec on 3814 3838 21 26.2 38.1 5.5X
+before 1582, vec off 24293 24347 82 4.1 242.9 0.9X
+before 1582, vec on 4143 4168 22 24.1 41.4 5.1X

 OpenJDK 64-Bit Server VM 11.0.6+10-post-Ubuntu-1ubuntu118.04.1 on Linux 4.15.0-1063-aws
 Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
 Save timestamps to ORC: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
 ------------------------------------------------------------------------------------------------------------------------
-after 1582, noop 3003 3003 0 33.3 30.0 1.0X
-before 1582, noop 3012 3012 0 33.2 30.1 1.0X
-after 1582 41031 41031 0 2.4 410.3 0.1X
-before 1582 44436 44436 0 2.3 444.4 0.1X
+after 1582, noop 2797 2797 0 35.8 28.0 1.0X
+before 1582, noop 2826 2826 0 35.4 28.3 1.0X
+after 1582 40021 40021 0 2.5 400.2 0.1X
+before 1582 41500 41500 0 2.4 415.0 0.1X

 OpenJDK 64-Bit Server VM 11.0.6+10-post-Ubuntu-1ubuntu118.04.1 on Linux 4.15.0-1063-aws
 Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
 Load timestamps from ORC: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
 ------------------------------------------------------------------------------------------------------------------------
-after 1582, vec off 28477 28582 92 3.5 284.8 1.0X
-after 1582, vec on 20754 20924 237 4.8 207.5 1.4X
-before 1582, vec off 32858 32921 58 3.0 328.6 0.9X
-before 1582, vec on 25734 25769 30 3.9 257.3 1.1X
+after 1582, vec off 32517 32541 23 3.1 325.2 1.0X
+after 1582, vec on 19644 19725 128 5.1 196.4 1.7X
+before 1582, vec off 37204 37305 104 2.7 372.0 0.9X
+before 1582, vec on 24105 24120 13 4.1 241.1 1.3X
diff --git a/sql/core/benchmarks/DateTimeRebaseBenchmark-results.txt b/sql/core/benchmarks/DateTimeRebaseBenchmark-results.txt
index b353013..2f8712c 100644
--- a/sql/core/benchmarks/DateTimeRebaseBenchmark-results.txt
+++ b/sql/core/benchmarks/DateTimeRebaseBenchmark-results.txt
@@ -6,49 +6,49 @@ OpenJDK 64-Bit Server VM 1.8.0_242-8u242-b08-0ubuntu3~18.04-b08 on Linux 4.15.0-
 Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
 Save dates to parquet: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
 ------------------------------------------------------------------------------------------------------------------------
-after 1582, noop 9691 9691 0 10.3 96.9 1.0X
-before 1582, noop 9024 9024 0 11.1 90.2 1.1X
-after 1582, rebase off 21195 21195 0 4.7 211.9
0.5X
-after 1582, rebase on 20045 20045 0 5.0 200.4 0.5X
-before 1582, rebase off 20039 20039 0 5.0 200.4 0.5X
-before 1582, rebase on 20451 20451 0 4.9 204.5 0.5X
+after 1582, noop 9488 9488 0 10.5 94.9 1.0X
+before 1582, noop 9301 9301 0 10.8 93.0 1.0X
+after 1582, rebase off 20109 20109 0 5.0 201.1 0.5X
+after 1582, rebase on 20004 20004 0 5.0 200.0 0.5X
+before 1582, rebase off 19906 19906 0 5.0 199.1 0.5X
+before 1582, rebase on 20466 20466 0 4.9 204.7 0.5X

 OpenJDK 64-Bit Server VM 1.8.0_242-8u242-b08-0ubuntu3~18.04-b08 on Linux 4.15.0-1063-aws
 Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
 Load dates from parquet: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
 ------------------------------------------------------------------------------------------------------------------------
-after 1582, vec off, rebase off 13207 13339 116 7.6 132.1 1.0X
-after 1582, vec off, rebase on 13408 13446 57 7.5 134.1 1.0X
-after 1582, vec on, rebase off 3680 3712 39 27.2 36.8 3.6X
-after 1582, vec on, rebase on 5229 5261 29 19.1 52.3 2.5X
-before 1582, vec off, rebase off 13135 13164 25 7.6 131.4 1.0X
-before 1582, vec off, rebase on 13946 14033 94 7.2 139.5 0.9X
-before 1582, vec on, rebase off 3689 3726 49 27.1 36.9 3.6X
-before 1582, vec on, rebase on 5679 5687 9 17.6 56.8 2.3X
+after 1582, vec off, rebase off 12593 12653 52 7.9 125.9 1.0X
+after 1582, vec off, rebase on 13350 13489 121 7.5 133.5 0.9X
+after 1582, vec on, rebase off 3665 3681 25 27.3 36.6 3.4X
+after 1582, vec on, rebase on 5193 5210 16 19.3 51.9 2.4X
+before 1582, vec off, rebase off 13023 13059 32 7.7 130.2 1.0X
+before 1582, vec off, rebase on 13855 13937 115 7.2 138.6 0.9X
+before 1582, vec on, rebase off 3651 3665 12 27.4 36.5 3.4X
+before 1582, vec on, rebase on 5623 5671 45 17.8 56.2 2.2X

 OpenJDK 64-Bit Server VM 1.8.0_242-8u242-b08-0ubuntu3~18.04-b08 on Linux 4.15.0-1063-aws
 Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
 Save timestamps to parquet: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
 ------------------------------------------------------------------------------------------------------------------------
-after 1582, noop 2720 2720 0 36.8 27.2 1.0X
-before 1582, noop 2712 2712 0 36.9 27.1 1.0X
-after 1582, rebase off 16626 16626 0 6.0 166.3 0.2X
-after 1582, rebase on 85136 85136 0 1.2 851.4 0.0X
-before 1582, rebase off 16855 16855 0 5.9 168.6 0.2X
-before 1582, rebase on 106121 106121 0 0.9 1061.2 0.0X
+after 1582, noop 2798 2798 0 35.7 28.0 1.0X
+before 1582, noop 2955 2955 0 33.8 29.6 0.9X
+after 1582, rebase off 15889 15889 0 6.3 158.9 0.2X
+after 1582, rebase on 84247 84247 0 1.2 842.5 0.0X
+before 1582, rebase off 16134 16134 0 6.2 161.3 0.2X
+before 1582, rebase on 100006 100006 0 1.0 1000.1 0.0X

 OpenJDK 64-Bit Server VM 1.8.0_242-8u242-b08-0ubuntu3~18.04-b08 on Linux 4.15.0-1063-aws
 Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
 Load timestamps from parquet: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
 ------------------------------------------------------------------------------------------------------------------------
-after 1582, vec off, rebase off 15198 15301 90 6.6 152.0 1.0X
-after 1582, vec off, rebase on 55210 55370 140 1.8 552.1 0.3X
-after 1582, vec on, rebase off 4859 4880 19 20.6 48.6 3.1X
-after 1582, vec on, rebase on 44758 44824 85 2.2 447.6 0.3X
-before 1582, vec off, rebase off 15206 15316 112 6.6 152.1 1.0X
-before 1582, vec off, rebase on 60452 60588 222 1.7 604.5 0.3X
-before 1582, vec on, rebase off 4892 4933 36 20.4 48.9 3.1X
-before 1582, vec on, rebase on 46871 46950 82 2.1 468.7 0.3X
+after 1582, vec off, rebase off 14920 15045 116 6.7 149.2 1.0X
+after 1582, vec off, rebase on 55062 55171 140 1.8 550.6 0.3X
+after 1582, vec on, rebase off 4871 4952 72 20.5 48.7 3.1X
+after 1582, vec on, rebase on 44955 44981 23 2.2 449.5 0.3X
+before 1582, vec off, rebase off 15236 15386 142 6.6 152.4 1.0X
+before 1582, vec off, rebase on 57290 57368 79 1.7 572.9 0.3X
+before 1582, vec on, rebase off 4919 4930 15 20.3 49.2 3.0X
+before 1582, vec on, rebase on 47351 47713 400 2.1 473.5 0.3X


 ================================================================================================
@@ -59,36 +59,36 @@ OpenJDK 64-Bit Server VM 1.8.0_242-8u242-b08-0ubuntu3~18.04-b08 on Linux 4.15.0-
 Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
 Save dates to ORC: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
 ------------------------------------------------------------------------------------------------------------------------
-after 1582, noop 9102 9102 0 11.0 91.0 1.0X
-before 1582, noop 9099 9099 0 11.0 91.0 1.0X
-after 1582 17652 17652 0 5.7 176.5 0.5X
-before 1582 18284 18284 0 5.5 182.8 0.5X
+after 1582, noop 9451 9451 0 10.6 94.5 1.0X
+before 1582, noop 9765 9765 0 10.2 97.7 1.0X
+after 1582 18722 18722 0 5.3 187.2 0.5X
+before 1582 18864 18864 0 5.3 188.6 0.5X

 OpenJDK 64-Bit Server VM 1.8.0_242-8u242-b08-0ubuntu3~18.04-b08 on Linux 4.15.0-1063-aws
 Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
 Load dates from ORC: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
 ------------------------------------------------------------------------------------------------------------------------
-after 1582, vec off 25169 25215 48 4.0 251.7 1.0X
-after 1582, vec on 3701 3717 16 27.0 37.0 6.8X
-before 1582, vec off 26919 27045 182 3.7 269.2 0.9X
-before 1582, vec on 4169 4192 31 24.0 41.7 6.0X
+after 1582, vec off 24897 25095 247 4.0 249.0 1.0X
+after 1582, vec on 3719 3780 84 26.9 37.2 6.7X
+before 1582, vec off 31290 31347 50 3.2 312.9 0.8X
+before 1582, vec on 4166 4188 25 24.0 41.7 6.0X

 OpenJDK 64-Bit Server VM 1.8.0_242-8u242-b08-0ubuntu3~18.04-b08 on Linux 4.15.0-1063-aws
 Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
 Save timestamps to ORC: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
 ------------------------------------------------------------------------------------------------------------------------
-after 1582, noop 2906 2906 0 34.4 29.1 1.0X
-before 1582, noop 2863 2863 0 34.9 28.6 1.0X
-after 1582 48858 48858 0 2.0 488.6 0.1X
-before 1582 50945 50945 0 2.0 509.5 0.1X
+after 1582, noop 2882 2882 0 34.7 28.8 1.0X
+before 1582, noop 2991 2991 0 33.4 29.9 1.0X
+after 1582 53951 53951 0 1.9 539.5 0.1X
+before 1582 54276 54276 0 1.8 542.8 0.1X

 OpenJDK 64-Bit Server VM 1.8.0_242-8u242-b08-0ubuntu3~18.04-b08 on Linux 4.15.0-1063-aws
 Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
 Load timestamps from ORC: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
 ------------------------------------------------------------------------------------------------------------------------
-after 1582, vec off 40925 40955 26 2.4 409.2 1.0X
-after 1582, vec on 31246 31404 164 3.2 312.5 1.3X
-before 1582, vec off 44634 44680 40 2.2 446.3 0.9X
-before 1582, vec on 35578 35834 282 2.8 355.8 1.2X
+after 1582, vec off 41411 41514 97 2.4 414.1 1.0X
+after 1582, vec on 32163 32201 36 3.1 321.6 1.3X
+before 1582, vec off 43013 43111 131 2.3 430.1 1.0X
+before 1582, vec on 34114 34152 45 2.9 341.1 1.2X


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org