This is an automated email from the ASF dual-hosted git repository. wenchen pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/branch-3.0 by this push: new e7885b8 [SPARK-31297][SQL] Speed up dates rebasing e7885b8 is described below commit e7885b8a6686bc9179f741f1394dbbf7a9e211ef Author: Maxim Gekk <max.g...@gmail.com> AuthorDate: Tue Mar 31 17:38:47 2020 +0800 [SPARK-31297][SQL] Speed up dates rebasing ### What changes were proposed in this pull request? In the PR, I propose to replace current implementation of the `rebaseGregorianToJulianDays()` and `rebaseJulianToGregorianDays()` functions in `DateTimeUtils` by new one which is based on the fact that difference between Proleptic Gregorian and the hybrid (Julian+Gregorian) calendars was changed only 14 times for entire supported range of valid dates `[0001-01-01, 9999-12-31]`: | date | Proleptic Greg. days | Hybrid (Julian+Greg) days | diff| | ---- | ----|----|----| |0001-01-01|-719162|-719164|-2| |0100-03-01|-682944|-682945|-1| |0200-03-01|-646420|-646420|0| |0300-03-01|-609896|-609895|1| |0500-03-01|-536847|-536845|2| |0600-03-01|-500323|-500320|3| |0700-03-01|-463799|-463795|4| |0900-03-01|-390750|-390745|5| |1000-03-01|-354226|-354220|6| |1100-03-01|-317702|-317695|7| |1300-03-01|-244653|-244645|8| |1400-03-01|-208129|-208120|9| |1500-03-01|-171605|-171595|10| |1582-10-15|-141427|-141427|0| For the given days since the epoch, the proposed implementation finds the range of days which the input days belongs to, and adds the diff in days between calendars to the input. The result is rebased days since the epoch in the target calendar. For example, if need to rebase -650000 days from Proleptic Gregorian calendar to the hybrid calendar. In that case, the input falls to the bucket [-682944, -646420), the diff associated with the range is -1. To get the rebased days in Julian calendar, we should add -1 to -650000, and the result is -650001. ### Why are the changes needed? To make dates rebasing faster. ### Does this PR introduce any user-facing change? No, the results should be the same for valid range of the `DATE` type `[0001-01-01, 9999-12-31]`. ### How was this patch tested? - Added 2 tests to `DateTimeUtilsSuite` for the `rebaseGregorianToJulianDays()` and `rebaseJulianToGregorianDays()` functions. The tests check that results of old and new implementation (optimized version) are the same for all supported dates. - Re-run `DateTimeRebaseBenchmark` on: | Item | Description | | ---- | ----| | Region | us-west-2 (Oregon) | | Instance | r3.xlarge | | AMI | ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20190722.1 (ami-06f2f779464715dc5) | | Java | OpenJDK8/11 | Closes #28067 from MaxGekk/optimize-rebasing. Lead-authored-by: Maxim Gekk <max.g...@gmail.com> Co-authored-by: Max Gekk <max.g...@gmail.com> Signed-off-by: Wenchen Fan <wenc...@databricks.com> (cherry picked from commit bb0b416f0b3a2747a420b17d1bf659891bae3274) Signed-off-by: Wenchen Fan <wenc...@databricks.com> --- .../spark/sql/catalyst/util/DateTimeUtils.scala | 79 +++++++++++++++------- .../sql/catalyst/util/DateTimeUtilsSuite.scala | 58 +++++++++++++++- .../DateTimeRebaseBenchmark-jdk11-results.txt | 64 +++++++++--------- .../benchmarks/DateTimeRebaseBenchmark-results.txt | 64 +++++++++--------- 4 files changed, 174 insertions(+), 91 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala index 2b646cc..44cabe2 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala @@ -1040,6 +1040,44 @@ object DateTimeUtils { } /** + * Rebases days since the epoch from an original to an target calendar, from instance + * from a hybrid (Julian + Gregorian) to Proleptic Gregorian calendar. + * + * It finds the latest switch day which is less than `days`, and adds the difference + * in days associated with the switch day to the given `days`. The function is based + * on linear search which starts from the most recent switch days. This allows to perform + * less comparisons for modern dates. + * + * @param switchDays The days when difference in days between original and target + * calendar was changed. + * @param diffs The differences in days between calendars. + * @param days The number of days since the epoch 1970-01-01 to be rebased to the + * target calendar. + * @return The rebased day + */ + private def rebaseDays(switchDays: Array[Int], diffs: Array[Int], days: Int): Int = { + var i = switchDays.length - 1 + while (i >= 0 && days < switchDays(i)) { + i -= 1 + } + val rebased = days + diffs(if (i < 0) 0 else i) + rebased + } + + // The differences in days between Julian and Proleptic Gregorian dates. + // The diff at the index `i` is applicable for all days in the date interval: + // [julianGregDiffSwitchDay(i), julianGregDiffSwitchDay(i+1)) + private val julianGregDiffs = Array(2, 1, 0, -1, -2, -3, -4, -5, -6, -7, -8, -9, -10, 0) + // The sorted days in Julian calendar when difference in days between Julian and + // Proleptic Gregorian calendars was changed. + // The starting point is the `0001-01-01` (-719164 days since the epoch in + // Julian calendar). All dates before the staring point have the same difference + // of 2 days in Julian and Proleptic Gregorian calendars. + private val julianGregDiffSwitchDay = Array( + -719164, -682945, -646420, -609895, -536845, -500320, -463795, + -390745, -354220, -317695, -244645, -208120, -171595, -141427) + + /** * Converts the given number of days since the epoch day 1970-01-01 to * a local date in Julian calendar, interprets the result as a local * date in Proleptic Gregorian calendar, and take the number of days @@ -1049,25 +1087,22 @@ object DateTimeUtils { * @return The rebased number of days in Gregorian calendar. */ def rebaseJulianToGregorianDays(days: Int): Int = { - val utcCal = new Calendar.Builder() - // `gregory` is a hybrid calendar that supports both - // the Julian and Gregorian calendar systems - .setCalendarType("gregory") - .setTimeZone(TimeZoneUTC) - .setInstant(Math.multiplyExact(days, MILLIS_PER_DAY)) - .build() - val localDate = LocalDate.of( - utcCal.get(Calendar.YEAR), - utcCal.get(Calendar.MONTH) + 1, - // The number of days will be added later to handle non-existing - // Julian dates in Proleptic Gregorian calendar. - // For example, 1000-02-29 exists in Julian calendar because 1000 - // is a leap year but it is not a leap year in Gregorian calendar. - 1) - .plusDays(utcCal.get(Calendar.DAY_OF_MONTH) - 1) - Math.toIntExact(localDate.toEpochDay) + rebaseDays(julianGregDiffSwitchDay, julianGregDiffs, days) } + // The differences in days between Proleptic Gregorian and Julian dates. + // The diff at the index `i` is applicable for all days in the date interval: + // [gregJulianDiffSwitchDay(i), gregJulianDiffSwitchDay(i+1)) + private val grepJulianDiffs = Array(-2, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 0) + // The sorted days in Proleptic Gregorian calendar when difference in days between + // Proleptic Gregorian and Julian was changed. + // The starting point is the `0001-01-01` (-719162 days since the epoch in + // Proleptic Gregorian calendar). All dates before the staring point have the same + // difference of -2 days in Proleptic Gregorian and Julian calendars. + private val gregJulianDiffSwitchDay = Array( + -719162, -682944, -646420, -609896, -536847, -500323, -463799, + -390750, -354226, -317702, -244653, -208129, -171605, -141427) + /** * Rebasing days since the epoch to store the same number of days * as by Spark 2.4 and earlier versions. Spark 3.0 switched to @@ -1085,14 +1120,6 @@ object DateTimeUtils { * @return The rebased number of days since the epoch in Julian calendar. */ def rebaseGregorianToJulianDays(days: Int): Int = { - val localDate = LocalDate.ofEpochDay(days) - val utcCal = new Calendar.Builder() - // `gregory` is a hybrid calendar that supports both - // the Julian and Gregorian calendar systems - .setCalendarType("gregory") - .setTimeZone(TimeZoneUTC) - .setDate(localDate.getYear, localDate.getMonthValue - 1, localDate.getDayOfMonth) - .build() - Math.toIntExact(Math.floorDiv(utcCal.getTimeInMillis, MILLIS_PER_DAY)) + rebaseDays(gregJulianDiffSwitchDay, grepJulianDiffs, days) } } diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala index 922e09d..652abe7 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala @@ -20,7 +20,7 @@ package org.apache.spark.sql.catalyst.util import java.sql.{Date, Timestamp} import java.text.SimpleDateFormat import java.time.{Instant, LocalDate, LocalDateTime, LocalTime, ZoneId} -import java.util.{Locale, TimeZone} +import java.util.{Calendar, Locale, TimeZone} import java.util.concurrent.TimeUnit import org.scalatest.Matchers @@ -765,4 +765,60 @@ class DateTimeUtilsSuite extends SparkFunSuite with Matchers with SQLHelper { } } } + + test("optimization of days rebasing - Gregorian to Julian") { + def refRebaseGregorianToJulianDays(days: Int): Int = { + val localDate = LocalDate.ofEpochDay(days) + val utcCal = new Calendar.Builder() + // `gregory` is a hybrid calendar that supports both + // the Julian and Gregorian calendar systems + .setCalendarType("gregory") + .setTimeZone(TimeZoneUTC) + .setDate(localDate.getYear, localDate.getMonthValue - 1, localDate.getDayOfMonth) + .build() + Math.toIntExact(Math.floorDiv(utcCal.getTimeInMillis, MILLIS_PER_DAY)) + } + + val start = localDateToDays(LocalDate.of(1, 1, 1)) + val end = localDateToDays(LocalDate.of(2030, 1, 1)) + + var days = start + while (days < end) { + assert(rebaseGregorianToJulianDays(days) === refRebaseGregorianToJulianDays(days)) + days += 1 + } + } + + test("optimization of days rebasing - Julian to Gregorian") { + def refRebaseJulianToGregorianDays(days: Int): Int = { + val utcCal = new Calendar.Builder() + // `gregory` is a hybrid calendar that supports both + // the Julian and Gregorian calendar systems + .setCalendarType("gregory") + .setTimeZone(TimeZoneUTC) + .setInstant(Math.multiplyExact(days, MILLIS_PER_DAY)) + .build() + val localDate = LocalDate.of( + utcCal.get(Calendar.YEAR), + utcCal.get(Calendar.MONTH) + 1, + // The number of days will be added later to handle non-existing + // Julian dates in Proleptic Gregorian calendar. + // For example, 1000-02-29 exists in Julian calendar because 1000 + // is a leap year but it is not a leap year in Gregorian calendar. + 1) + .plusDays(utcCal.get(Calendar.DAY_OF_MONTH) - 1) + Math.toIntExact(localDate.toEpochDay) + } + + val start = rebaseGregorianToJulianDays( + localDateToDays(LocalDate.of(1, 1, 1))) + val end = rebaseGregorianToJulianDays( + localDateToDays(LocalDate.of(2030, 1, 1))) + + var days = start + while (days < end) { + assert(rebaseJulianToGregorianDays(days) === refRebaseJulianToGregorianDays(days)) + days += 1 + } + } } diff --git a/sql/core/benchmarks/DateTimeRebaseBenchmark-jdk11-results.txt b/sql/core/benchmarks/DateTimeRebaseBenchmark-jdk11-results.txt index 52522f8..4fed511 100644 --- a/sql/core/benchmarks/DateTimeRebaseBenchmark-jdk11-results.txt +++ b/sql/core/benchmarks/DateTimeRebaseBenchmark-jdk11-results.txt @@ -2,52 +2,52 @@ Rebasing dates/timestamps in Parquet datasource ================================================================================================ -OpenJDK 64-Bit Server VM 11.0.6+10-post-Ubuntu-1ubuntu118.04.1 on Linux 4.15.0-1058-aws +OpenJDK 64-Bit Server VM 11.0.6+10-post-Ubuntu-1ubuntu118.04.1 on Linux 4.15.0-1063-aws Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz Save dates to parquet: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative ------------------------------------------------------------------------------------------------------------------------ -after 1582, noop 9272 9272 0 10.8 92.7 1.0X -before 1582, noop 9142 9142 0 10.9 91.4 1.0X -after 1582, rebase off 21841 21841 0 4.6 218.4 0.4X -after 1582, rebase on 58245 58245 0 1.7 582.4 0.2X -before 1582, rebase off 19813 19813 0 5.0 198.1 0.5X -before 1582, rebase on 63737 63737 0 1.6 637.4 0.1X +after 1582, noop 9304 9304 0 10.7 93.0 1.0X +before 1582, noop 9187 9187 0 10.9 91.9 1.0X +after 1582, rebase off 22054 22054 0 4.5 220.5 0.4X +after 1582, rebase on 20361 20361 0 4.9 203.6 0.5X +before 1582, rebase off 20286 20286 0 4.9 202.9 0.5X +before 1582, rebase on 22230 22230 0 4.5 222.3 0.4X -OpenJDK 64-Bit Server VM 11.0.6+10-post-Ubuntu-1ubuntu118.04.1 on Linux 4.15.0-1058-aws +OpenJDK 64-Bit Server VM 11.0.6+10-post-Ubuntu-1ubuntu118.04.1 on Linux 4.15.0-1063-aws Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz Load dates from parquet: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative ------------------------------------------------------------------------------------------------------------------------ -after 1582, vec off, rebase off 13004 13063 67 7.7 130.0 1.0X -after 1582, vec off, rebase on 36224 36253 26 2.8 362.2 0.4X -after 1582, vec on, rebase off 3596 3654 54 27.8 36.0 3.6X -after 1582, vec on, rebase on 26144 26253 112 3.8 261.4 0.5X -before 1582, vec off, rebase off 12872 12914 51 7.8 128.7 1.0X -before 1582, vec off, rebase on 37762 37904 153 2.6 377.6 0.3X -before 1582, vec on, rebase off 3522 3592 94 28.4 35.2 3.7X -before 1582, vec on, rebase on 27580 27615 59 3.6 275.8 0.5X +after 1582, vec off, rebase off 12773 12866 129 7.8 127.7 1.0X +after 1582, vec off, rebase on 13063 13086 39 7.7 130.6 1.0X +after 1582, vec on, rebase off 3678 3719 61 27.2 36.8 3.5X +after 1582, vec on, rebase on 5078 5121 52 19.7 50.8 2.5X +before 1582, vec off, rebase off 12942 12972 42 7.7 129.4 1.0X +before 1582, vec off, rebase on 13866 13904 58 7.2 138.7 0.9X +before 1582, vec on, rebase off 3678 3711 43 27.2 36.8 3.5X +before 1582, vec on, rebase on 5621 5657 44 17.8 56.2 2.3X -OpenJDK 64-Bit Server VM 11.0.6+10-post-Ubuntu-1ubuntu118.04.1 on Linux 4.15.0-1058-aws +OpenJDK 64-Bit Server VM 11.0.6+10-post-Ubuntu-1ubuntu118.04.1 on Linux 4.15.0-1063-aws Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz Save timestamps to parquet: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative ------------------------------------------------------------------------------------------------------------------------ -after 1582, noop 3113 3113 0 32.1 31.1 1.0X -before 1582, noop 3078 3078 0 32.5 30.8 1.0X -after 1582, rebase off 15749 15749 0 6.3 157.5 0.2X -after 1582, rebase on 69106 69106 0 1.4 691.1 0.0X -before 1582, rebase off 15967 15967 0 6.3 159.7 0.2X -before 1582, rebase on 76843 76843 0 1.3 768.4 0.0X +after 1582, noop 2983 2983 0 33.5 29.8 1.0X +before 1582, noop 2979 2979 0 33.6 29.8 1.0X +after 1582, rebase off 17452 17452 0 5.7 174.5 0.2X +after 1582, rebase on 70193 70193 0 1.4 701.9 0.0X +before 1582, rebase off 17784 17784 0 5.6 177.8 0.2X +before 1582, rebase on 83498 83498 0 1.2 835.0 0.0X -OpenJDK 64-Bit Server VM 11.0.6+10-post-Ubuntu-1ubuntu118.04.1 on Linux 4.15.0-1058-aws +OpenJDK 64-Bit Server VM 11.0.6+10-post-Ubuntu-1ubuntu118.04.1 on Linux 4.15.0-1063-aws Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz Load timestamps from parquet: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative ------------------------------------------------------------------------------------------------------------------------ -after 1582, vec off, rebase off 15070 15172 94 6.6 150.7 1.0X -after 1582, vec off, rebase on 43748 43867 157 2.3 437.5 0.3X -after 1582, vec on, rebase off 4805 4859 60 20.8 48.1 3.1X -after 1582, vec on, rebase on 33960 34027 61 2.9 339.6 0.4X -before 1582, vec off, rebase off 15037 15071 52 6.7 150.4 1.0X -before 1582, vec off, rebase on 44590 44749 156 2.2 445.9 0.3X -before 1582, vec on, rebase off 4831 4852 30 20.7 48.3 3.1X -before 1582, vec on, rebase on 35460 35481 18 2.8 354.6 0.4X +after 1582, vec off, rebase off 15114 15151 32 6.6 151.1 1.0X +after 1582, vec off, rebase on 45804 45912 126 2.2 458.0 0.3X +after 1582, vec on, rebase off 4900 4947 56 20.4 49.0 3.1X +after 1582, vec on, rebase on 34599 34650 45 2.9 346.0 0.4X +before 1582, vec off, rebase off 15093 15174 70 6.6 150.9 1.0X +before 1582, vec off, rebase on 47367 47472 121 2.1 473.7 0.3X +before 1582, vec on, rebase off 4884 4952 80 20.5 48.8 3.1X +before 1582, vec on, rebase on 35831 35883 59 2.8 358.3 0.4X diff --git a/sql/core/benchmarks/DateTimeRebaseBenchmark-results.txt b/sql/core/benchmarks/DateTimeRebaseBenchmark-results.txt index c9320cf..ee48627 100644 --- a/sql/core/benchmarks/DateTimeRebaseBenchmark-results.txt +++ b/sql/core/benchmarks/DateTimeRebaseBenchmark-results.txt @@ -2,52 +2,52 @@ Rebasing dates/timestamps in Parquet datasource ================================================================================================ -OpenJDK 64-Bit Server VM 1.8.0_242-8u242-b08-0ubuntu3~18.04-b08 on Linux 4.15.0-1058-aws +OpenJDK 64-Bit Server VM 1.8.0_242-8u242-b08-0ubuntu3~18.04-b08 on Linux 4.15.0-1063-aws Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz Save dates to parquet: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative ------------------------------------------------------------------------------------------------------------------------ -after 1582, noop 9472 9472 0 10.6 94.7 1.0X -before 1582, noop 9226 9226 0 10.8 92.3 1.0X -after 1582, rebase off 21201 21201 0 4.7 212.0 0.4X -after 1582, rebase on 56471 56471 0 1.8 564.7 0.2X -before 1582, rebase off 20179 20179 0 5.0 201.8 0.5X -before 1582, rebase on 65717 65717 0 1.5 657.2 0.1X +after 1582, noop 9582 9582 0 10.4 95.8 1.0X +before 1582, noop 9473 9473 0 10.6 94.7 1.0X +after 1582, rebase off 21431 21431 0 4.7 214.3 0.4X +after 1582, rebase on 22156 22156 0 4.5 221.6 0.4X +before 1582, rebase off 21399 21399 0 4.7 214.0 0.4X +before 1582, rebase on 22927 22927 0 4.4 229.3 0.4X -OpenJDK 64-Bit Server VM 1.8.0_242-8u242-b08-0ubuntu3~18.04-b08 on Linux 4.15.0-1058-aws +OpenJDK 64-Bit Server VM 1.8.0_242-8u242-b08-0ubuntu3~18.04-b08 on Linux 4.15.0-1063-aws Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz Load dates from parquet: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative ------------------------------------------------------------------------------------------------------------------------ -after 1582, vec off, rebase off 12294 12434 205 8.1 122.9 1.0X -after 1582, vec off, rebase on 36959 36967 12 2.7 369.6 0.3X -after 1582, vec on, rebase off 3644 3691 49 27.4 36.4 3.4X -after 1582, vec on, rebase on 26764 26852 92 3.7 267.6 0.5X -before 1582, vec off, rebase off 12830 12917 85 7.8 128.3 1.0X -before 1582, vec off, rebase on 38897 39053 229 2.6 389.0 0.3X -before 1582, vec on, rebase off 3638 3693 85 27.5 36.4 3.4X -before 1582, vec on, rebase on 28956 29007 44 3.5 289.6 0.4X +after 1582, vec off, rebase off 12637 12736 111 7.9 126.4 1.0X +after 1582, vec off, rebase on 13463 13531 61 7.4 134.6 0.9X +after 1582, vec on, rebase off 3693 3703 8 27.1 36.9 3.4X +after 1582, vec on, rebase on 5242 5252 9 19.1 52.4 2.4X +before 1582, vec off, rebase off 13055 13169 126 7.7 130.5 1.0X +before 1582, vec off, rebase on 14067 14270 185 7.1 140.7 0.9X +before 1582, vec on, rebase off 3697 3702 7 27.1 37.0 3.4X +before 1582, vec on, rebase on 6058 6097 34 16.5 60.6 2.1X -OpenJDK 64-Bit Server VM 1.8.0_242-8u242-b08-0ubuntu3~18.04-b08 on Linux 4.15.0-1058-aws +OpenJDK 64-Bit Server VM 1.8.0_242-8u242-b08-0ubuntu3~18.04-b08 on Linux 4.15.0-1063-aws Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz Save timestamps to parquet: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative ------------------------------------------------------------------------------------------------------------------------ -after 1582, noop 2952 2952 0 33.9 29.5 1.0X -before 1582, noop 2880 2880 0 34.7 28.8 1.0X -after 1582, rebase off 15928 15928 0 6.3 159.3 0.2X -after 1582, rebase on 82816 82816 0 1.2 828.2 0.0X -before 1582, rebase off 15988 15988 0 6.3 159.9 0.2X -before 1582, rebase on 92636 92636 0 1.1 926.4 0.0X +after 1582, noop 2713 2713 0 36.9 27.1 1.0X +before 1582, noop 2715 2715 0 36.8 27.2 1.0X +after 1582, rebase off 16768 16768 0 6.0 167.7 0.2X +after 1582, rebase on 82811 82811 0 1.2 828.1 0.0X +before 1582, rebase off 17052 17052 0 5.9 170.5 0.2X +before 1582, rebase on 95134 95134 0 1.1 951.3 0.0X -OpenJDK 64-Bit Server VM 1.8.0_242-8u242-b08-0ubuntu3~18.04-b08 on Linux 4.15.0-1058-aws +OpenJDK 64-Bit Server VM 1.8.0_242-8u242-b08-0ubuntu3~18.04-b08 on Linux 4.15.0-1063-aws Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz Load timestamps from parquet: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative ------------------------------------------------------------------------------------------------------------------------ -after 1582, vec off, rebase off 14863 14917 78 6.7 148.6 1.0X -after 1582, vec off, rebase on 54819 54939 140 1.8 548.2 0.3X -after 1582, vec on, rebase off 4905 4941 32 20.4 49.0 3.0X -after 1582, vec on, rebase on 44914 45008 124 2.2 449.1 0.3X -before 1582, vec off, rebase off 14928 14970 48 6.7 149.3 1.0X -before 1582, vec off, rebase on 59752 59996 245 1.7 597.5 0.2X -before 1582, vec on, rebase off 4892 4916 33 20.4 48.9 3.0X -before 1582, vec on, rebase on 46854 46977 198 2.1 468.5 0.3X +after 1582, vec off, rebase off 15200 15321 194 6.6 152.0 1.0X +after 1582, vec off, rebase on 63160 63337 177 1.6 631.6 0.2X +after 1582, vec on, rebase off 4891 4928 43 20.4 48.9 3.1X +after 1582, vec on, rebase on 45474 45484 10 2.2 454.7 0.3X +before 1582, vec off, rebase off 15203 15330 110 6.6 152.0 1.0X +before 1582, vec off, rebase on 65588 65664 73 1.5 655.9 0.2X +before 1582, vec on, rebase off 4844 4916 105 20.6 48.4 3.1X +before 1582, vec on, rebase on 47815 47943 162 2.1 478.2 0.3X --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org