This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 6f7c719  [SPARK-31669][SQL][TESTS] Fix RowEncoderSuite failures on non-existing dates/timestamps
6f7c719 is described below

commit 6f7c71947073f147bc35da196139d5ceb6fbdf45
Author: Max Gekk <max.g...@gmail.com>
AuthorDate: Sun May 10 14:22:12 2020 -0500

    [SPARK-31669][SQL][TESTS] Fix RowEncoderSuite failures on non-existing dates/timestamps
    
    ### What changes were proposed in this pull request?
    Shift dates that do not exist in the Proleptic Gregorian calendar by 1 day. `RowEncoderSuite` generates random dates/timestamps in the hybrid (Julian + Gregorian) calendar, and some of them, such as 1000-02-29, do not exist in the Proleptic Gregorian calendar because 1000 is not a leap year there.
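    
    As a minimal sketch of the mismatch (not part of this patch), the snippet below compares the leap-year rules of the two calendars using only JDK classes. The names `LeapYearCheck`, `hybridIsLeap`, `prolepticIsLeap`, and `existsInProleptic` are hypothetical and chosen for illustration:
    ```scala
    import java.time.{LocalDate, Year}
    import java.util.GregorianCalendar

    import scala.util.Try

    object LeapYearCheck {
      // java.util.GregorianCalendar is a hybrid calendar: with the default
      // cutover it applies Julian leap-year rules to years before 1582.
      def hybridIsLeap(year: Int): Boolean =
        new GregorianCalendar().isLeapYear(year)

      // java.time applies Gregorian leap-year rules to all years (proleptic).
      def prolepticIsLeap(year: Int): Boolean =
        Year.isLeap(year.toLong)

      // LocalDate.of throws DateTimeException for a day that does not exist.
      def existsInProleptic(year: Int, month: Int, day: Int): Boolean =
        Try(LocalDate.of(year, month, day)).isSuccess
    }
    ```
    Under these rules, year 1000 is a leap year only in the hybrid calendar, so `existsInProleptic(1000, 2, 29)` is false.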
    
    ### Why are the changes needed?
    This makes `RowEncoderSuite` much more stable.
    
    ### Does this PR introduce _any_ user-facing change?
    No
    
    ### How was this patch tested?
    By running `RowEncoderSuite` and setting a non-existing date manually:
    ```scala
    val date = new java.sql.Date(1000 - 1900, 1, 29)
    Try { date.toLocalDate; date }.getOrElse(new Date(date.getTime + MILLIS_PER_DAY))
    ```
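    
    The manual check above relies on imports from the test sources. A self-contained sketch of the same fallback, assuming a local constant `MillisPerDay` in place of Spark's `MILLIS_PER_DAY` and a hypothetical helper named `shiftIfMissing`, might look like this:
    ```scala
    import java.sql.Date

    import scala.util.Try

    object ShiftSketch {
      // Assumption: stands in for DateTimeConstants.MILLIS_PER_DAY from Spark.
      val MillisPerDay: Long = 24L * 60 * 60 * 1000

      // Returns `date` unchanged when it converts to a valid Proleptic
      // Gregorian local date; otherwise returns a new Date one day later.
      def shiftIfMissing(date: Date): Date =
        Try { date.toLocalDate; date }.getOrElse(new Date(date.getTime + MillisPerDay))
    }
    ```
    Dates that already exist in both calendars pass through as the same instance; only dates whose `toLocalDate` conversion throws are replaced.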
    
    Closes #28486 from MaxGekk/fix-RowEncoderSuite.
    
    Authored-by: Max Gekk <max.g...@gmail.com>
    Signed-off-by: Sean Owen <sro...@gmail.com>
    (cherry picked from commit 9f768fa9916dec3cc695e3f28ec77148d81d335f)
    Signed-off-by: Sean Owen <sro...@gmail.com>
---
 .../org/apache/spark/sql/RandomDataGenerator.scala | 23 +++++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/RandomDataGenerator.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/RandomDataGenerator.scala
index a7c20c3..5a4d23d 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/RandomDataGenerator.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/RandomDataGenerator.scala
@@ -18,9 +18,10 @@
 package org.apache.spark.sql
 
 import java.math.MathContext
+import java.sql.{Date, Timestamp}
 
 import scala.collection.mutable
-import scala.util.Random
+import scala.util.{Random, Try}
 
 import org.apache.spark.sql.catalyst.CatalystTypeConverters
 import org.apache.spark.sql.catalyst.util.DateTimeConstants.MILLIS_PER_DAY
@@ -172,7 +173,15 @@ object RandomDataGenerator {
               // January 1, 1970, 00:00:00 GMT for "9999-12-31 23:59:59.999999".
               milliseconds = rand.nextLong() % 253402329599999L
             }
-            DateTimeUtils.toJavaDate((milliseconds / MILLIS_PER_DAY).toInt)
+            val date = DateTimeUtils.toJavaDate((milliseconds / MILLIS_PER_DAY).toInt)
+            // The generated `date` is based on the hybrid calendar Julian + Gregorian since
+            // 1582-10-15 but it should be valid in Proleptic Gregorian calendar too which is used
+            // by Spark SQL since version 3.0 (see SPARK-26651). We try to convert `date` to
+            // a local date in Proleptic Gregorian calendar to satisfy this requirement.
+            // Some years are leap years in Julian calendar but not in Proleptic Gregorian calendar.
+            // As the consequence of that, 29 February of such years might not exist in Proleptic
+            // Gregorian calendar. When this happens, we shift the date by one day.
+            Try { date.toLocalDate; date }.getOrElse(new Date(date.getTime + MILLIS_PER_DAY))
           }
         Some(generator)
       case TimestampType =>
@@ -188,7 +197,15 @@ object RandomDataGenerator {
               milliseconds = rand.nextLong() % 253402329599999L
             }
             // DateTimeUtils.toJavaTimestamp takes microsecond.
-            DateTimeUtils.toJavaTimestamp(milliseconds * 1000)
+            val ts = DateTimeUtils.toJavaTimestamp(milliseconds * 1000)
+            // The generated `ts` is based on the hybrid calendar Julian + Gregorian since
+            // 1582-10-15 but it should be valid in Proleptic Gregorian calendar too which is used
+            // by Spark SQL since version 3.0 (see SPARK-26651). We try to convert `ts` to
+            // a local timestamp in Proleptic Gregorian calendar to satisfy this requirement.
+            // Some years are leap years in Julian calendar but not in Proleptic Gregorian calendar.
+            // As the consequence of that, 29 February of such years might not exist in Proleptic
+            // Gregorian calendar. When this happens, we shift the timestamp `ts` by one day.
+            Try { ts.toLocalDateTime; ts }.getOrElse(new Timestamp(ts.getTime + MILLIS_PER_DAY))
           }
         Some(generator)
       case CalendarIntervalType => Some(() => {


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
